hermes-agent

Author	SHA1	Message	Date
adybag14-cyber	6dcb3c4774	fix(termux): compact narrow-screen tui chrome	2026-04-09 16:24:53 -07:00
adybag14-cyber	096b3f9f12	fix(termux): add local image chat route	2026-04-09 16:24:53 -07:00
adybag14-cyber	a3aed1bd26	fix(termux): keep quiet chat output parseable	2026-04-09 16:24:53 -07:00
Brooklyn Nicholson	99fd3b518d	feat: add /copy and /agents	2026-04-09 17:19:36 -05:00
Teknium	6b437f7934	fix: /browser connect auto-launch uses dedicated profile dir (#6821 ) Chrome auto-launch now passes --user-data-dir, --no-first-run, and --no-default-browser-check so the debug instance doesn't conflict with an already-running Chrome using the default profile. The profile dir lives at {hermes_home}/chrome-debug/. Also updates the fallback manual instructions to include the same flags and removes the stale 'close existing Chrome windows' hint.	2026-04-09 14:55:45 -07:00
Teknium	f91fffbe33	Revert "fix: /browser connect auto-launch uses dedicated profile dir" This reverts commit `c3854e0f85`.	2026-04-09 14:54:37 -07:00
Teknium	c3854e0f85	fix: /browser connect auto-launch uses dedicated profile dir Chrome auto-launch now passes --user-data-dir, --no-first-run, and --no-default-browser-check so the debug instance doesn't conflict with an already-running Chrome using the default profile. The profile dir lives at {hermes_home}/chrome-debug/. Also updates the fallback manual instructions to include the same flags and removes the stale 'close existing Chrome windows' hint.	2026-04-09 14:52:58 -07:00
Greer Guthrie	775a46ce75	fix: normalize reasoning effort ordering in UI	2026-04-09 14:20:16 -07:00
Teknium	2772d99085	fix: remove /prompt slash command — footgun via prefix expansion (#6752 ) /pr <anything> silently resolved to /prompt via the shortest-match tiebreaker in prefix expansion, permanently overwriting the system prompt and persisting to config. The command's functionality (setting agent.system_prompt) is available via config.yaml and /personality covers the common use case. Removes: CommandDef, dispatch branch, _handle_prompt_command handler, docs references, and updates subcommand extraction test.	2026-04-09 11:27:27 -07:00
Yang Zhi	2f0a83dd12	fix(cli): update TUI status bar model name on provider fallback The status bar reads self.model from the CLI class, which is set once at init and never updated when _try_activate_fallback() switches to a backup provider/model in run_agent.py. This causes the TUI to display the original model name while context_length_max changes, creating a confusing mismatch. Read the model name from agent.model (live, updated by fallback) with self.model as fallback before the agent is created. Remove the redundant getattr(self, 'agent') call that was already done above.	2026-04-09 11:11:25 -07:00
Teknium	8dfc96dbbb	feat: capture provider rate limit headers and show in /usage (#6541 ) Parse x-ratelimit-* headers from inference API responses (Nous Portal, OpenRouter, OpenAI-compatible) and display them in the /usage command. - New agent/rate_limit_tracker.py: parse 12 rate limit headers (RPM/RPH/ TPM/TPH limits, remaining, reset timers), format as progress bars (CLI) or compact one-liner (gateway) - Hook into streaming path in run_agent.py: stream.response.headers is available on the OpenAI SDK Stream object before chunks are consumed - CLI /usage: appends rate limit section with progress bars + warnings when any bucket exceeds 80% - Gateway /usage: appends compact rate limit summary - 24 unit tests covering parsing, formatting, edge cases Headers captured per response: x-ratelimit-{limit,remaining,reset}-{requests,tokens}{,-1h} Example CLI display: Nous Rate Limits (captured just now): Requests/min [░░░░░░░░░░░░░░░░░░░░] 0.1% 1/800 used (799 left, resets in 59s) Tokens/hr [░░░░░░░░░░░░░░░░░░░░] 0.0% 49/336.0M (336.0M left, resets in 52m)	2026-04-09 03:43:14 -07:00
Lumen Radley	e22416dd9b	fix: handle empty sudo password and false prompts	2026-04-09 02:50:07 -07:00
Teknium	7156f8d866	fix: CI test failures — metadata key, cli console, docker env, vision order (#6294 ) Fixes 9 test failures on current main, incorporating ideas from PR stack #6219-#6222 by xinbenlv with corrections: - model_metadata: sync HF context length key casing (minimaxai/minimax-m2.5 → MiniMaxAI/MiniMax-M2.5) - cli.py: route quick command error output through self.console instead of creating a new ChatConsole() instance - docker.py: explicit docker_forward_env entries now bypass the Hermes secret blocklist (intentional opt-in wins over generic filter) - auxiliary_client: revert _read_main_provider() to simple provider.strip().lower() — the _normalize_aux_provider() call introduced in `5c03f2e7` stripped the custom: prefix, breaking named custom provider resolution - auxiliary_client: flip vision auto-detection order to active provider → OpenRouter → Nous → stop (was OR → Nous → active) - test: update vision priority test to match new order Based on PR #6219-#6222 by xinbenlv.	2026-04-08 16:37:05 -07:00
Teknium	8b0afa0e57	fix: aggressive worktree and branch cleanup to prevent accumulation (#6134 ) Problem: hermes -w sessions accumulated 37+ worktrees and 1200+ orphaned branches because: - _cleanup_worktree bailed on any dirty working tree, but agent sessions almost always leave untracked files/artifacts behind - _prune_stale_worktrees had the same dirty-check, so stale worktrees survived indefinitely - pr-* and hermes/* branches from PR review had zero cleanup mechanism Changes: - _cleanup_worktree: check for unpushed commits instead of dirty state. Agent work lives in pushed commits/PRs — dirty working tree without unpushed commits is just artifacts, safe to remove. - _prune_stale_worktrees: three-tier age system: - Under 24h: skip (session may be active) - 24h-72h: remove if no unpushed commits - Over 72h: force remove regardless - New _prune_orphaned_branches: on each -w startup, deletes local hermes/hermes-* and pr-* branches with no corresponding worktree. Protects main, checked-out branch, and active worktree branches. Tests: 42 pass (6 new covering unpushed-commit logic, force-prune tier, and orphaned branch cleanup).	2026-04-08 04:44:49 -07:00
Felipe de Leon	bdc72ec355	feat(cli): add on_session_finalize and on_session_reset plugin hooks Plugins can now subscribe to session boundary events via ctx.register_hook('on_session_finalize', ...) and ctx.register_hook('on_session_reset', ...). on_session_finalize — fires during CLI exit (/quit, Ctrl-C) and before /new or /reset, giving plugins a chance to flush or clean up. on_session_reset — fires after a new session is created via /new or /reset, so plugins can initialize per-session state. Closes #5592	2026-04-08 04:27:34 -07:00
Teknium	9692b3c28a	fix: CLI/UX batch — ChatConsole errors, curses scroll, skin-aware banner, git state banner (#5974 ) * fix(cli): route error messages through ChatConsole inside patch_stdout Cherry-pick of PR #5798 by @icn5381. Replace self.console.print() with ChatConsole().print() for 11 error/status messages reachable during the interactive session. Inside patch_stdout, self.console (plain Rich Console) writes raw ANSI escapes that StdoutProxy mangles into garbled text. ChatConsole uses prompt_toolkit's native print_formatted_text which renders correctly. Same class of bug as #2262 — that fix covered agent output but missed these error paths in _ensure_runtime_credentials, _init_agent, quick commands, skill loading, and plan mode. * fix(model-picker): add scrolling viewport to curses provider menu Cherry-pick of PR #5790 by @Lempkey. Fixes #5755. _curses_prompt_choice rendered items starting unconditionally from index 0 with no scroll offset. The 'More providers' submenu has 13 entries. On terminals shorter than ~16 rows, items past the fold were never drawn. When UP-arrow wrapped cursor from 0 to the last item (Cancel, index 12), the highlight rendered off-screen — appearing as if only Cancel existed. Adds scroll_offset tracking that adjusts each frame to keep the cursor inside the visible window. * feat(cli): skin-aware compact banner + git state in startup banner Combined salvage of PR #5922 by @ASRagab and PR #5877 by @xinbenlv. Compact banner changes (from #5922): - Read active skin colors and branding instead of hardcoding gold/NOUS HERMES - Default skin preserves backward-compatible legacy branding - Non-default skins use their own agent_name and colors Git state in banner (from #5877): - New format_banner_version_label() shows upstream/local git hashes - Full banner title now includes git state (upstream hash, carried commits) - Compact banner line2 shows the version label with git state - Widen compact banner max width from 64 to 88 to fit version info Both the full Rich banner and compact fallback are now skin-aware and show git state.	2026-04-07 17:59:42 -07:00
Teknium	ca0459d109	refactor: remove 24 confirmed dead functions — 432 lines of unused code Each function was verified to have exactly 1 reference in the entire codebase (its own definition). Zero calls, zero imports, zero string references anywhere including tests. Removed by category: Superseded wrappers (replaced by newer implementations): - agent/anthropic_adapter.py: run_hermes_oauth_login, refresh_hermes_oauth_token - hermes_cli/callbacks.py: sudo_password_callback (superseded by CLI method) - hermes_cli/setup.py: _set_model_provider, _sync_model_from_disk - tools/file_tools.py: get_file_tools (superseded by registry.register) - tools/cronjob_tools.py: get_cronjob_tool_definitions (same) - tools/terminal_tool.py: _check_dangerous_command (_check_all_guards used) Dead private helpers (lost their callers during refactors): - agent/anthropic_adapter.py: _convert_user_content_part_to_anthropic - agent/display.py: honcho_session_line, write_tty - hermes_cli/providers.py: _build_labels (+ dead _labels_cache var) - hermes_cli/tools_config.py: _prompt_yes_no - hermes_cli/models.py: _extract_model_ids - hermes_cli/uninstall.py: log_error - gateway/platforms/feishu.py: _is_loop_ready - tools/file_operations.py: _read_image (64-line method) - tools/process_registry.py: cleanup_expired - tools/skill_manager_tool.py: check_skill_manage_requirements Dead class methods (zero callers): - run_agent.py: _is_anthropic_url (logic duplicated inline at L618) - run_agent.py: _classify_empty_content_response (68-line method, never wired) - cli.py: reset_conversation (callers all use new_session directly) - cli.py: _clear_current_input (added but never wired in) Other: - gateway/delivery.py: build_delivery_context_for_tool - tools/browser_tool.py: get_active_browser_sessions	2026-04-07 11:41:26 -07:00
Teknium	d0ffb111c2	refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821 ) Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture) and manual analysis of the entire codebase. Changes by category: Unused imports removed (~95 across 55 files): - Removed genuinely unused imports from all major subsystems - agent/, hermes_cli/, tools/, gateway/, plugins/, cron/ - Includes imports in try/except blocks that were truly unused (vs availability checks which were left alone) Unused variables removed (~25): - Removed dead variables: connected, inner, channels, last_exc, source, new_server_names, verify, pconfig, default_terminal, result, pending_handled, temperature, loop - Dropped unused argparse subparser assignments in hermes_cli/main.py (12 instances of add_parser() where result was never used) Dead code removed: - run_agent.py: Removed dead ternary (None if False else None) and surrounding unreachable branch in identity fallback - run_agent.py: Removed write-only attribute _last_reported_tool - hermes_cli/providers.py: Removed dead @property decorator on module-level function (decorator has no effect outside a class) - gateway/run.py: Removed unused MCP config load before reconnect - gateway/platforms/slack.py: Removed dead SessionSource construction Undefined name bugs fixed (would cause NameError at runtime): - batch_runner.py: Added missing logger = logging.getLogger(__name__) - tools/environments/daytona.py: Added missing Dict and Path imports Unnecessary global statements removed (14): - tools/terminal_tool.py: 5 functions declared global for dicts they only mutated via .pop()/[key]=value (no rebinding) - tools/browser_tool.py: cleanup thread loop only reads flag - tools/rl_training_tool.py: 4 functions only do dict mutations - tools/mcp_oauth.py: only reads the global - hermes_time.py: only reads cached values Inefficient patterns fixed: - startswith/endswith tuple form: 15 instances of x.startswith('a') or x.startswith('b') consolidated to x.startswith(('a', 'b')) - len(x)==0 / len(x)>0: 13 instances replaced with pythonic truthiness checks (not x / bool(x)) - in dict.keys(): 5 instances simplified to in dict - Redefined unused name: removed duplicate _strip_mdv2 import in send_message_tool.py Other fixes: - hermes_cli/doctor.py: Replaced undefined logger.debug() with pass - hermes_cli/config.py: Consolidated chained .endswith() calls Test results: 3934 passed, 17 failed (all pre-existing on main), 19 skipped. Zero regressions.	2026-04-07 10:25:31 -07:00
Ben Barclay	b2f477a30b	feat: switch managed browser provider from Browserbase to Browser Use (#5750 ) * feat: switch managed browser provider from Browserbase to Browser Use The Nous subscription tool gateway now routes browser automation through Browser Use instead of Browserbase. This commit: - Adds managed Nous gateway support to BrowserUseProvider (idempotency keys, X-BB-API-Key auth header, external_call_id persistence) - Removes managed gateway support from BrowserbaseProvider (now direct-only via BROWSERBASE_API_KEY/BROWSERBASE_PROJECT_ID) - Updates browser_tool.py fallback: prefers Browser Use over Browserbase - Updates nous_subscription.py: gateway vendor 'browser-use', auto-config sets cloud_provider='browser-use' for new subscribers - Updates tools_config.py: Nous Subscription entry now uses Browser Use - Updates setup.py, cli.py, status.py, prompt_builder.py display strings - Updates all affected tests to match new behavior Browserbase remains fully functional for users with direct API credentials. The change only affects the managed/subscription path. * chore: remove redundant Browser Use hint from system prompt * fix: upgrade Browser Use provider to v3 API - Base URL: api/v2 -> api/v3 (v2 is legacy) - Unified all endpoints to use native Browser Use paths: - POST /browsers (create session, returns cdpUrl) - PATCH /browsers/{id} with {action: stop} (close session) - Removed managed-mode branching that used Browserbase-style /v1/sessions paths — v3 gateway now supports /browsers directly - Removed unused managed_mode variable in close_session * fix(browser-use): use X-Browser-Use-API-Key header for managed mode The managed gateway expects X-Browser-Use-API-Key, not X-BB-API-Key (which is a Browserbase-specific header). Using the wrong header caused a 401 AUTH_ERROR on every managed-mode browser session create. Simplified _headers() to always use X-Browser-Use-API-Key regardless of direct vs managed mode. * fix(nous_subscription): browserbase explicit provider is direct-only Since managed Nous gateway now routes through Browser Use, the browserbase explicit provider path should not check managed_browser_available (which resolves against the browser-use gateway). Simplified to direct-only with managed=False. * fix(browser-use): port missing improvements from PR #5605 - CDP URL normalization: resolve HTTP discovery URLs to websocket after cloud provider create_session() (prevents agent-browser failures) - Managed session payload: send timeout=5 and proxyCountryCode=us for gateway-backed sessions (prevents billing overruns) - Update prompt builder, browser_close schema, and module docstring to replace remaining Browserbase references with Browser Use - Dynamic /browser status detection via _get_cloud_provider() instead of hardcoded env var checks (future-proof for new providers) - Rename post_setup key from 'browserbase' to 'agent_browser' - Update setup hint to mention Browser Use alongside Browserbase - Add tests: CDP normalization, browserbase direct-only guard, managed browser-use gateway, direct browserbase fallback --------- Co-authored-by: rob-maron <132852777+rob-maron@users.noreply.github.com>	2026-04-07 08:40:22 -04:00
Teknium	e120d2afac	feat: notify_on_complete for background processes (#5779 ) * feat: notify_on_complete for background processes When terminal(background=true, notify_on_complete=true), the system auto-triggers a new agent turn when the process exits — no polling needed. Changes: - ProcessSession: add notify_on_complete field - ProcessRegistry: add completion_queue, populate on _move_to_finished() - Terminal tool: add notify_on_complete parameter to schema + handler - CLI: drain completion_queue after agent turn AND during idle loop - Gateway: enhanced _run_process_watcher injects synthetic MessageEvent on completion, triggering a full agent turn - Checkpoint persistence includes notify_on_complete for crash recovery - code_execution_tool: block notify_on_complete in sandbox scripts - 15 new tests covering queue mechanics, checkpoint round-trip, schema * docs: update terminal tool descriptions for notify_on_complete - background: remove 'ONLY for servers' language, describe both patterns (long-lived processes AND long-running tasks with notify_on_complete) - notify_on_complete: more prescriptive about when to use it - TERMINAL_TOOL_DESCRIPTION: remove 'Do NOT use background for builds' guidance that contradicted the new feature	2026-04-07 02:40:16 -07:00
Teknium	1c425f219e	fix(cli): defer response content until reasoning block completes (#5773 ) When show_reasoning is on with streaming, content tokens could arrive while the reasoning box was still rendering (interleaved thinking mode). This caused the response box to open before reasoning finished, resulting in reasoning appearing after the response in the terminal. Fix: buffer content in _deferred_content while _reasoning_box_opened is True. Flush the buffer through _emit_stream_text when _close_reasoning_box runs, ensuring reasoning always renders before the response.	2026-04-07 01:03:52 -07:00
Ruzzgar	abd24d381b	Implement comprehensive browser path discovery for Windows	2026-04-06 16:54:16 -07:00
Tianxiao	8a29b49036	fix(cli): handle CJK wide chars in TUI input height	2026-04-06 16:54:16 -07:00
donrhmexe	2c814d7b5d	fix: /model --global writes model.name instead of model.default The canonical config key for model name is model.default (used by setup, auth, runtime_provider, profile list, and CLI startup). But /model --global wrote to model.name in both gateway and CLI paths. This caused: - hermes profile list showing the old model (reads model.default) - Gateway restart reverting to the old model (_resolve_gateway_model reads model.default) - CLI startup using the old model (main.py reads model.default) The only reason it appeared to work in Telegram was the cached agent staying alive with the in-place switch. Fix: change all 3 write/read sites to use model.default.	2026-04-06 13:20:01 -07:00
Teknium	9c96f669a1	feat: centralized logging, instrumentation, hermes logs CLI, gateway noise fix (#5430 ) Adds comprehensive logging infrastructure to Hermes Agent across 4 phases: Phase 1 — Centralized logging - New hermes_logging.py with idempotent setup_logging() used by CLI, gateway, and cron - agent.log (INFO+) and errors.log (WARNING+) with RotatingFileHandler + RedactingFormatter - config.yaml logging: section (level, max_size_mb, backup_count) - All entry points wired (cli.py, main.py, gateway/run.py, run_agent.py) - Fixed debug_helpers.py writing to ./logs/ instead of ~/.hermes/logs/ Phase 2 — Event instrumentation - API calls: model, provider, tokens, latency, cache hit % - Tool execution: name, duration, result size (both sequential + concurrent) - Session lifecycle: turn start (session/model/provider/platform), compression (before/after) - Credential pool: rotation events, exhaustion tracking Phase 3 — hermes logs CLI command - hermes logs / hermes logs -f / hermes logs errors / hermes logs gateway - --level, --session, --since filters - hermes logs list (file sizes + ages) Phase 4 — Gateway bug fix + noise reduction - fix: _async_flush_memories() called with wrong arg count — sessions never flushed - Batched session expiry logs: 6 lines/cycle → 2 summary lines - Added inbound message + response time logging 75 new tests, zero regressions on the full suite.	2026-04-06 00:08:20 -07:00
Teknium	dce5f51c7c	feat: config structure validation — detect malformed YAML at startup (#5426 ) Add validate_config_structure() that catches common config.yaml mistakes: - custom_providers as dict instead of list (missing '-' in YAML) - fallback_model accidentally nested inside another section - custom_providers entries missing required fields (name, base_url) - Missing model section when custom_providers is configured - Root-level keys that look like misplaced custom_providers fields Surface these diagnostics at three levels: 1. Startup: print_config_warnings() runs at CLI and gateway module load, so users see issues before hitting cryptic errors 2. Error time: 'Unknown provider' errors in auth.py and model_switch.py now include config diagnostics with fix suggestions 3. Doctor: 'hermes doctor' shows a Config Structure section with all issues and fix hints Also adds a warning log in runtime_provider.py when custom_providers is a dict (previously returned None silently). Motivated by a Discord user who had malformed custom_providers YAML and got only 'Unknown Provider' with no guidance on what was wrong. 17 new tests covering all validation paths.	2026-04-05 23:31:20 -07:00
emozilla	0365f6202c	feat: show model pricing for OpenRouter and Nous Portal providers Display live per-million-token pricing from /v1/models when listing models for OpenRouter or Nous Portal. Prices are shown in a column-aligned table with decimal points vertically aligned for easy comparison. Pricing appears in three places: - /provider slash command (table with In/Out headers) - hermes model picker (aligned columns in both TerminalMenu and numbered fallback) Implementation: - Add fetch_models_with_pricing() in models.py with per-base_url module-level cache (one network call per endpoint per session) - Add _format_price_per_mtok() with fixed 2-decimal formatting - Add format_model_pricing_table() for terminal table display - Add get_pricing_for_provider() convenience wrapper - Update _prompt_model_selection() to accept optional pricing dict - Wire pricing through _model_flow_openrouter/nous in main.py - Update test mocks for new pricing parameter	2026-04-05 22:02:21 -07:00
Teknium	fc15f56fc4	feat: warn users when loading non-agentic Hermes LLM models (#5378 ) Nous Research Hermes 3 & 4 models lack tool-calling capabilities and are not suitable for agent workflows. Add a warning that fires in two places: - /model switch (CLI + gateway) via model_switch.py warning_message - CLI session startup banner when the configured model contains 'hermes' Both paths suggest switching to an agentic model (Claude, GPT, Gemini, DeepSeek, etc.).	2026-04-05 18:41:03 -07:00
Mibayy	cc2b56b26a	feat(api): structured run events via /v1/runs SSE endpoint Add POST /v1/runs to start async agent runs and GET /v1/runs/{run_id}/events for SSE streaming of typed lifecycle events (tool.started, tool.completed, message.delta, reasoning.available, run.completed, run.failed). Changes the internal tool_progress_callback signature from positional (tool_name, preview, args) to event-type-first (event_type, tool_name, preview, args, **kwargs). Existing consumers filter on event_type and remain backward-compatible. Adds concurrency limit (_MAX_CONCURRENT_RUNS=10) and orphaned run sweep. Fixes logic inversion in cli.py _on_tool_progress where the original PR would have displayed internal tools instead of non-internal ones. Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-04-05 12:05:13 -07:00
Teknium	54cb311f40	fix: suppress false 'Unknown toolsets' warning for MCP server names (#5279 ) MCP server names (e.g. annas, libgen) are added to enabled_toolsets by _get_platform_tools() but aren't registered in TOOLSETS until later when _sync_mcp_toolsets() runs during tool discovery. The validation in HermesCLI.__init__() fires before that, producing a false warning. Fix: exclude configured MCP server names from the validation check. CLI_CONFIG is already available at the call site, so no new imports needed. Closes #5267 (alternative fix)	2026-04-05 11:44:40 -07:00
LucidPaths	70f798043b	fix: Ollama Cloud auth, /model switch persistence, and alias tab completion - Add OLLAMA_API_KEY to credential resolution chain for ollama.com endpoints - Update requested_provider/_explicit_api_key/_explicit_base_url after /model switch so _ensure_runtime_credentials() doesn't revert the switch - Pass base_url/api_key from fallback config to resolve_provider_client() - Add DirectAlias system: user-configurable model_aliases in config.yaml checked before catalog resolution, with reverse lookup by model ID - Add /model tab completion showing aliases with provider metadata Co-authored-by: LucidPaths <LucidPaths@users.noreply.github.com>	2026-04-05 11:06:06 -07:00
Teknium	4976a8b066	feat: /model command — models.dev primary database + --provider flag (#5181 ) Full overhaul of the model/provider system. ## What changed - models.dev (109 providers, 4000+ models) as primary database for provider identity AND model metadata - --provider flag replaces colon syntax for explicit provider switching - Full ModelInfo/ProviderInfo dataclasses with context, cost, capabilities, modalities - HermesOverlay system merges models.dev + Hermes-specific transport/auth/aggregator flags - User-defined endpoints via config.yaml providers: section - /model (no args) lists authenticated providers with curated model catalog - Rich metadata display: context window, max output, cost/M tokens, capabilities - Config migration: custom_providers list → providers dict (v11→v12) - AIAgent.switch_model() for in-place model swap preserving conversation ## Files agent/models_dev.py, hermes_cli/providers.py, hermes_cli/model_switch.py, hermes_cli/model_normalize.py, cli.py, gateway/run.py, run_agent.py, hermes_cli/config.py, hermes_cli/commands.py	2026-04-05 01:04:44 -07:00
Teknium	b93fa234df	fix: clear ghost status-bar lines on terminal resize (#4960 ) * feat: add /branch (/fork) command for session branching Inspired by Claude Code's /branch command. Creates a copy of the current session's conversation history in a new session, allowing the user to explore a different approach without losing the original. Works like 'git checkout -b' for conversations: - /branch — auto-generates a title from the parent session - /branch my-idea — uses a custom title - /fork — alias for /branch Implementation: - CLI: _handle_branch_command() in cli.py - Gateway: _handle_branch_command() in gateway/run.py - CommandDef with 'fork' alias in commands.py - Uses existing parent_session_id field in session DB - Uses get_next_title_in_lineage() for auto-numbered branches - 14 tests covering session creation, history copy, parent links, title generation, edge cases, and agent sync * fix: clear ghost status-bar lines on terminal resize When the terminal shrinks (e.g. un-maximize), the emulator reflows previously full-width rows (status bar, input rules) into multiple narrower rows. prompt_toolkit's _on_resize only cursor_up()s by the stored layout height, missing the extra rows from reflow — leaving ghost duplicates of the status bar visible. Fix: monkey-patch Application._on_resize to detect width shrinks, calculate the extra rows created by reflow, and inflate the renderer's cursor_pos.y so the erase moves up far enough to clear ghosts.	2026-04-03 22:43:45 -07:00
Teknium	8af6a08695	fix: don't treat bare file paths as slash commands Input like /Users/ironin/file.md:45-46 was routed to process_command() because it starts with /. Added _looks_like_slash_command() which checks whether the first word contains additional / characters — commands never do (/help, /model), paths always do (/Users/foo/bar.md). Applied to both process_loop routing and handle_enter interrupt bypass. Preserves prefix matching (/h → /help) since short prefixes still pass the check. Based on PR #4782 by iRonin. Co-authored-by: iRonin <iRonin@users.noreply.github.com>	2026-04-03 20:16:04 -07:00
Teknium	92dcdbff66	fix: clarify interrupt re-queue label, document busy_input_mode behaviour The '📨 Queued:' label was misleading — it looked like the message was silently deferred when it was actually being sent immediately after the interrupt. Changed to '⚡ Sending after interrupt:' with multi-message count when the user typed several messages during agent execution. Added comment documenting that this code path only applies when busy_input_mode == 'interrupt' (the default). Based on PR #4821 by iRonin. Co-authored-by: iRonin <iRonin@users.noreply.github.com>	2026-04-03 15:00:05 -07:00
Teknium	3f2180037c	fix: also filter session_meta in /session switch restore path The original PR missed the third CLI restore path — the /session switch command that loads history via get_messages_as_conversation() without stripping session_meta entries.	2026-04-03 14:57:33 -07:00
kagura-agent	6bf5946bbe	fix: filter transcript-only roles from chat-completions payload (#4715 ) Add a provider-agnostic role allowlist guard to _sanitize_api_messages() that drops messages with roles not accepted by the chat-completions API (e.g. session_meta). This prevents CLI resume/session restore from leaking transcript-only metadata into the outgoing messages payload. Two layers of defense: 1. API-boundary guard: _sanitize_api_messages() now filters messages by role allowlist (system/user/assistant/tool/function/developer) before the existing orphaned tool-call repair logic. This protects all current and future call paths. 2. CLI restore defense-in-depth: Both session restore paths in cli.py now strip session_meta entries before loading history into conversation_history, matching the existing gateway behavior. Closes #4715	2026-04-03 14:57:33 -07:00
CK iRonin.IT	de5aacddd2	fix: normalise \r\n and \r line endings in pasted text Windows (CRLF) and old Mac (CR) line endings are normalised to LF before the 5-line collapse threshold is checked in handle_paste. Without this, markdown copied from Windows sources contains \r\n but the line counter (pasted_text.count('\n')) still works — however buf.insert_text() leaves bare \r characters in the buffer which some terminals render by moving the cursor to the start of the line, making multi-line pastes appear as a single overwritten line.	2026-04-03 13:20:50 -07:00
Teknium	cc54818d26	fix(mcp): stability fix pack — reload timeout, shutdown cleanup, event loop handler, OAuth non-blocking (#4757 ) Four fixes for MCP server stability issues reported by community member (terminal lockup, zombie processes, escape sequence pollution, startup hang): 1. MCP reload timeout guard (cli.py): _check_config_mcp_changes now runs _reload_mcp in a separate daemon thread with a 30s hard timeout. Previously, a hung MCP server could block the process_loop thread indefinitely, freezing the entire TUI (user can type but nothing happens, only Ctrl+D/Ctrl+\ work). 2. MCP stdio subprocess PID tracking (mcp_tool.py): Tracks child PIDs spawned by stdio_client via before/after snapshots of /proc children. On shutdown, _stop_mcp_loop force-kills any tracked PIDs that survived the SDK's graceful SIGTERM→SIGKILL cleanup. Prevents zombie MCP server processes from accumulating across sessions. 3. MCP event loop exception handler (mcp_tool.py): Installs _mcp_loop_exception_handler on the MCP background event loop — same pattern as the existing _suppress_closed_loop_errors on prompt_toolkit's loop. Suppresses benign 'Event loop is closed' RuntimeError from httpx transport __del__ during MCP shutdown. Salvaged from PR #2538 (acsezen). 4. MCP OAuth non-blocking (mcp_oauth.py): Replaces blocking input() call in _wait_for_callback with OAuthNonInteractiveError raise. Adds _is_interactive() TTY detection. In non-interactive environments, build_oauth_auth() still returns a provider (cached tokens + refresh work), but the callback handler raises immediately instead of blocking the MCP event loop for 120s. Re-raises OAuth setup failures in _run_http so failed servers are reported cleanly without blocking others. Salvaged from PRs #4521 (voidborne-d) and #4465 (heathley). Closes #2537, closes #4462 Related: #4128, #3436	2026-04-03 02:29:20 -07:00
kshitijk4poor	4d99305345	fix(cli): surface recent sessions inside /history and /resume When /history is used in an empty chat or /resume with no argument, show an inline table of recent resumable sessions with title, preview, relative timestamp, and session ID instead of a dead-end message. Table formatting matches the existing hermes sessions list style (column headers + thin separators, no box drawing). Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-04-03 00:50:49 -07:00
Teknium	924bc67eee	feat(memory): pluggable memory provider interface with profile isolation, review fixes, and honcho CLI restoration (#4623 ) * feat(memory): add pluggable memory provider interface with profile isolation Introduces a pluggable MemoryProvider ABC so external memory backends can integrate with Hermes without modifying core files. Each backend becomes a plugin implementing a standard interface, orchestrated by MemoryManager. Key architecture: - agent/memory_provider.py — ABC with core + optional lifecycle hooks - agent/memory_manager.py — single integration point in the agent loop - agent/builtin_memory_provider.py — wraps existing MEMORY.md/USER.md Profile isolation fixes applied to all 6 shipped plugins: - Cognitive Memory: use get_hermes_home() instead of raw env var - Hindsight Memory: check $HERMES_HOME/hindsight/config.json first, fall back to legacy ~/.hindsight/ for backward compat - Hermes Memory Store: replace hardcoded ~/.hermes paths with get_hermes_home() for config loading and DB path defaults - Mem0 Memory: use get_hermes_home() instead of raw env var - RetainDB Memory: auto-derive profile-scoped project name from hermes_home path (hermes-<profile>), explicit env var overrides - OpenViking Memory: read-only, no local state, isolation via .env MemoryManager.initialize_all() now injects hermes_home into kwargs so every provider can resolve profile-scoped storage without importing get_hermes_home() themselves. Plugin system: adds register_memory_provider() to PluginContext and get_plugin_memory_providers() accessor. Based on PR #3825. 46 tests (37 unit + 5 E2E + 4 plugin registration). * refactor(memory): drop cognitive plugin, rewrite OpenViking as full provider Remove cognitive-memory plugin (#727) — core mechanics are broken: decay runs 24x too fast (hourly not daily), prefetch uses row ID as timestamp, search limited by importance not similarity. Rewrite openviking-memory plugin from a read-only search wrapper into a full bidirectional memory provider using the complete OpenViking session lifecycle API: - sync_turn: records user/assistant messages to OpenViking session (threaded, non-blocking) - on_session_end: commits session to trigger automatic memory extraction into 6 categories (profile, preferences, entities, events, cases, patterns) - prefetch: background semantic search via find() endpoint - on_memory_write: mirrors built-in memory writes to the session - is_available: checks env var only, no network calls (ABC compliance) Tools expanded from 3 to 5: - viking_search: semantic search with mode/scope/limit - viking_read: tiered content (abstract ~100tok / overview ~2k / full) - viking_browse: filesystem-style navigation (list/tree/stat) - viking_remember: explicit memory storage via session - viking_add_resource: ingest URLs/docs into knowledge base Uses direct HTTP via httpx (no openviking SDK dependency needed). Response truncation on viking_read to prevent context flooding. * fix(memory): harden Mem0 plugin — thread safety, non-blocking sync, circuit breaker - Remove redundant mem0_context tool (identical to mem0_search with rerank=true, top_k=5 — wastes a tool slot and confuses the model) - Thread sync_turn so it's non-blocking — Mem0's server-side LLM extraction can take 5-10s, was stalling the agent after every turn - Add threading.Lock around _get_client() for thread-safe lazy init (prefetch and sync threads could race on first client creation) - Add circuit breaker: after 5 consecutive API failures, pause calls for 120s instead of hammering a down server every turn. Auto-resets after cooldown. Logs a warning when tripped. - Track success/failure in prefetch, sync_turn, and all tool calls - Wait for previous sync to finish before starting a new one (prevents unbounded thread accumulation on rapid turns) - Clean up shutdown to join both prefetch and sync threads * fix(memory): enforce single external memory provider limit MemoryManager now rejects a second non-builtin provider with a warning. Built-in memory (MEMORY.md/USER.md) is always accepted. Only ONE external plugin provider is allowed at a time. This prevents tool schema bloat (some providers add 3-5 tools each) and conflicting memory backends. The warning message directs users to configure memory.provider in config.yaml to select which provider to activate. Updated all 47 tests to use builtin + one external pattern instead of multiple externals. Added test_second_external_rejected to verify the enforcement. * feat(memory): add ByteRover memory provider plugin Implements the ByteRover integration (from PR #3499 by hieuntg81) as a MemoryProvider plugin instead of direct run_agent.py modifications. ByteRover provides persistent memory via the brv CLI — a hierarchical knowledge tree with tiered retrieval (fuzzy text then LLM-driven search). Local-first with optional cloud sync. Plugin capabilities: - prefetch: background brv query for relevant context - sync_turn: curate conversation turns (threaded, non-blocking) - on_memory_write: mirror built-in memory writes to brv - on_pre_compress: extract insights before context compression Tools (3): - brv_query: search the knowledge tree - brv_curate: store facts/decisions/patterns - brv_status: check CLI version and context tree state Profile isolation: working directory at $HERMES_HOME/byterover/ (scoped per profile). Binary resolution cached with thread-safe double-checked locking. All write operations threaded to avoid blocking the agent (curate can take 120s with LLM processing). * fix(memory): thread remaining sync_turns, fix holographic, add config key Plugin fixes: - Hindsight: thread sync_turn (was blocking up to 30s via _run_in_thread) - RetainDB: thread sync_turn (was blocking on HTTP POST) - Both: shutdown now joins sync threads alongside prefetch threads Holographic retrieval fixes: - reason(): removed dead intersection_key computation (bundled but never used in scoring). Now reuses pre-computed entity_residuals directly, moved role_content encoding outside the inner loop. - contradict(): added _MAX_CONTRADICT_FACTS=500 scaling guard. Above 500 facts, only checks the most recently updated ones to avoid O(n^2) explosion (~125K comparisons at 500 is acceptable). Config: - Added memory.provider key to DEFAULT_CONFIG ("" = builtin only). No version bump needed (deep_merge handles new keys automatically). * feat(memory): extract Honcho as a MemoryProvider plugin Creates plugins/honcho-memory/ as a thin adapter over the existing honcho_integration/ package. All 4 Honcho tools (profile, search, context, conclude) move from the normal tool registry to the MemoryProvider interface. The plugin delegates all work to HonchoSessionManager — no Honcho logic is reimplemented. It uses the existing config chain: $HERMES_HOME/honcho.json -> ~/.honcho/config.json -> env vars. Lifecycle hooks: - initialize: creates HonchoSessionManager via existing client factory - prefetch: background dialectic query - sync_turn: records messages + flushes to API (threaded) - on_memory_write: mirrors user profile writes as conclusions - on_session_end: flushes all pending messages This is a prerequisite for the MemoryManager wiring in run_agent.py. Once wired, Honcho goes through the same provider interface as all other memory plugins, and the scattered Honcho code in run_agent.py can be consolidated into the single MemoryManager integration point. * feat(memory): wire MemoryManager into run_agent.py Adds 8 integration points for the external memory provider plugin, all purely additive (zero existing code modified): 1. Init (~L1130): Create MemoryManager, find matching plugin provider from memory.provider config, initialize with session context 2. Tool injection (~L1160): Append provider tool schemas to self.tools and self.valid_tool_names after memory_manager init 3. System prompt (~L2705): Add external provider's system_prompt_block alongside existing MEMORY.md/USER.md blocks 4. Tool routing (~L5362): Route provider tool calls through memory_manager.handle_tool_call() before the catchall handler 5. Memory write bridge (~L5353): Notify external provider via on_memory_write() when the built-in memory tool writes 6. Pre-compress (~L5233): Call on_pre_compress() before context compression discards messages 7. Prefetch (~L6421): Inject provider prefetch results into the current-turn user message (same pattern as Honcho turn context) 8. Turn sync + session end (~L8161, ~L8172): sync_all() after each completed turn, queue_prefetch_all() for next turn, on_session_end() + shutdown_all() at conversation end All hooks are wrapped in try/except — a failing provider never breaks the agent. The existing memory system, Honcho integration, and all other code paths are completely untouched. Full suite: 7222 passed, 4 pre-existing failures. * refactor(memory): remove legacy Honcho integration from core Extracts all Honcho-specific code from run_agent.py, model_tools.py, toolsets.py, and gateway/run.py. Honcho is now exclusively available as a memory provider plugin (plugins/honcho-memory/). Removed from run_agent.py (-457 lines): - Honcho init block (session manager creation, activation, config) - 8 Honcho methods: _honcho_should_activate, _strip_honcho_tools, _activate_honcho, _register_honcho_exit_hook, _queue_honcho_prefetch, _honcho_prefetch, _honcho_save_user_observation, _honcho_sync - _inject_honcho_turn_context module-level function - Honcho system prompt block (tool descriptions, CLI commands) - Honcho context injection in api_messages building - Honcho params from __init__ (honcho_session_key, honcho_manager, honcho_config) - HONCHO_TOOL_NAMES constant - All honcho-specific tool dispatch forwarding Removed from other files: - model_tools.py: honcho_tools import, honcho params from handle_function_call - toolsets.py: honcho toolset definition, honcho tools from core tools list - gateway/run.py: honcho params from AIAgent constructor calls Removed tests (-339 lines): - 9 Honcho-specific test methods from test_run_agent.py - TestHonchoAtexitFlush class from test_exit_cleanup_interrupt.py Restored two regex constants (_SURROGATE_RE, _BUDGET_WARNING_RE) that were accidentally removed during the honcho function extraction. The honcho_integration/ package is kept intact — the plugin delegates to it. tools/honcho_tools.py registry entries are now dead code (import commented out in model_tools.py) but the file is preserved for reference. Full suite: 7207 passed, 4 pre-existing failures. Zero regressions. * refactor(memory): restructure plugins, add CLI, clean gateway, migration notice Plugin restructure: - Move all memory plugins from plugins/<name>-memory/ to plugins/memory/<name>/ (byterover, hindsight, holographic, honcho, mem0, openviking, retaindb) - New plugins/memory/__init__.py discovery module that scans the directory directly, loading providers by name without the general plugin system - run_agent.py uses load_memory_provider() instead of get_plugin_memory_providers() CLI wiring: - hermes memory setup — interactive curses picker + config wizard - hermes memory status — show active provider, config, availability - hermes memory off — disable external provider (built-in only) - hermes honcho — now shows migration notice pointing to hermes memory setup Gateway cleanup: - Remove _get_or_create_gateway_honcho (already removed in prev commit) - Remove _shutdown_gateway_honcho and _shutdown_all_gateway_honcho methods - Remove all calls to shutdown methods (4 call sites) - Remove _honcho_managers/_honcho_configs dict references Dead code removal: - Delete tools/honcho_tools.py (279 lines, import was already commented out) - Delete tests/gateway/test_honcho_lifecycle.py (131 lines, tested removed methods) - Remove if False placeholder from run_agent.py Migration: - Honcho migration notice on startup: detects existing honcho.json or ~/.honcho/config.json, prints guidance to run hermes memory setup. Only fires when memory.provider is not set and not in quiet mode. Full suite: 7203 passed, 4 pre-existing failures. Zero regressions. * feat(memory): standardize plugin config + add per-plugin documentation Config architecture: - Add save_config(values, hermes_home) to MemoryProvider ABC - Honcho: writes to $HERMES_HOME/honcho.json (SDK native) - Mem0: writes to $HERMES_HOME/mem0.json - Hindsight: writes to $HERMES_HOME/hindsight/config.json - Holographic: writes to config.yaml under plugins.hermes-memory-store - OpenViking/RetainDB/ByteRover: env-var only (default no-op) Setup wizard (hermes memory setup): - Now calls provider.save_config() for non-secret config - Secrets still go to .env via env vars - Only memory.provider activation key goes to config.yaml Documentation: - README.md for each of the 7 providers in plugins/memory/<name>/ - Requirements, setup (wizard + manual), config reference, tools table - Consistent format across all providers The contract for new memory plugins: - get_config_schema() declares all fields (REQUIRED) - save_config() writes native config (REQUIRED if not env-var-only) - Secrets use env_var field in schema, written to .env by wizard - README.md in the plugin directory * docs: add memory providers user guide + developer guide New pages: - user-guide/features/memory-providers.md — comprehensive guide covering all 7 shipped providers (Honcho, OpenViking, Mem0, Hindsight, Holographic, RetainDB, ByteRover). Each with setup, config, tools, cost, and unique features. Includes comparison table and profile isolation notes. - developer-guide/memory-provider-plugin.md — how to build a new memory provider plugin. Covers ABC, required methods, config schema, save_config, threading contract, profile isolation, testing. Updated pages: - user-guide/features/memory.md — replaced Honcho section with link to new Memory Providers page - user-guide/features/honcho.md — replaced with migration redirect to the new Memory Providers page - sidebars.ts — added both new pages to navigation * fix(memory): auto-migrate Honcho users to memory provider plugin When honcho.json or ~/.honcho/config.json exists but memory.provider is not set, automatically set memory.provider: honcho in config.yaml and activate the plugin. The plugin reads the same config files, so all data and credentials are preserved. Zero user action needed. Persists the migration to config.yaml so it only fires once. Prints a one-line confirmation in non-quiet mode. * fix(memory): only auto-migrate Honcho when enabled + credentialed Check HonchoClientConfig.enabled AND (api_key OR base_url) before auto-migrating — not just file existence. Prevents false activation for users who disabled Honcho, stopped using it (config lingers), or have ~/.honcho/ from a different tool. * feat(memory): auto-install pip dependencies during hermes memory setup Reads pip_dependencies from plugin.yaml, checks which are missing, installs them via pip before config walkthrough. Also shows install guidance for external_dependencies (e.g. brv CLI for ByteRover). Updated all 7 plugin.yaml files with pip_dependencies: - honcho: honcho-ai - mem0: mem0ai - openviking: httpx - hindsight: hindsight-client - holographic: (none) - retaindb: requests - byterover: (external_dependencies for brv CLI) * fix: remove remaining Honcho crash risks from cli.py and gateway cli.py: removed Honcho session re-mapping block (would crash importing deleted tools/honcho_tools.py), Honcho flush on compress, Honcho session display on startup, Honcho shutdown on exit, honcho_session_key AIAgent param. gateway/run.py: removed honcho_session_key params from helper methods, sync_honcho param, _honcho.shutdown() block. tests: fixed test_cron_session_with_honcho_key_skipped (was passing removed honcho_key param to _flush_memories_for_session). * fix: include plugins/ in pyproject.toml package list Without this, plugins/memory/ wouldn't be included in non-editable installs. Hermes always runs from the repo checkout so this is belt- and-suspenders, but prevents breakage if the install method changes. * fix(memory): correct pip-to-import name mapping for dep checks The heuristic dep.replace('-', '_') fails for packages where the pip name differs from the import name: honcho-ai→honcho, mem0ai→mem0, hindsight-client→hindsight_client. Added explicit mapping table so hermes memory setup doesn't try to reinstall already-installed packages. * chore: remove dead code from old plugin memory registration path - hermes_cli/plugins.py: removed register_memory_provider(), _memory_providers list, get_plugin_memory_providers() — memory providers now use plugins/memory/ discovery, not the general plugin system - hermes_cli/main.py: stripped 74 lines of dead honcho argparse subparsers (setup, status, sessions, map, peer, mode, tokens, identity, migrate) — kept only the migration redirect - agent/memory_provider.py: updated docstring to reflect new registration path - tests: replaced TestPluginMemoryProviderRegistration with TestPluginMemoryDiscovery that tests the actual plugins/memory/ discovery system. Added 3 new tests (discover, load, nonexistent). * chore: delete dead honcho_integration/cli.py and its tests cli.py (794 lines) was the old 'hermes honcho' command handler — nobody calls it since cmd_honcho was replaced with a migration redirect. Deleted tests that imported from removed code: - tests/honcho_integration/test_cli.py (tested _resolve_api_key) - tests/honcho_integration/test_config_isolation.py (tested CLI config paths) - tests/tools/test_honcho_tools.py (tested the deleted tools/honcho_tools.py) Remaining honcho_integration/ files (actively used by the plugin): - client.py (445 lines) — config loading, SDK client creation - session.py (991 lines) — session management, queries, flush * refactor: move honcho_integration/ into the honcho plugin Moves client.py (445 lines) and session.py (991 lines) from the top-level honcho_integration/ package into plugins/memory/honcho/. No Honcho code remains in the main codebase. - plugins/memory/honcho/client.py — config loading, SDK client creation - plugins/memory/honcho/session.py — session management, queries, flush - Updated all imports: run_agent.py (auto-migration), hermes_cli/doctor.py, plugin __init__.py, session.py cross-import, all tests - Removed honcho_integration/ package and pyproject.toml entry - Renamed tests/honcho_integration/ → tests/honcho_plugin/ * docs: update architecture + gateway-internals for memory provider system - architecture.md: replaced honcho_integration/ with plugins/memory/ - gateway-internals.md: replaced Honcho-specific session routing and flush lifecycle docs with generic memory provider interface docs * fix: update stale mock path for resolve_active_host after honcho plugin migration * fix(memory): address review feedback — P0 lifecycle, ABC contract, honcho CLI restore Review feedback from Honcho devs (erosika): P0 — Provider lifecycle: - Remove on_session_end() + shutdown_all() from run_conversation() tail (was killing providers after every turn in multi-turn sessions) - Add shutdown_memory_provider() method on AIAgent for callers - Wire shutdown into CLI atexit, reset_conversation, gateway stop/expiry Bug fixes: - Remove sync_honcho=False kwarg from /btw callsites (TypeError crash) - Fix doctor.py references to dead 'hermes honcho setup' command - Cache prefetch_all() before tool loop (was re-calling every iteration) ABC contract hardening (all backwards-compatible): - Add session_id kwarg to prefetch/sync_turn/queue_prefetch - Make on_pre_compress() return str (provider insights in compression) - Add *kwargs to on_turn_start() for runtime context - Add on_delegation() hook for parent-side subagent observation - Document agent_context/agent_identity/agent_workspace kwargs on initialize() (prevents cron corruption, enables profile scoping) - Fix docstring: single external provider, not multiple Honcho CLI restoration: - Add plugins/memory/honcho/cli.py (from main's honcho_integration/cli.py with imports adapted to plugin path) - Restore full hermes honcho command with all subcommands (status, peer, mode, tokens, identity, enable/disable, sync, peers, --target-profile) - Restore auto-clone on profile creation + sync on hermes update - hermes honcho setup now redirects to hermes memory setup fix(memory): wire on_delegation, skip_memory for cron/flush, fix ByteRover return type - Wire on_delegation() in delegate_tool.py — parent's memory provider is notified with task+result after each subagent completes - Add skip_memory=True to cron scheduler (prevents cron system prompts from corrupting user representations — closes #4052) - Add skip_memory=True to gateway flush agent (throwaway agent shouldn't activate memory provider) - Fix ByteRover on_pre_compress() return type: None -> str * fix(honcho): port profile isolation fixes from PR #4632 Ports 5 bug fixes found during profile testing (erosika's PR #4632): 1. 3-tier config resolution — resolve_config_path() now checks $HERMES_HOME/honcho.json → ~/.hermes/honcho.json → ~/.honcho/config.json (non-default profiles couldn't find shared host blocks) 2. Thread host=_host_key() through from_global_config() in cmd_setup, cmd_status, cmd_identity (--target-profile was being ignored) 3. Use bare profile name as aiPeer (not host key with dots) — Honcho's peer ID pattern is ^[a-zA-Z0-9_-]+$, dots are invalid 4. Wrap add_peers() in try/except — was fatal on new AI peers, killed all message uploads for the session 5. Gate Honcho clone behind --clone/--clone-all on profile create (bare create should be blank-slate) Also: sanitize assistant_peer_id via _sanitize_id() * fix(tests): add module cleanup fixture to test_cli_provider_resolution test_cli_provider_resolution._import_cli() wipes tools.*, cli, and run_agent from sys.modules to force fresh imports, but had no cleanup. This poisoned all subsequent tests on the same xdist worker — mocks targeting tools.file_tools, tools.send_message_tool, etc. patched the NEW module object while already-imported functions still referenced the OLD one. Caused ~25 cascade failures: send_message KeyError, process_registry FileNotFoundError, file_read_guards timeouts, read_loop_detection file-not-found, mcp_oauth None port, and provider_parity/codex_execution stale tool lists. Fix: autouse fixture saves all affected modules before each test and restores them after, matching the pattern in test_managed_browserbase_and_modal.py.	2026-04-02 15:33:51 -07:00
Teknium	28a073edc6	fix: repair OpenCode model routing and selection (#4508 ) OpenCode Zen and Go are mixed-API-surface providers — different models behind them use different API surfaces (GPT on Zen uses codex_responses, Claude on Zen uses anthropic_messages, MiniMax on Go uses anthropic_messages, GLM/Kimi on Go use chat_completions). Changes: - Add normalize_opencode_model_id() and opencode_model_api_mode() to models.py for model ID normalization and API surface routing - Add _provider_supports_explicit_api_mode() to runtime_provider.py to prevent stale api_mode from leaking across provider switches - Wire opencode routing into all three api_mode resolution paths: pool entry, api_key provider, and explicit runtime - Add api_mode field to ModelSwitchResult for propagation through the switch pipeline - Consolidate _PROVIDER_MODELS from main.py into models.py (single source of truth, eliminates duplicate dict) - Add opencode normalization to setup wizard and model picker flows - Add opencode block to _normalize_model_for_provider in CLI - Add opencode-zen/go fallback model lists to setup.py Tests: 160 targeted tests pass (26 new tests covering normalization, api_mode routing per provider/model, persistence, and setup wizard normalization). Based on PR #3017 by SaM13997. Co-authored-by: SaM13997 <139419381+SaM13997@users.noreply.github.com>	2026-04-02 09:36:24 -07:00
Devorun	f4f64c413f	fix(cli): ensure zero exit code on successful quiet mode queries (#4601 )	2026-04-02 09:33:31 -07:00
Roland Parnaso	c4e626b1fa	refactor: extract _detect_file_drop() + add 28 tests Extract the inline file-drop detection logic into a standalone _detect_file_drop() function at module level for testability. The main loop now calls this function instead of inlining the logic. Tests cover: - Slash commands still route correctly (/help, /quit, /xyz) - Image paths auto-detected (.png, .jpg, .gif, etc.) - Non-image files detected (.py, .txt, Makefile, etc.) - Backslash-escaped spaces from macOS drag-and-drop - Trailing user text preserved as remainder - Edge cases: directories, symlinks, no-extension files - Non-string input, empty strings, nonexistent paths	2026-04-02 00:40:27 -07:00
Roland Parnaso	1841886898	fix(cli): detect dragged file paths instead of treating them as slash commands When a user drags a file into the terminal, macOS pastes the absolute path (e.g. /Users/roland/Desktop/Screenshot.png) which starts with '/' and was incorrectly routed to process_command(), producing an 'Unknown command' error. This change adds file-path detection before the slash-command check: - Parses the first token, handling backslash-escaped spaces from macOS - Checks if the path exists as a real file via Path.exists() - Image files (.png, .jpg, etc.) are auto-attached to the message - Non-image files are reformatted as [User attached file: ...] context - Falls through to normal slash-command handling if not a real file path	2026-04-02 00:40:27 -07:00
Teknium	de9bba8d7c	fix: remove hardcoded OpenRouter/opus defaults No model, base_url, or provider is assumed when the user hasn't configured one. Previously the defaults dict in cli.py, AIAgent constructor args, and several fallback paths all hardcoded anthropic/claude-opus-4.6 + openrouter.ai/api/v1 — silently routing unconfigured users to OpenRouter, which 404s for anyone using a different provider. Now empty defaults force the setup wizard to run, and existing users who already completed setup are unaffected (their config.yaml has the model they chose). Files changed: - cli.py: defaults dict, _DEFAULT_CONFIG_MODEL - run_agent.py: AIAgent.__init__ defaults, main() defaults - hermes_cli/config.py: DEFAULT_CONFIG - hermes_cli/runtime_provider.py: is_fallback sentinel - acp_adapter/session.py: default_model - tests: updated to reflect empty defaults	2026-04-01 15:22:26 -07:00
Teknium	c59ab8b0da	fix: profile model.model promoted to model.default when default not set When a profile config sets model.model but not model.default, the hardcoded default (claude-opus-4.6) survived the config merge and took precedence in HermesCLI.__init__ because it checks model.default first. Profile model configs were silently ignored. Now model.model is promoted to model.default during the merge when the user didn't explicitly set model.default. Fixes #4486.	2026-04-01 13:46:18 -07:00
Teknium	16d9f58445	fix(gateway): persist memory flush state to prevent redundant re-flushes on restart (#4481 ) * fix: force-close TCP sockets on client cleanup, detect and recover dead connections When a provider drops connections mid-stream (e.g. OpenRouter outage), httpx's graceful close leaves sockets in CLOSE-WAIT indefinitely. These zombie connections accumulate and can prevent recovery without restarting. Changes: - _force_close_tcp_sockets: walks the httpx connection pool and issues socket.shutdown(SHUT_RDWR) + close() to force TCP RST on every socket when a client is closed, preventing CLOSE-WAIT accumulation - _cleanup_dead_connections: probes the primary client's pool for dead sockets (recv MSG_PEEK), rebuilds the client if any are found - Pre-turn health check at the start of each run_conversation call that auto-recovers with a user-facing status message - Primary client rebuild after stale stream detection to purge pool - User-facing messages on streaming connection failures: "Connection to provider dropped — Reconnecting (attempt 2/3)" "Connection failed after 3 attempts — try again in a moment" Made-with: Cursor * fix: pool entry missing base_url for openrouter, clean error messages - _resolve_runtime_from_pool_entry: add OPENROUTER_BASE_URL fallback when pool entry has no runtime_base_url (pool entries from auth.json credential_pool often omit base_url) - Replace Rich console.print for auth errors with plain print() to prevent ANSI escape code mangling through prompt_toolkit's stdout patch - Force-close TCP sockets on client cleanup to prevent CLOSE-WAIT accumulation after provider outages - Pre-turn dead connection detection with auto-recovery and user message - Primary client rebuild after stale stream detection - User-facing status messages on streaming connection failures/retries Made-with: Cursor * fix(gateway): persist memory flush state to prevent redundant re-flushes on restart The _session_expiry_watcher tracked flushed sessions in an in-memory set (_pre_flushed_sessions) that was lost on gateway restart. Expired sessions remained in sessions.json and were re-discovered every restart, causing redundant AIAgent runs that burned API credits and blocked the event loop. Fix: Add a memory_flushed boolean field to SessionEntry, persisted in sessions.json. The watcher sets it after a successful flush. On restart, the flag survives and the watcher skips already-flushed sessions. - Add memory_flushed field to SessionEntry with to_dict/from_dict support - Old sessions.json entries without the field default to False (backward compat) - Remove the ephemeral _pre_flushed_sessions set from SessionStore - Update tests: save/load roundtrip, legacy entry compat, auto-reset behavior	2026-04-01 12:05:02 -07:00
kshitijk4poor	935137f0d9	feat: add inline diff previews for write actions Show inline diffs in the CLI transcript when write_file, patch, or skill_manage modifies files. Captures a filesystem snapshot before the tool runs, computes a unified diff after, and renders it with ANSI coloring in the activity feed. Adds tool_start_callback and tool_complete_callback hooks to AIAgent for pre/post tool execution notifications. Also fixes _extract_parallel_scope_path to normalize relative paths to absolute, preventing the parallel overlap detection from missing conflicts when the same file is referenced with different path styles. Gated by display.inline_diffs config option (default: true). Based on PR #3774 by @kshitijk4poor.	2026-04-01 02:13:57 -07:00
Teknium	996250d178	fix(cli): pin entire TUI to bottom of terminal on startup (#4412 ) Replace the per-response padding from PR #4359 (which created a void between short responses and the prompt) with a one-time initial scroll at session start. Prints terminal_height newlines before the banner so the cursor starts at the bottom row — banner, responses, and prompt all appear pinned to the bottom with empty space above, not below. patch_stdout naturally keeps the prompt at the bottom from there, so no per-response padding is needed.	2026-04-01 01:41:09 -07:00
Johannnnn506	9b99ea176e	fix(cli): initialize ctx_len before compact banner path	2026-04-01 01:12:23 -07:00
Teknium	a7f7e87070	fix: preserve credential_pool through smart routing and defer eager fallback on 429 (#4361 ) Three bugs prevented credential pool rotation from working when multiple Codex OAuth tokens were configured: 1. credential_pool was dropped during smart model turn routing. resolve_turn_route() constructed runtime dicts without it, so the AIAgent was created without pool access. Fixed in smart_model_routing.py (no-route and fallback paths), cli.py, and gateway/run.py. 2. Eager fallback fired before pool rotation on 429. The rate-limit handler at line ~7180 switched to a fallback provider immediately, before _recover_with_credential_pool got a chance to rotate to the next credential. Now deferred when the pool still has credentials. 3. (Non-issue) Retry budget was reported as too small, but successful pool rotations already skip retry_count increment — no change needed. Reported by community member Schinsly who identified all three root causes and verified the fix locally with multiple Codex accounts.	2026-04-01 01:02:34 -07:00
Teknium	f8cb54ba04	fix(cli): anchor input prompt near bottom of terminal after responses (#4359 ) After short agent responses, the prompt_toolkit input area sat mid-screen with empty terminal space below it. Now prints padding newlines (half terminal height) after each response to push the prompt toward the bottom. patch_stdout renders the padding above the input area.	2026-03-31 14:56:35 -07:00
Teknium	1b62ad9de7	fix: root-level provider in config.yaml no longer overrides model.provider load_cli_config() had a priority inversion: a stale root-level 'provider' key in config.yaml would OVERRIDE the canonical 'model.provider' set by 'hermes model'. The gateway reads model.provider directly from YAML and worked correctly, but 'hermes chat -q' and the interactive CLI went through the merge logic and picked up the stale root-level key. Fix: root-level provider/base_url are now only used as a fallback when model.provider/model.base_url is not set (never as an override). Also added _normalize_root_model_keys() to config.py load_config() and save_config() — migrates root-level provider/base_url into the model section and removes the root-level keys permanently. Reported by (≧▽≦) in Discord: opencode-go provider persisted as a root-level key and overrode the correct model.provider=openrouter, causing 401 errors.	2026-03-31 12:54:22 -07:00
binhnt92	c94a5fa1b2	fix(cli): use atomic write in save_config_value to prevent config loss on interrupt save_config_value() used bare open(path, 'w') + yaml.dump() which truncates the file to zero bytes on open. If the process is interrupted mid-write, config.yaml is left empty. Replace with atomic_yaml_write() (temp file + fsync + os.replace), matching the gateway config write path. Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-03-31 12:21:55 -07:00
Teknium	57625329a2	docs+feat: comprehensive local LLM provider guides and context length warning (#4294 ) * docs: update llama.cpp section with --jinja flag and tool calling guide The llama.cpp docs were missing the --jinja flag which is required for tool calling to work. Without it, models output tool calls as raw JSON text instead of structured API responses, making Hermes unable to execute them. Changes: - Add --jinja and -fa flags to the server startup example - Replace deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with hermes model interactive setup - Add caution block explaining the --jinja requirement and symptoms - List models with native tool calling support - Add /props endpoint verification tip * docs+feat: comprehensive local LLM provider guides and context length warning Docs (providers.md): - Rewrote Ollama section with context length warning (defaults to 4k on <24GB VRAM), three methods to increase it, and verification steps - Rewrote vLLM section with --max-model-len, tool calling flags (--enable-auto-tool-choice, --tool-call-parser), and context guidance - Rewrote SGLang section with --context-length, --tool-call-parser, and warning about 128-token default max output - Added LM Studio section (port 1234, context length defaults to 2048, tool calling since 0.3.6) - Added llama.cpp context length flag (-c) and GPU offload (-ngl) - Added Troubleshooting Local Models section covering: - Tool calls appearing as text (with per-server fix table) - Silent context truncation and diagnosis commands - Low detected context at startup - Truncated responses - Replaced all deprecated env vars (OPENAI_BASE_URL, LLM_MODEL) with hermes model interactive setup and config.yaml examples - Added deprecation warning for legacy env vars in General Setup Code (cli.py): - Added context length warning in show_banner() when detected context is <= 8192 tokens, with server-specific fix hints: - Ollama (port 11434): suggests OLLAMA_CONTEXT_LENGTH env var - LM Studio (port 1234): suggests model settings adjustment - Other servers: suggests config.yaml override Tests: - 9 new tests covering warning thresholds, server-specific hints, and no-warning cases	2026-03-31 11:42:48 -07:00
Teknium	8d59881a62	feat(auth): same-provider credential pools with rotation, custom endpoint support, and interactive CLI (#2647 ) * feat(auth): add same-provider credential pools and rotation UX Add same-provider credential pooling so Hermes can rotate across multiple credentials for a single provider, recover from exhausted credentials without jumping providers immediately, and configure that behavior directly in hermes setup. - agent/credential_pool.py: persisted per-provider credential pools - hermes auth add/list/remove/reset CLI commands - 429/402/401 recovery with pool rotation in run_agent.py - Setup wizard integration for pool strategy configuration - Auto-seeding from env vars and existing OAuth state Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Salvaged from PR #2647 * fix(tests): prevent pool auto-seeding from host env in credential pool tests Tests for non-pool Anthropic paths and auth remove were failing when host env vars (ANTHROPIC_API_KEY) or file-backed OAuth credentials were present. The pool auto-seeding picked these up, causing unexpected pool entries in tests. - Mock _select_pool_entry in auxiliary_client OAuth flag tests - Clear Anthropic env vars and mock _seed_from_singletons in auth remove test * feat(auth): add thread safety, least_used strategy, and request counting - Add threading.Lock to CredentialPool for gateway thread safety (concurrent requests from multiple gateway sessions could race on pool state mutations without this) - Add 'least_used' rotation strategy that selects the credential with the lowest request_count, distributing load more evenly - Add request_count field to PooledCredential for usage tracking - Add mark_used() method to increment per-credential request counts - Wrap select(), mark_exhausted_and_rotate(), and try_refresh_current() with lock acquisition - Add tests: least_used selection, mark_used counting, concurrent thread safety (4 threads × 20 selects with no corruption) * feat(auth): add interactive mode for bare 'hermes auth' command When 'hermes auth' is called without a subcommand, it now launches an interactive wizard that: 1. Shows full credential pool status across all providers 2. Offers a menu: add, remove, reset cooldowns, set strategy 3. For OAuth-capable providers (anthropic, nous, openai-codex), the add flow explicitly asks 'API key or OAuth login?' — making it clear that both auth types are supported for the same provider 4. Strategy picker shows all 4 options (fill_first, round_robin, least_used, random) with the current selection marked 5. Remove flow shows entries with indices for easy selection The subcommand paths (hermes auth add/list/remove/reset) still work exactly as before for scripted/non-interactive use. * fix(tests): update runtime_provider tests for config.yaml source of truth (#4165) Tests were using OPENAI_BASE_URL env var which is no longer consulted after #4165. Updated to use model config (provider, base_url, api_key) which is the new single source of truth for custom endpoint URLs. * feat(auth): support custom endpoint credential pools keyed by provider name Custom OpenAI-compatible endpoints all share provider='custom', making the provider-keyed pool useless. Now pools for custom endpoints are keyed by 'custom:<normalized_name>' where the name comes from the custom_providers config list (auto-generated from URL hostname). - Pool key format: 'custom:together.ai', 'custom:local-(localhost:8080)' - load_pool('custom:name') seeds from custom_providers api_key AND model.api_key when base_url matches - hermes auth add/list now shows custom endpoints alongside registry providers - _resolve_openrouter_runtime and _resolve_named_custom_runtime check pool before falling back to single config key - 6 new tests covering custom pool keying, seeding, and listing * docs: add Excalidraw diagram of full credential pool flow Comprehensive architecture diagram showing: - Credential sources (env vars, auth.json OAuth, config.yaml, CLI) - Pool storage and auto-seeding - Runtime resolution paths (registry, custom, OpenRouter) - Error recovery (429 retry-then-rotate, 402 immediate, 401 refresh) - CLI management commands and strategy configuration Open at: https://excalidraw.com/#json=2Ycqhqpi6f12E_3ITyiwh,c7u9jSt5BwrmiVzHGbm87g * fix(tests): update setup wizard pool tests for unified select_provider_and_model flow The setup wizard now delegates to select_provider_and_model() instead of using its own prompt_choice-based provider picker. Tests needed: - Mock select_provider_and_model as no-op (provider pre-written to config) - Call _stub_tts BEFORE custom prompt_choice mock (it overwrites it) - Pre-write model.provider to config so the pool step is reached * docs: add comprehensive credential pool documentation - New page: website/docs/user-guide/features/credential-pools.md Full guide covering quick start, CLI commands, rotation strategies, error recovery, custom endpoint pools, auto-discovery, thread safety, architecture, and storage format. - Updated fallback-providers.md to reference credential pools as the first layer of resilience (same-provider rotation before cross-provider) - Added hermes auth to CLI commands reference with usage examples - Added credential_pool_strategies to configuration guide * chore: remove excalidraw diagram from repo (external link only) * refactor: simplify credential pool code — extract helpers, collapse extras, dedup patterns - _load_config_safe(): replace 4 identical try/except/import blocks - _iter_custom_providers(): shared generator for custom provider iteration - PooledCredential.extra dict: collapse 11 round-trip-only fields (token_type, scope, client_id, portal_base_url, obtained_at, expires_in, agent_key_id, agent_key_expires_in, agent_key_reused, agent_key_obtained_at, tls) into a single extra dict with __getattr__ for backward-compatible access - _available_entries(): shared exhaustion-check between select and peek - Dedup anthropic OAuth seeding (hermes_pkce + claude_code identical) - SimpleNamespace replaces class _Args boilerplate in auth_commands - _try_resolve_from_custom_pool(): shared pool-check in runtime_provider Net -17 lines. All 383 targeted tests pass. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-31 03:10:01 -07:00
Teknium	f890a94c12	refactor: make config.yaml the single source of truth for endpoint URLs (#4165 ) OPENAI_BASE_URL was written to .env AND config.yaml, creating a dual-source confusion. Users (especially Docker) would see the URL in .env and assume that's where all config lives, then wonder why LLM_MODEL in .env didn't work. Changes: - Remove all 27 save_env_value("OPENAI_BASE_URL", ...) calls across main.py, setup.py, and tools_config.py - Remove OPENAI_BASE_URL env var reading from runtime_provider.py, cli.py, models.py, and gateway/run.py - Remove LLM_MODEL/HERMES_MODEL env var reading from gateway/run.py and auxiliary_client.py — config.yaml model.default is authoritative - Vision base URL now saved to config.yaml auxiliary.vision.base_url (both setup wizard and tools_config paths) - Tests updated to set config values instead of env vars Convention enforced: .env is for SECRETS only (API keys). All other configuration (model names, base URLs, provider selection) lives exclusively in config.yaml.	2026-03-30 22:02:53 -07:00
Teknium	1bd206ea5d	feat: add /btw command for ephemeral side questions (#4161 ) Adds /btw <question> — ask a quick follow-up using the current session context without interrupting the main conversation. - Snapshots conversation history, answers with a no-tools agent - Response is not persisted to session history or DB - Runs in a background thread (CLI) / async task (gateway) - Per-session guard prevents concurrent /btw in gateway Implementation: - model_tools.py: enabled_toolsets=[] now correctly means "no tools" (was falsy, fell through to default "all tools") - run_agent.py: persist_session=False gates _persist_session() - cli.py: _handle_btw_command (background thread, Rich panel output) - gateway/run.py: _handle_btw_command + _run_btw_task (async task) - hermes_cli/commands.py: CommandDef for "btw" Inspired by PR #3504 by areu01or00, reimplemented cleanly on current main with the enabled_toolsets=[] fix and without the __btw_no_tools__ hack.	2026-03-30 21:10:05 -07:00
Teknium	c1ef9b2250	fix(cli): ensure on_session_end hook fires on interrupted exits (#4159 ) - Add SIGTERM/SIGHUP signal handlers for graceful shutdown - Add BrokenPipeError to exit exception handling (SSH disconnects) - Fire on_session_end plugin hook in finally block, guarded by _agent_running to avoid double-firing on normal exits (the hook already fires per-turn from run_conversation) Co-authored-by: kelsia14 <kelsia14@users.noreply.github.com>	2026-03-30 20:37:17 -07:00
Teknium	13f3e67165	ux: show 'Initializing agent...' on first message (#4086 ) Display a brief status message before the heavy agent initialization (OpenAI client setup, tool loading, memory init, etc.) so users aren't staring at a blank screen for several seconds. Only prints when self.agent is None (first use or after model switch). Closes #4060 Co-authored-by: SHL0MS <SHL0MS@users.noreply.github.com>	2026-03-30 17:05:40 -07:00
Teknium	f93637b3a1	feat: add /profile slash command to show active profile (#4027 ) Adds /profile to COMMAND_REGISTRY (Info category) with handlers in both CLI and gateway. Shows the active profile name and home directory. Works on all platforms — CLI, Telegram, Discord, Slack, etc. Detects profile by checking if HERMES_HOME is under ~/.hermes/profiles/. Shows 'default' when running without a profile.	2026-03-30 13:20:06 -07:00
Teknium	0976bf6cd0	feat: add /yolo slash command to toggle dangerous command approvals (#3990 ) Adds a /yolo command that toggles HERMES_YOLO_MODE at runtime, skipping all dangerous command approval prompts for the current session. Works in both CLI and gateway (Telegram, Discord, etc.). - /yolo -> ON: all commands auto-approved, no confirmation prompts - /yolo -> OFF: normal approval flow restored The --yolo CLI flag already existed for launch-time opt-in. This adds the ability to toggle mid-session without restarting. Session-scoped — resets when the process ends. Uses the existing HERMES_YOLO_MODE env var that check_all_command_guards() already respects.	2026-03-30 11:17:09 -07:00
Teknium	0e592aa5b4	fix(cli): remove input() from /tools disable that freezes the terminal (#3918 ) input() hangs inside prompt_toolkit's TUI event loop — this is a known pitfall (AGENTS.md). The /tools disable and /tools enable commands used input() for a Y/N confirmation prompt, causing the terminal to freeze with no way to type a response. Fix: remove the confirmation prompt. The user typing '/tools disable web' is implicit consent. The change is applied directly with a status message.	2026-03-30 02:53:21 -07:00
Wing Lian	efae525dc5	feat(plugins): add inject_message interface for remote message injection (#3778 )	2026-03-30 02:48:06 -07:00
kshitij	c288bbfb57	fix(cli): prevent status bar wrapping into duplicate rows (#3883 ) - measure status bar display width using prompt_toolkit cell widths - trim rendered status text when fragments would overflow - add a final single-fragment fallback to prevent wrapping - update width assertions to validate display cells instead of len()	2026-03-29 23:59:07 -07:00
Teknium	86ac23c8da	fix(auth): stop silently falling back to OpenRouter when no provider is configured (#3862 ) Previously, when no API keys or provider credentials were found, Hermes silently defaulted to OpenRouter + Claude Opus. This caused confusion when users configured local servers (LM Studio, Ollama, etc.) with a typo or unrecognized provider name — the system would silently route to OpenRouter instead of telling them something was wrong. Changes: - resolve_provider() now raises AuthError when no credentials are found instead of returning 'openrouter' as a silent fallback - Added local server aliases: lmstudio, ollama, vllm, llamacpp → custom - Removed hardcoded 'anthropic/claude-opus-4.6' fallback from gateway and cron scheduler (they read from config.yaml instead) - Updated cli-config.yaml.example with complete provider documentation including all supported providers, aliases, and local server setup	2026-03-29 21:06:35 -07:00
Teknium	e314833c9d	feat(display): configurable tool preview length -- show full paths by default (#3841 ) Tool call previews (paths, commands, queries) were hardcoded to truncate at 35-40 chars across CLI spinners, completion lines, and gateway progress messages. Users could not see full file paths in tool output. New config option: display.tool_preview_length (default 0 = no limit). Set a positive number to truncate at that length. Changes: - display.py: module-level _tool_preview_max_len with getter/setter; build_tool_preview() and get_cute_tool_message() _trunc/_path respect it - cli.py: reads config at startup, spinner widget respects config - gateway/run.py: reads config per-message, progress callback respects config - run_agent.py: removed redundant 30-char quiet-mode spinner truncation - config.py: added display.tool_preview_length to DEFAULT_CONFIG Reported by kriskaminski	2026-03-29 18:02:42 -07:00
Teknium	252fbea005	feat(providers): add ordered fallback provider chain (salvage #1761 ) (#3813 ) Extends the single fallback_model mechanism into an ordered chain. When the primary model fails, Hermes tries each fallback provider in sequence until one succeeds or the chain is exhausted. Config format (new): fallback_providers: - provider: openrouter model: anthropic/claude-sonnet-4 - provider: openai model: gpt-4o Legacy single-dict fallback_model format still works unchanged. Key fix vs original PR: the call sites in the retry loop now use _fallback_index < len(_fallback_chain) instead of the old one-shot _fallback_activated guard, so the chain actually advances through all configured providers. Changes: - run_agent.py: _fallback_chain list + _fallback_index replaces one-shot _fallback_model; _try_activate_fallback() advances through chain; failed provider resolution skips to next entry; call sites updated to allow chain advancement - cli.py: reads fallback_providers with legacy fallback_model compat - gateway/run.py: same - hermes_cli/config.py: fallback_providers: [] in DEFAULT_CONFIG - tests: 12 new chain tests + 6 existing test fixtures updated Co-authored-by: uzaylisak <uzaylisak@users.noreply.github.com>	2026-03-29 16:04:53 -07:00
Teknium	0fd3b59ba1	feat(cli): add Ctrl+Z process suspend support (#3802 ) Adds a Ctrl+Z key binding to suspend the hermes CLI to background using standard Unix job control. Uses prompt_toolkit's run_in_terminal() to properly save/restore terminal state, then sends SIGTSTP to the process group. Prints a branded message with resume instructions. Shows a not-supported notice on Windows. Co-authored-by: CharlieKerfoot <CharlieKerfoot@users.noreply.github.com>	2026-03-29 15:47:55 -07:00
Teknium	f6db1b27ba	feat: add profiles — run multiple isolated Hermes instances (#3681 ) Each profile is a fully independent HERMES_HOME with its own config, API keys, memory, sessions, skills, gateway, cron, and state.db. Core module: hermes_cli/profiles.py (~900 lines) - Profile CRUD: create, delete, list, show, rename - Three clone levels: blank, --clone (config), --clone-all (everything) - Export/import: tar.gz archive for backup and migration - Wrapper alias scripts (~/.local/bin/<name>) - Collision detection for alias names - Sticky default via ~/.hermes/active_profile - Skill seeding via subprocess (handles module-level caching) - Auto-stop gateway on delete with disable-before-stop for services - Tab completion generation for bash and zsh CLI integration (hermes_cli/main.py): - _apply_profile_override(): pre-import -p/--profile flag + sticky default - Full 'hermes profile' subcommand: list, use, create, delete, show, alias, rename, export, import - 'hermes completion bash/zsh' command - Multi-profile skill sync in hermes update Display (cli.py, banner.py, gateway/run.py): - CLI prompt: 'coder ❯' when using a non-default profile - Banner shows profile name - Gateway startup log includes profile name Gateway safety: - Token locks: Discord, Slack, WhatsApp, Signal (extends Telegram pattern) - Port conflict detection: API server, webhook adapter Diagnostics (hermes_cli/doctor.py): - Profile health section: lists profiles, checks config, .env, aliases - Orphan alias detection: warns when wrapper points to deleted profile Tests (tests/hermes_cli/test_profiles.py): - 71 automated tests covering: validation, CRUD, clone levels, rename, export/import, active profile, isolation, alias collision, completion - Full suite: 6760 passed, 0 new failures Documentation: - website/docs/user-guide/profiles.md: full user guide (12 sections) - website/docs/reference/profile-commands.md: command reference (12 commands) - website/docs/reference/faq.md: 6 profile FAQ entries - website/sidebars.ts: navigation updated	2026-03-29 10:41:20 -07:00
Teknium	9f01244137	fix: replace user-facing hardcoded ~/.hermes paths with display_hermes_home() Prep for profiles: user-facing messages now use display_hermes_home() so diagnostic output shows the correct path for each profile. New helper: display_hermes_home() in hermes_constants.py 12 files swept, ~30 user-facing string replacements. Includes dynamic TTS schema description.	2026-03-28 23:47:21 -07:00
Teknium	0bd7e95dfc	fix(honcho): allow self-hosted local instances without API key (#3644 ) Self-hosted Honcho on localhost doesn't require authentication, but both the activation gates and the SDK client required an API key. Combined fix from three contributor PRs: - Relax all 8 activation gates to accept (api_key OR base_url) as valid credentials (#3482 by @cameronbergh) - Use 'local' placeholder for the SDK client when base_url points to localhost/127.0.0.1/::1 (#3570 by @ygd58) Files changed: run_agent.py (2 gates), cli.py (1 gate), gateway/run.py (1 gate), honcho_integration/cli.py (2 gates), hermes_cli/doctor.py (2 gates), honcho_integration/client.py (SDK). Co-authored-by: cameronbergh <cameronbergh@users.noreply.github.com> Co-authored-by: ygd58 <ygd58@users.noreply.github.com> Co-authored-by: devorun <devorun@users.noreply.github.com>	2026-03-28 17:49:56 -07:00
Teknium	bea49e02a3	fix: route /bg spinner through TUI widget to prevent status bar collision (#3643 ) Background agent's KawaiiSpinner wrote \r-based animation and stop() messages through StdoutProxy, colliding with prompt_toolkit's status bar. Two fixes: - display.py: use isinstance(out, StdoutProxy) instead of fragile hasattr+name check for detecting prompt_toolkit's stdout wrapper - cli.py: silence bg agent's raw spinner (_print_fn=no-op) and route thinking updates through the TUI widget only when no foreground agent is active; clear spinner text in finally block with same guard Closes #2718 Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-28 17:29:37 -07:00
Teknium	857a5d7b47	fix: sanitize surrogate characters from clipboard paste to prevent UnicodeEncodeError (#3624 ) Pasting text from rich-text editors (Google Docs, Word, etc.) can inject lone surrogate characters (U+D800..U+DFFF) that are invalid UTF-8. The OpenAI SDK serializes messages with ensure_ascii=False, then encodes to UTF-8 for the HTTP body — surrogates crash this with: UnicodeEncodeError: 'utf-8' codec can't encode character '\udce2' Three-layer fix: 1. Primary: sanitize user_message at the top of run_conversation() 2. CLI: sanitize in chat() before appending to conversation_history 3. Safety net: catch UnicodeEncodeError in the API error handler, sanitize the entire messages list in-place, and retry once. Also exclude UnicodeEncodeError from is_local_validation_error so it doesn't get classified as non-retryable. Includes 14 new tests covering the sanitization helpers and the integration with run_conversation().	2026-03-28 16:53:14 -07:00
Teknium	b029742092	fix(cli): strengthen paste collapse fallback for terminals without bracketed paste (#3625 ) The _on_text_changed fallback only detected pastes when all characters arrived in a single event (chars_added > 1). Some terminals (notably VSCode integrated terminal in certain configs) may deliver paste data differently, causing the fallback to miss. Add a second heuristic: if the newline count jumps by 4+ in a single text-change event, treat it as a paste. Alt+Enter only adds 1 newline per event, so this never false-positives on manual multi-line input. Also fixes: the fallback path was missing _paste_just_collapsed flag set before replacing buffer text, which could cause a re-trigger loop.	2026-03-28 15:40:49 -07:00
Teknium	e4480ff426	fix(config): accept 'model' key as alias for 'default' in model config (#3603 ) Users intuitively write model: { model: my-model } instead of model: { default: my-model } and it silently falls back to the hardcoded default. Now both spellings work across all three config consumers: runtime_provider, CLI, and gateway. Co-authored-by: ygd58 <ygd58@users.noreply.github.com>	2026-03-28 14:55:27 -07:00
Teknium	9a364f2805	fix: cap percentage displays at 100% in stats, gateway, and memory tool (#3599 ) Salvage of PR #3533 (binhnt92). Follow-up to #3480 — applies min(100, ...) to 5 remaining unclamped percentage display sites in context_compressor, cli /stats, gateway /stats, and memory tool. Defensive clamps now that the root cause (estimation heuristic) was already removed in #3480. Co-Authored-By: binhnt92 <binhnt92@users.noreply.github.com>	2026-03-28 14:55:18 -07:00
Teknium	1b2d4f21f3	feat(cli): show resume-by-title command in exit summary (#3607 ) When exiting a session that has a title (auto-generated or manual), the exit summary now also shows: hermes -c "Session Title" alongside the existing hermes --resume <id> command. Also adds the title to the session info block.	2026-03-28 14:54:53 -07:00
Teknium	be39292633	fix(cli): guard .strip() against None values from YAML config (#3552 ) dict.get(key, default) only returns default when key is ABSENT. When YAML has 'key:' with no value, it parses as None — .get() returns None, then .strip() crashes with AttributeError. Use (x or '') pattern to handle both missing and null cases. Salvaged from PR #3217. Co-authored-by: erosika <erosika@users.noreply.github.com>	2026-03-28 11:39:01 -07:00
Teknium	e4e04c2005	fix: make tirith block verdicts approvable instead of hard-blocking (#3428 ) Previously, tirith exit code 1 (block) immediately rejected the command with no approval prompt — users saw 'BLOCKED: Command blocked by security scan' and the agent moved on. This prevented gateway/CLI users from approving pipe-to-shell installs like 'curl ... \| sh' even when they understood the risk. Changes: - Tirith 'block' and 'warn' now both go through the approval flow. Users see the full tirith findings (severity, title, description, safer alternatives) and can choose to approve or deny. - New _format_tirith_description() builds rich descriptions from tirith findings JSON so the approval prompt is informative. - CLI startup now warns when tirith is enabled but not available, so users know command scanning is degraded to pattern matching only. The default approval choice is still deny, so the security posture is unchanged for unattended/timeout scenarios. Reported via Discord by pistrie — 'curl -fsSL https://mandex.dev/install.sh \| sh' was hard-blocked with no way to approve.	2026-03-27 13:22:01 -07:00
Teknium	8ecd7aed2c	fix: prevent reasoning box from rendering 3x during tool-calling loops (#3405 ) Two independent bugs caused the reasoning box to appear three times when the model produced reasoning + tool_calls: Bug A: _build_assistant_message() re-fired reasoning_callback with the full reasoning text even when streaming had already displayed it. The original guard only checked structured reasoning_content deltas, but reasoning also arrives via content tag extraction (<REASONING_SCRATCHPAD>/<think> tags in delta.content), which went through _fire_stream_delta not _fire_reasoning_delta. Fix: skip the callback entirely when streaming is active — both paths display reasoning during the stream. Any reasoning not shown during streaming is caught by the CLI post-response fallback. Bug B: The post-response reasoning display checked _reasoning_stream_started, but that flag was reset by _reset_stream_state() during intermediate turn boundaries (when stream_delta_callback(None) fires between tool calls). Introduced _reasoning_shown_this_turn flag that persists across the tool loop and is only reset at the start of each user turn. Live-tested in PTY: reasoning now shows exactly once per API call, no duplicates across tool-calling loops.	2026-03-27 09:57:50 -07:00
Teknium	e0dbbdb2c9	fix: eliminate 'Event loop is closed' / 'Press ENTER to continue' during idle (#3398 ) The OpenAI SDK's AsyncHttpxClientWrapper.__del__ schedules aclose() via asyncio.get_running_loop().create_task(). When an AsyncOpenAI client is garbage-collected while prompt_toolkit's event loop is running (the common CLI idle state), the aclose() task runs on prompt_toolkit's loop but the underlying TCP transport is bound to a different (dead) worker loop. The transport's self._loop.call_soon() then raises RuntimeError('Event loop is closed'), which prompt_toolkit surfaces as the disruptive 'Unhandled exception in event loop ... Press ENTER to continue...' error. Three-layer fix: 1. neuter_async_httpx_del(): Monkey-patches __del__ to a no-op at CLI startup before any AsyncOpenAI clients are created. Safe because cached clients are explicitly cleaned via _force_close_async_httpx, and uncached clients' TCP connections are cleaned by the OS on exit. 2. Custom asyncio exception handler: Installed on prompt_toolkit's event loop to silently suppress 'Event loop is closed' RuntimeError. Defense-in-depth for SDK upgrades that might change the class name. 3. cleanup_stale_async_clients(): Called after each agent turn (when the agent thread joins) to proactively evict cache entries whose event loop is closed, preventing stale clients from accumulating.	2026-03-27 09:45:25 -07:00
Teknium	1519c4d477	fix(session): add /resume CLI handler, session log truncation guard, reopen_session API (#3315 ) Three improvements salvaged from PR #3225 by Mibayy: 1. Add /resume slash command handler in CLI process_command(). The command was registered in the commands registry but had no handler, so typing /resume produced 'Unknown command'. The handler resolves by title or session ID, ends the current session cleanly, loads conversation history from SQLite, re-opens the target session, and syncs the AIAgent instance. Follows the same pattern as new_session(). 2. Add truncation guard in _save_session_log(). When resuming a session whose messages weren't fully written to SQLite, the agent starts with partial history and the first save would overwrite the full JSON log on disk. The guard reads the existing file and skips the write if it already has more messages than the current batch. 3. Add reopen_session() method to SessionDB. Proper API for clearing ended_at/end_reason instead of reaching into _conn directly. Note: Bug 1 from the original PR (INSERT OR IGNORE + _session_db = None) is already fixed on main — skipped as redundant. Closes #3123.	2026-03-26 19:04:28 -07:00
Teknium	3c57eaf744	fix: YAML boolean handling for tool_progress config (#3300 ) YAML 1.1 parses bare `off` as boolean False, which is falsy in Python's `or` chain and silently falls through to the 'all' default. Users setting `display.tool_progress: off` in config.yaml saw no effect — tool progress stayed on. Normalise False → 'off' before the or chain in both affected paths: - gateway/run.py _run_agent() tool progress reader - cli.py HermesCLI.__init__() tool_progress_mode Reported by @gibbsoft in #2859. Closes #2859.	2026-03-26 17:58:50 -07:00
Teknium	2d232c9991	feat(cli): configurable busy input mode + fix /queue always working (#3298 ) Two changes: 1. Fix /queue command: remove the _agent_running guard that rejected /queue after the agent finished. The prompt was deferred in _pending_input until the agent completed, then the handler checked _agent_running (now False) and rejected it. /queue now always queues regardless of timing. 2. Add display.busy_input_mode config (CLI-only): - 'interrupt' (default): Enter while busy interrupts the current run (preserves existing behavior) - 'queue': Enter while busy queues the message for the next turn, with a 'Queued for the next turn: ...' confirmation Ctrl+C always interrupts regardless of this setting. Salvaged from PR #3037 by StefanoChiodino. Key differences: - Default is 'interrupt' (preserves existing behavior) not 'queue' - No config version bump (unnecessary for new key in existing section) - Simpler normalization (no alias map) - /queue fix is simpler: just remove the guard instead of intercepting commands during busy state	2026-03-26 17:58:40 -07:00
Teknium	716e616d28	fix(tui): status bar duplicates and degrades during long sessions (#3291 ) shutil.get_terminal_size() can return stale/fallback values on SSH that differ from prompt_toolkit's actual terminal width. Fragments built for the wrong width overflow and wrap onto a second line (wrap_lines=True default), appearing as progressively degrading duplicates. - Read width from get_app().output.get_size().columns when inside a prompt_toolkit TUI, falling back to shutil outside TUI context - Add wrap_lines=False on the status bar Window as belt-and-suspenders guard against any future width mismatch Closes #3130 Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>	2026-03-26 17:33:11 -07:00
Teknium	db241ae6ce	feat(sessions): add --source flag for third-party session isolation (#3255 ) When third-party tools (Paperclip orchestrator, etc.) spawn hermes chat as a subprocess, their sessions pollute user session history and search. - hermes chat --source <tag> (also HERMES_SESSION_SOURCE env var) - exclude_sources parameter on list_sessions_rich() and search_messages() - Sessions with source=tool hidden from sessions list/browse/search - Third-party adapters pass --source tool to isolate agent sessions Cherry-picked from PR #3208 by HenkDz. Co-authored-by: Henkey <noonou7@gmail.com>	2026-03-26 14:35:31 -07:00
Teknium	41ee207a5e	fix: catch KeyboardInterrupt in exit cleanup handlers (#3257 ) except Exception does not catch KeyboardInterrupt (inherits from BaseException). A second Ctrl+C during exit cleanup aborts pending writes — Honcho observations dropped, SQLite sessions left unclosed, cron job sessions never marked ended. Changed to except (Exception, KeyboardInterrupt) at all five sites: - cli.py: honcho.shutdown() and end_session() in finally exit block - run_agent.py: _flush_honcho_on_exit atexit handler - cron/scheduler.py: end_session() and close() in job finally block Tests exercise the actual production code paths and confirm KeyboardInterrupt propagates without the fix. Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-26 14:34:31 -07:00
Teknium	62f8aa9b03	fix: MCP toolset resolution for runtime and config (#3252 ) Gateway sessions had their own inline toolset resolution that only read platform_toolsets from config, which never includes MCP server names. MCP tools were discovered and registered but invisible to the model. - Replace duplicated gateway toolset resolution in _run_agent() and _run_background_task() with calls to the shared _get_platform_tools() - Extend _get_platform_tools() to include globally enabled MCP servers at runtime (include_default_mcp_servers=True), while config-editing flows use include_default_mcp_servers=False to avoid persisting implicit MCP defaults into platform_toolsets - Add homeassistant to PLATFORMS dict (was missing, caused KeyError) - Fix CLI entry point to use _get_platform_tools() as well, so MCP tools are visible in CLI mode too - Remove redundant platform_key reassignment in _run_background_task Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-26 13:39:41 -07:00
Teknium	cbf195e806	chore: fix 154 f-strings, simplify getattr/URL patterns, remove dead code (#3119 ) Three categories of cleanup, all zero-behavioral-change: 1. F-strings without placeholders (154 fixes across 29 files) - Converted f'...' to '...' where no {expression} was present - Heaviest files: run_agent.py (24), cli.py (20), honcho_integration/cli.py (34) 2. Simplify defensive patterns in run_agent.py - Added explicit self._is_anthropic_oauth = False in __init__ (before the api_mode branch that conditionally sets it) - Replaced 7x getattr(self, '_is_anthropic_oauth', False) with direct self._is_anthropic_oauth (attribute always initialized now) - Added _is_openrouter_url() and _is_anthropic_url() helper methods - Replaced 3 inline 'openrouter' in self._base_url_lower checks 3. Remove dead code in small files - hermes_cli/claw.py: removed unused 'total' computation - tools/fuzzy_match.py: removed unused strip_indent() function and pattern_stripped variable Full test suite: 6184 passed, 0 failures E2E PTY: banner clean, tool calls work, zero garbled ANSI	2026-03-25 19:47:58 -07:00
Teknium	08d3be0412	fix: graceful return on max retries instead of crashing thread run_conversation raised the raw exception after exhausting retries, which crashed the background thread in cli.py (unhandled exception in Thread). Now returns a proper error result dict with failed=True and persists the session, matching the pattern used by other error paths (invalid responses, empty content, etc.). Also wraps cli.py's run_agent thread function in try/except as a safety net against any future unhandled exceptions from run_conversation. Made-with: Cursor	2026-03-25 19:00:39 -07:00
Teknium	f46542b6c6	fix(cli): read root-level provider and base_url from config.yaml into model config (#3112 ) When users write root-level provider and base_url in config.yaml (instead of nesting under model:), these keys were never merged into defaults['model']. The CLI reads them from CLI_CONFIG['model']['provider'] so root-level keys were silently ignored, causing fallback to OpenRouter. Merge root-level provider and base_url into defaults['model'] after handling the model key, so custom/local provider configs work regardless of nesting. Cherry-picked from PR #2283 by ygd58. Fixes #2281.	2026-03-25 18:38:32 -07:00
Teknium	9783c9d5c1	refactor: remove /model slash command from CLI and gateway (#3080 ) The /model command is removed from both the interactive CLI and messenger gateway (Telegram/Discord/Slack/WhatsApp). Users can still change models via 'hermes model' CLI subcommand or by editing config.yaml directly. Removed: - CommandDef entry from COMMAND_REGISTRY - CLI process_command() handler and model autocomplete logic - Gateway _handle_model_command() and dispatch - SlashCommandCompleter model_completer_provider parameter - Two-stage Tab completion and ghost text for /model - All /model-specific tests Unaffected: - /provider command (read-only, shows current model + providers) - ACP adapter _cmd_model (separate system for VS Code/Zed/JetBrains) - model_switch.py module (used by ACP) - 'hermes model' CLI subcommand Author: Teknium	2026-03-25 17:03:05 -07:00
Teknium	9d1e13019e	fix(cli): prevent TypeError on startup when base_url is None (#3068 ) Description This PR fixes the startup crash introduced in v0.4.0 where `self.base_url` being `None` throws a `TypeError`. Root Cause: At `cli.py:1108`, a membership check (`"openrouter.ai" in self.base_url`) is performed. If a user's config doesn't explicitly set a `base_url` (meaning it's `None`), Python raises a `TypeError: argument of type 'NoneType' is not iterable`, causing the entire CLI to crash on boot. Fix: Added a simple truthiness guard (`if self.base_url and ...`) to ensure the membership check only occurs if `base_url` is a valid string. Closes #2842 Co-authored-by: devorun <130918800+devorun@users.noreply.github.com>	2026-03-25 16:21:00 -07:00
Teknium	841401f588	feat(cli): preserve user input on multiline paste (#3065 ) When pasting 5+ lines, the CLI previously replaced the entire input buffer with a file reference placeholder. If the user had already typed a question, it was lost. Fix: move paste collapsing into handle_paste (BracketedPaste handler) so only the pasted content is saved to file. The placeholder is inserted at the cursor position, preserving existing buffer text. Also fixes: - Multi-ref expansion on submit (re.sub instead of re.match) so multiple paste blocks and surrounding text are all preserved - Double-collapse prevention via _paste_just_collapsed flag - Consistent Unicode arrow character across all paste paths Salvaged from PR #2607 by crazywriter1 (option B: core fix only, without keybinding overrides for solid-object navigation/deletion).	2026-03-25 16:00:36 -07:00
Teknium	77bcaba2d7	refactor: consolidate get_hermes_home() and parse_reasoning_effort() (#3062 ) Centralizes two widely-duplicated patterns into hermes_constants.py: 1. get_hermes_home() — Path resolution for ~/.hermes (HERMES_HOME env var) - Was copy-pasted inline across 30+ files as: Path(os.getenv("HERMES_HOME", Path.home() / ".hermes")) - Now defined once in hermes_constants.py (zero-dependency module) - hermes_cli/config.py re-exports it for backward compatibility - Removed local wrapper functions in honcho_integration/client.py, tools/website_policy.py, tools/tirith_security.py, hermes_cli/uninstall.py 2. parse_reasoning_effort() — Reasoning effort string validation - Was copy-pasted in cli.py, gateway/run.py, cron/scheduler.py - Same validation logic: check against (xhigh, high, medium, low, minimal, none) - Now defined once in hermes_constants.py, called from all 3 locations - Warning log for unknown values kept at call sites (context-specific) 31 files changed, net +31 lines (125 insertions, 94 deletions) Full test suite: 6179 passed, 0 failed	2026-03-25 15:54:28 -07:00
Teknium	8bb1d15da4	chore: remove ~100 unused imports across 55 files (#3016 ) Automated cleanup via pyflakes + autoflake with manual review. Changes: - Removed unused stdlib imports (os, sys, json, pathlib.Path, etc.) - Removed unused typing imports (List, Dict, Any, Optional, Tuple, Set, etc.) - Removed unused internal imports (hermes_cli.auth, hermes_cli.config, etc.) - Fixed cli.py: removed 8 shadowed banner imports (imported from hermes_cli.banner then immediately redefined locally — only build_welcome_banner is actually used) - Added noqa comments to imports that appear unused but serve a purpose: - Re-exports (gateway/session.py SessionResetPolicy, tools/terminal_tool.py is_interrupted/_interrupt_event) - SDK presence checks in try/except (daytona, fal_client, discord) - Test mock targets (auxiliary_client.py Path, mcp_config.py get_hermes_home) Zero behavioral changes. Full test suite passes (6162/6162, 2 pre-existing streaming test failures unrelated to this change).	2026-03-25 15:02:03 -07:00
Teknium	861624d4e9	fix(cli): refresh TUI before background task output to prevent status bar overlap (#3048 ) When a background task (/bg command) prints its output while the main agent is processing with the thinking spinner visible, the status bar could render on the same row as the spinner, causing visual overlap. This fix adds an explicit app.invalidate() call with a brief pause before printing background task output, ensuring the TUI layout is in a consistent state before the output is written. Changes: - Add TUI refresh before success output in _handle_background_command - Add TUI refresh before error output in the exception handler - Add tests for the refresh behavior Closes #2718 Co-authored-by: Bartok9 <bartokmagic@proton.me>	2026-03-25 15:00:33 -07:00
Teknium	e4033b2baf	fix(cli): catch KeyboardInterrupt during flush_memories on exit (#3025 ) KeyboardInterrupt inherits from BaseException, not Exception, so the except Exception: clauses wrapping flush_memories() on exit paths silently skipped the flush when the user pressed Ctrl+C. This could lose conversation memory. Change both call sites to except (Exception, KeyboardInterrupt): so the memory flush is attempted even during interrupt. Salvaged from PR #2855 by RufusLin (dropped unrelated bundled changes).	2026-03-25 12:47:51 -07:00
Teknium	8f6ef042c1	fix(cli): buffer reasoning preview chunks and fix duplicate display (#3013 ) Three improvements to reasoning/thinking display in the CLI: 1. Buffer tiny reasoning chunks: providers like DeepSeek stream reasoning one word at a time, producing a separate [thinking] line per token. Add a buffer that coalesces chunks and flushes at natural boundaries (newlines, sentence endings, terminal width). 2. Fix duplicate reasoning display: centralize callback selection into _current_reasoning_callback() — one place instead of 4 scattered inline ternaries. Prevents both the streaming box AND the preview callback from firing simultaneously. 3. Fix post-response reasoning box guard: change the check from 'not self._stream_started' to 'not self._reasoning_stream_started' so the final reasoning box is only suppressed when reasoning was actually streamed live, not when any text was streamed. Cherry-picked from PR #2781 by juanfradb.	2026-03-25 12:16:39 -07:00
Teknium	52c5e491f5	fix(session): surface silent SessionDB failures that cause session data loss (#2999 ) * fix(session): surface silent SessionDB failures that cause session data loss SessionDB initialization and operation failures were logged at debug level or silently swallowed, causing sessions to never be indexed in the FTS5 database. This made session_search unable to find affected conversations. In practice, ~48% of sessions can be lost without any visible indication. The JSON session files are still written (separate code path), but the SQLite/FTS5 index gets nothing — making session_search return empty results for affected sessions. Changes: - cli.py: Log warnings (not debug) when SessionDB init fails at both __init__ and _start_session entry points - run_agent.py: Log warnings on create_session, append_message, and compression split failures - run_agent.py: Set _session_db = None after create_session failure to fail fast instead of silently dropping every message for the session Root cause: When gateway restarts or DB lock contention occurs during SessionDB() init, the exception is caught and swallowed. The agent continues running normally — JSON session logs are written to disk — but no messages reach the FTS5 index. * fix: use module logger instead of root logging for SessionDB warnings Follow-up to cherry-picked PR #2939 — the original used logging.warning() (root logger) instead of logger.warning() (module logger) in the 5 new warning calls. Module logger preserves the logger hierarchy and shows the correct module name in log output. --------- Co-authored-by: LucidPaths <lc77@outlook.de>	2026-03-25 11:10:19 -07:00
Teknium	fd292e676b	fix: skip KawaiiSpinner when TUI handles tool progress (#2973 ) * docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event The hooks page only documented gateway event hooks (HOOK.yaml system). The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't referenced from the hooks page, which was confusing. Changes: - hooks.md: Add overview table showing both hook systems - hooks.md: Add Plugin Hooks section with available hooks, callback signatures, and example - hooks.md: Add missing session:end gateway event (emitted but undocumented) - hooks.md: Mark pre_llm_call, post_llm_call, on_session_start, on_session_end as planned (defined in VALID_HOOKS but not yet invoked) - hooks.md: Update info box to cross-reference plugin hooks - hooks.md: Fix heading hierarchy (gateway content as subsections) - plugins.md: Add cross-reference to hooks page for full details - plugins.md: Mark planned hooks as (planned) * feat(session_search): add recent sessions mode when query is omitted When session_search is called without a query (or with an empty query), it now returns metadata for the most recent sessions instead of erroring. This lets the agent quickly see what was worked on recently without needing specific keywords. Returns for each session: session_id, title, source, started_at, last_active, message_count, preview (first user message). Zero LLM cost — pure DB query. Current session lineage and child delegation sessions are excluded. The agent can then keyword-search specific sessions if it needs deeper context from any of them. * docs: clarify two-mode behavior in session_search schema description * fix(compression): restore sane defaults and cap summary at 12K tokens - threshold: 0.80 → 0.50 (compress at 50%, not 80%) - target_ratio: 0.40 → 0.20, now relative to threshold not total context (20% of 50% = 10% of context as tail budget) - summary ceiling: 32K → 12K (Gemini can't output more than ~12K) - Updated DEFAULT_CONFIG, config display, example config, and tests * fix: browser_vision ignores auxiliary.vision.timeout config (#2901) * docs: unify hooks documentation — add plugin hooks to hooks page, add session:end event The hooks page only documented gateway event hooks (HOOK.yaml system). The plugins page listed plugin hooks (pre_tool_call, etc.) that weren't referenced from the hooks page, which was confusing. Changes: - hooks.md: Add overview table showing both hook systems - hooks.md: Add Plugin Hooks section with available hooks, callback signatures, and example - hooks.md: Add missing session:end gateway event (emitted but undocumented) - hooks.md: Mark pre_llm_call, post_llm_call, on_session_start, on_session_end as planned (defined in VALID_HOOKS but not yet invoked) - hooks.md: Update info box to cross-reference plugin hooks - hooks.md: Fix heading hierarchy (gateway content as subsections) - plugins.md: Add cross-reference to hooks page for full details - plugins.md: Mark planned hooks as (planned) * fix: browser_vision ignores auxiliary.vision.timeout config browser_vision called call_llm() without passing a timeout parameter, so it always used the 30-second default in auxiliary_client.py. This made vision analysis with local models (llama.cpp, ollama) impossible since they typically need more than 30s for screenshot analysis. Now browser_vision reads auxiliary.vision.timeout from config.yaml (same config key that vision_analyze already uses) and passes it through to call_llm(). Also bumped the default vision timeout from 30s to 120s in both browser_vision and vision_analyze — 30s is too aggressive for local models and the previous default silently failed for anyone running vision locally. Fixes user report from GamerGB1988. * fix(skills): agent-created skills were incorrectly treated as untrusted community content _resolve_trust_level() didn't handle 'agent-created' source, so it fell through to 'community' trust level. Community policy blocks on any caution or dangerous findings, which meant common patterns like curl with env vars, systemctl, crontab, cloudflared references etc. would block skill creation/patching. The agent-created policy row already existed in INSTALL_POLICY with permissive settings (allow caution, ask on dangerous) but was never reached. Now it is. Fixes reports of skill_manage being blocked by security scanner. * fix(cli): enhance real-time reasoning output by forcing flush of long partial lines Updated the reasoning output mechanism to emit complete lines and force-flush long partial lines, ensuring reasoning is visible in real-time even without newlines. This improves user experience during reasoning sessions. * fix: skip KawaiiSpinner when TUI handles tool progress In the interactive CLI, the agent runs with quiet_mode=True and tool_progress_callback set. The quiet_mode condition triggered KawaiiSpinner for every tool call, but the TUI was already handling progress display via the spinner widget. The KawaiiSpinner writes carriage-return animation through StdoutProxy, triggering run_in_terminal() erase/redraw cycles on every flush. These redundant cycles cause the status bar to ghost into terminal scrollback. The thinking spinner already had this guard (checks thinking_callback). This extends the same pattern to the three tool spinner creation sites: concurrent tools, delegate_task, and single tool execution.	2026-03-25 08:33:44 -07:00
Teknium	ad1bf16f28	chore: remove all remaining mini-swe-agent references Complete cleanup after dropping the mini-swe-agent submodule (PR #2804): - Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py - Remove mini-swe-agent health check from hermes doctor - Remove 'minisweagent' from logger suppression lists - Remove litellm/typer/platformdirs from requirements.txt - Remove mini-swe-agent install steps from install.ps1 (Windows) - Remove mini-swe-agent install steps from website docs - Update all stale comments/docstrings referencing mini-swe-agent in terminal_tool.py, tools/__init__.py, code_execution_tool.py, environments/README.md, environments/agent_loop.py - Remove mini_swe_runner from pyproject.toml py-modules (still exists as standalone script for RL training use) - Shrink test_minisweagent_path.py to empty stub The orphaned mini-swe-agent/ directory on disk needs manual removal: rm -rf mini-swe-agent/	2026-03-24 08:19:23 -07:00
Teknium	02b38b93cb	refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804 ) Drop the mini-swe-agent git submodule. All terminal backends now use hermes-agent's own environment implementations directly. Docker backend: - Inline the `docker run -d` container startup (was 15 lines in minisweagent's DockerEnvironment). Our wrapper already handled execute(), cleanup(), security hardening, volumes, and resource limits. Modal backend: - Import swe-rex's ModalDeployment directly instead of going through minisweagent's 90-line passthrough wrapper. - Bake the _AsyncWorker pattern (from environments/patches.py) directly into ModalEnvironment for Atropos compatibility without monkey-patching. Cleanup: - Remove minisweagent_path.py (submodule path resolution helper) - Remove submodule init/install from install.sh and setup-hermes.sh - Remove mini-swe-agent from .gitmodules - environments/patches.py is now a no-op (kept for backward compat) - terminal_tool.py no longer does sys.path hacking for minisweagent - mini_swe_runner.py guards imports (optional, for RL training only) - Update all affected tests to mock the new direct subprocess calls - Update README.md, CONTRIBUTING.md No functionality change — all Docker, Modal, local, SSH, Singularity, and Daytona backends behave identically. 6093 tests pass.	2026-03-24 07:30:25 -07:00
Teknium	2e524272b1	refactor(model): extract shared switch_model() from CLI and gateway handlers Phase 4 of the /model command overhaul. Both the CLI (cli.py) and gateway (gateway/run.py) /model handlers had ~50 lines of duplicated core logic: parsing, provider detection, credential resolution, and model validation. This extracts that pipeline into hermes_cli/model_switch.py. New module exports: - ModelSwitchResult: dataclass with all fields both handlers need - CustomAutoResult: dataclass for bare '/model custom' results - switch_model(): core pipeline — parse → detect → resolve → validate - switch_to_custom_provider(): resolve endpoint + auto-detect model The shared functions are pure (no I/O side effects). Each caller handles its own platform-specific concerns: - CLI: sets self.model/provider/etc, calls save_config_value(), prints - Gateway: writes config.yaml directly, sets env vars, returns markdown Net result: -244 lines from handlers, +234 lines in shared module. The handlers are now ~80 lines each (down from ~150+) and can't drift apart on core logic.	2026-03-24 07:08:07 -07:00
Teknium	b641ee88f4	feat(model): /model command overhaul — Phases 2, 3, 5 * feat(model): persist base_url on /model switch, auto-detect for bare /model custom Phase 2+3 of the /model command overhaul: Phase 2 — Persist base_url on model switch: - CLI: save model.base_url when switching to a non-OpenRouter endpoint; clear it when switching away from custom to prevent stale URLs leaking into the new provider's resolution - Gateway: same logic using direct YAML write Phase 3 — Better feedback and edge cases: - Bare '/model custom' now auto-detects the model from the endpoint using _auto_detect_local_model() and saves all three config values (model, provider, base_url) atomically - Shows endpoint URL in success messages when switching to/from custom providers (both CLI and gateway) - Clear error messages when no custom endpoint is configured - Updated test assertions for the additional save_config_value call Fixes #2562 (Phase 2+3) * feat(model): support custom:name:model triple syntax for named custom providers Phase 5 of the /model command overhaul. Extends parse_model_input() to handle the triple syntax: /model custom:local-server:qwen → provider='custom:local-server', model='qwen' /model custom:my-model → provider='custom', model='my-model' (unchanged) The 'custom:local-server' provider string is already supported by _get_named_custom_provider() in runtime_provider.py, which matches it against the custom_providers list in config.yaml. This just wires the parsing so users can do it from the /model slash command. Added 4 tests covering single, triple, whitespace, and empty model cases.	2026-03-24 06:58:04 -07:00
Teknium	4313b8aff6	fix(cli): ensure single closure of streaming boxes during tool generation - Updated `_on_tool_gen_start` method in `HermesCLI` to close open streaming boxes exactly once, preventing potential multiple closures. - Added a check for `_stream_box_opened` to manage the state of the streaming box more effectively, enhancing user experience during large payload streaming.	2026-03-24 06:33:21 -07:00
Teknium	87e2626cf6	feat(cli, agent): add tool generation callback for streaming updates - Introduced `_on_tool_gen_start` in `HermesCLI` to indicate when tool-call arguments are being generated, enhancing user feedback during streaming. - Updated `AIAgent` to support a new `tool_gen_callback`, notifying the display layer when tool generation starts, allowing for better user experience during large payloads. - Ensured that the callback is triggered appropriately during streaming events to prevent user interface freezing.	2026-03-23 23:10:58 -07:00
Teknium	4ff73fb32c	feat(config): support ${ENV_VAR} substitution in config.yaml (#2684 ) * feat(config): support ${ENV_VAR} substitution in config.yaml * fix: extend env var expansion to CLI and gateway config loaders The original PR (#2680) only wired _expand_env_vars into load_config(), which is used by 'hermes tools' and 'hermes setup'. The two primary config paths were missed: - load_cli_config() in cli.py (interactive CLI) - Module-level _cfg in gateway/run.py (gateway — bridges api_keys to env vars) Also: - Remove redundant 'import re' (already imported at module level) - Add missing blank lines between top-level functions (PEP 8) - Add tests for load_cli_config() expansion --------- Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>	2026-03-23 16:02:06 -07:00
Teknium	f60ebc7bf2	fix: move activated skills line below welcome text Previously 'Activated skills: xxx' was printed above the banner in show_banner(). Now it prints directly after the 'Welcome to Hermes Agent!' line in run(), which is a more natural placement.	2026-03-23 06:20:19 -07:00
Teknium	1b5fb36c9d	fix(cli): allow custom/local endpoints without API key Local LLM servers (llama.cpp, ollama, vLLM, etc.) typically don't require authentication. When a custom base_url is configured but no API key is found, use a placeholder instead of failing with 'Provider resolver returned an empty API key.' The OpenAI SDK accepts any string as api_key, and local servers simply ignore the Authorization header. Fixes issue reported by @ThatWolfieGuy — llama.cpp stopped working after updating because the new runtime provider resolver enforces non-empty API keys even for keyless local endpoints.	2026-03-22 16:08:21 -07:00
Teknium	1f21ef7488	fix(cli): prevent 'Press ENTER to continue...' on exit When AsyncOpenAI clients are garbage-collected after the event loop closes, their AsyncHttpxClientWrapper.__del__ tries to schedule aclose() on the dead loop, causing RuntimeError: Event loop is closed. prompt_toolkit catches this as an unhandled exception and shows 'Press ENTER to continue...' which blocks CLI exit. Fix: Add shutdown_cached_clients() to auxiliary_client.py that marks all cached async clients' underlying httpx transport as CLOSED before GC runs. This prevents __del__ from attempting the aclose() call. - _force_close_async_httpx(): sets httpx AsyncClient._state to CLOSED - shutdown_cached_clients(): iterates _client_cache, closes sync clients normally and marks async clients as closed - Also fix stale client eviction in _get_cached_client to mark evicted async clients as closed (was just del-ing them, triggering __del__) - Call shutdown_cached_clients() from _run_cleanup() in cli.py	2026-03-22 15:31:54 -07:00
Teknium	bfe4baa6ed	chore: remove unused imports, dead code, and stale comments Mechanical cleanup — no behavior changes. Unused imports removed: - model_tools.py: import os - run_agent.py: OPENROUTER_MODELS_URL, get_model_context_length - cli.py: Table, VERSION, RELEASE_DATE, resolve_toolset, get_skill_commands - terminal_tool.py: signal, uuid, tempfile, set_interrupt_event, DANGEROUS_PATTERNS, _load_permanent_allowlist, _detect_dangerous_command Dead code removed: - toolsets.py: print_toolset_tree() (zero callers) - browser_tool.py: _get_session_name() (never called) Stale comments removed: - toolsets.py: duplicated/garbled comment line - web_tools.py: 3 aspirational TODO comments from early development	2026-03-22 08:33:34 -07:00
Mibayy	0698ddb496	fix(compression): remove hardcoded gemini-3-flash-preview as default summary model Closes #2453 The DEFAULT_CONFIG was hardcoding google/gemini-3-flash-preview as the summary_model for context compression. This caused unexpected OpenRouter charges for users who configured a different provider/model, because the compression task would silently fall back to gemini via OpenRouter even when the user's main model was on a different provider. Fix: change summary_model default to empty string. When empty, call_llm() resolves the model through the standard auto-detection chain (auxiliary.compression config -> env vars -> main provider), which correctly uses the user's configured provider and model. Users who want a dedicated cheap model for compression can still explicitly set compression.summary_model in their config.yaml.	2026-03-22 04:36:36 -07:00
Teknium	f69c47d9ae	fix: /stop command crash + UnboundLocalError in streaming media delivery Two fixes: 1. CLI /stop command crashed with 'cannot import name get_registry' — the code imported a non-existent function. Fixed to use the actual process_registry singleton and list_sessions() method. (Reported in #2458 by haiyuzhong1980) 2. Streaming media delivery used undefined 'adapter' variable — our PR #2382 called _deliver_media_from_response(adapter=adapter) but 'adapter' wasn't guaranteed to be defined in that scope. Fixed to resolve via self.adapters.get(source.platform). (Reported in #2424 by 42-evey)	2026-03-22 04:35:27 -07:00
Teknium	8cb7864110	fix: resolve garbled ANSI escape codes in status printouts (#2262 ) (#2448 ) Two related root causes for the '?[33mTool progress: NEW?[0m' garbling reported on kitty, alacritty, ghostty and gnome-console: 1. /verbose label printing used self.console.print() with Rich markup ([yellow]...[/]). self.console is a plain Rich Console() whose output goes directly to sys.stdout, which patch_stdout's StdoutProxy intercepts and mangles raw ANSI sequences. 2. Context pressure status lines (e.g. 'approaching compaction') from AIAgent._safe_print() had the same problem -- _safe_print() was a @staticmethod that always called builtin print(), bypassing the prompt_toolkit renderer entirely. Fix: - Convert AIAgent._safe_print() from @staticmethod to an instance method that delegates to self._print_fn (defaults to builtin print, preserving all non-CLI behaviour). - After the CLI creates its AIAgent instance, wire self.agent._print_fn to the existing _cprint() helper which routes through prompt_toolkit.print_formatted_text(ANSI(text)). - Rewrite the /verbose feedback labels to use hermes_cli.colors.Colors ANSI constants in f-strings and emit them via _cprint() directly, removing the Rich-markup-inside-patch_stdout anti-pattern. Fixes #2262 Co-authored-by: Animesh Mishra <animesh.m.7523@gmail.com>	2026-03-22 04:07:06 -07:00
Teknium	2a5f86ed6d	Merge pull request #2343 from NousResearch/hermes/hermes-31d7db3b feat: @ context references + Honcho config fixes	2026-03-21 16:10:19 -07:00
Teknium	8da410ed95	feat(plugins): add slash command registration for plugins (#2359 ) Plugins can now register slash commands via ctx.register_command() in their register() function. Commands automatically appear in: - /help and COMMANDS_BY_CATEGORY (under 'Plugins' category) - Tab autocomplete in CLI - Telegram bot menu - Slack subcommand mapping - Gateway dispatch Handler signature: handler(args: str) -> str \| None Async handlers are supported in gateway context. Changes: - commands.py: add register_plugin_command() and rebuild_lookups() - plugins.py: add register_command() to PluginContext, track in PluginManager._plugin_commands and LoadedPlugin.commands_registered - cli.py: dispatch plugin commands in process_command() - gateway/run.py: dispatch plugin commands before skill commands - tests: 5 new tests for registration, help, tracking, handler, gateway - docs: update plugins feature page and build guide	2026-03-21 16:00:30 -07:00
Teknium	da44c196b6	feat: @ context references — inline file, folder, diff, git, and URL injection Add @file:path, @folder:dir, @diff, @staged, @git:N, and @url: references that expand inline before the message reaches the LLM. Supports line ranges (@file:main.py:10-50), token budget enforcement (soft warn at 25%, hard block at 50%), and path sandboxing for gateway. Core module from PR #2090 by @kshitijk4poor. CLI and gateway wiring rewritten against current main. Fixed asyncio.run() crash when called from inside a running event loop (gateway). Closes #682.	2026-03-21 15:57:13 -07:00
christopher-kapic	97108db038	fix(cli): pass conversation_history in quiet mode with --resume hermes chat -q 'msg' --resume SESSION_ID loaded the session history but never passed it to run_conversation(), so the model responded without prior context. The interactive mode already does this correctly. Based on work by christopher-kapic in PR #2081. Fixes #2106.	2026-03-21 12:51:34 -07:00
Teknium	29520df44f	revert: remove Shift+Enter keybindings that crash prompt_toolkit Reverts the s-enter and Kitty CSI keybindings from PR #2345/#2346. The s-enter key notation causes 'Invalid key: s-enter' crash on some prompt_toolkit versions, breaking hermes startup entirely.	2026-03-21 10:41:07 -07:00
Teknium	42cef9c282	fix: resolve merge conflict markers in cli.py breaking hermes startup PR #2346 was merged with unresolved git conflict markers (<<<<<<, =======, >>>>>>>) in cli.py at line 6047, causing SyntaxError on startup. Resolved by keeping both the Shift+Enter keybindings and the tab handler.	2026-03-21 10:34:21 -07:00
ygd58	356122e990	fix(cli): handle Kitty keyboard protocol Shift+Enter for Ghostty/WezTerm Kitty-protocol terminals (Ghostty, WezTerm) encode Shift+Enter as CSI 13;2u instead of plain Enter. Without this binding, raw escape characters appear in the input buffer. Adds s-enter and the Kitty escape sequence as newline-insert bindings. Based on work by ygd58 in PR #1798. Fixes #1795. Registry.py apostrophe sanitization change excluded (unrelated scope).	2026-03-21 10:03:55 -07:00
Teknium	c4e787d47b	feat: enable streaming by default in CLI Streaming provides a better UX — tokens appear as they arrive instead of waiting for the full response. show_reasoning remains false so thinking blocks are not streamed to the user.	2026-03-21 09:49:47 -07:00
Teknium	d70e07fc45	refactor(cli): add protected TUI extension hooks for wrapper CLIs Based on PR #1749 by @erosika (reimplemented on current main). Extracts three protected methods from run() so wrapper CLIs can extend the TUI without overriding the entire method: - _get_extra_tui_widgets(): inject widgets between spacer and status bar - _register_extra_tui_keybindings(kb, input_area): add keybindings - _build_tui_layout_children(**widgets): full control over ordering Default implementations reproduce existing layout exactly. The inline HSplit in run() now delegates to _build_tui_layout_children(). 5 tests covering defaults, widget insertion position, and keybinding registration.	2026-03-21 09:42:07 -07:00
Teknium	f4a74d3ac7	fix(honcho): hide session banner when not explicitly configured Add explicitly_configured field to HonchoClientConfig — set when the config has a hosts.hermes block or explicit enabled flag, vs auto-enabled from a stray HONCHO_API_KEY env var. Banner only shows when this is true. Based on #1960 by @erosika, reimplemented without duplicating config parsing.	2026-03-21 08:33:44 -07:00
Teknium	fd1d6c03cb	fix(cli): correct truncated AUXILIARY_WEB_EXTRACT_API_KEY env var name Cherry-picked from PR #2295 by @dlkakbs. The web_extract auxiliary client api_key env var was literally stored as 'AUXILI..._KEY' (dots in the source) instead of the full name. Users configuring an auxiliary web_extract model with an API key would have auth failures because the key was written to a non-existent var.	2026-03-21 07:09:28 -07:00
Teknium	eb537b5db4	fix(cli): prevent multiple reasoning boxes from rendering Added a check to suppress further reasoning rendering once the response box is open, preventing potential overlap of reasoning boxes during late thinking blocks. This enhances the user experience by maintaining a clean output in the CLI.	2026-03-21 06:28:47 -07:00
Teknium	3585019831	feat(cli): enhance user input display with consistent formatting - Added a user bar separator for improved visual clarity when displaying pasted text and user input in the HermesCLI. - Ensured consistent formatting for both multi-line and single-line user inputs, enhancing the overall user experience in the command-line interface. These changes contribute to a more organized and visually appealing output during interactions.	2026-03-20 23:36:49 -07:00
Test	d0ac8d9fc7	chore: remove dead top-level toolsets config key The top-level 'toolsets' key in config.yaml was never read at runtime. Tool selection uses platform_toolsets (per-platform) or the --toolsets CLI flag. The key existed in load_cli_config() defaults and the example config as 'toolsets: [all]', misleading users into thinking it controlled tool availability. - Remove from load_cli_config() hardcoded defaults - Remove from hermes config show output - Replace in cli-config.yaml.example with deprecation note pointing to platform_toolsets and hermes tools	2026-03-20 22:27:13 -07:00
Test	f7e2ed20fa	feat(cli): implement true-color ANSI support for response text - Added support for true-color ANSI escape codes in the HermesCLI to enhance the visual appearance of streamed content. - Introduced a fallback mechanism for text color in case of errors while retrieving the color from the active skin. - Updated the output formatting to include the new text color in both line emissions and buffer flushing. These changes improve the user experience by ensuring consistent and visually appealing text output in the command-line interface.	2026-03-20 21:02:36 -07:00
Teknium	2416b2b7af	refactor(cli, banner): update gold ANSI color to true-color format (#2246 ) - Changed the ANSI escape code for gold color in cli.py and banner.py to use true-color format (#FFD700) for better visual consistency. - Enhanced the _on_tool_progress method in HermesCLI to update the TUI spinner with tool execution status, improving user feedback during operations. These changes improve the visual representation and user experience in the command-line interface. Co-authored-by: Test <test@test.com>	2026-03-20 18:17:38 -07:00
Teknium	8e884fb3f1	Merge pull request #2215 from NousResearch/hermes/hermes-31d7db3b fix: infer provider from base URL for models.dev context length lookup	2026-03-20 12:52:07 -07:00
Test	76bc27199f	fix(cli, agent): improve streaming handling and state management - Updated _stream_delta method in HermesCLI to handle None values, flushing the stream and resetting state for clean tool execution. - Enhanced quiet mode handling in AIAgent to ensure proper display closure before tool execution, preventing display issues with intermediate streamed content. These changes improve the robustness of the streaming functionality and ensure a smoother user experience during tool interactions.	2026-03-20 10:02:42 -07:00
Teknium	66a1942524	feat: add /queue command to queue prompts without interrupting (#2191 ) Adds /queue <prompt> (alias /q) that queues a message for the next turn while the agent is busy, without interrupting the current run. - CLI: /queue <prompt> puts it in _pending_input for the next turn - Gateway: /queue <prompt> creates a pending MessageEvent on the adapter, picked up after the current agent run finishes - Enter still interrupts as usual (no behavior change) - /queue with no prompt shows usage - /queue when agent is idle tells user to just type normally Co-authored-by: Test <test@test.com>	2026-03-20 09:44:27 -07:00
Test	1055d4356a	fix: skip model auto-detection for custom/local providers When the user is on a custom provider (provider=custom, localhost, or 127.0.0.1 endpoint), /model <name> no longer tries to auto-detect a provider switch. The model name changes on the current endpoint as-is. To switch away from a custom endpoint, users must use explicit provider:model syntax (e.g. /model openai-codex:gpt-5.2-codex). A helpful tip is printed when changing models on a custom endpoint. This prevents the confusing case where someone on LM Studio types /model gpt-5.2-codex, the auto-detection tries to switch providers, fails or partially succeeds, and requests still go to the old endpoint. Also fixes the missing prompt_toolkit.auto_suggest mock stub in test_cli_init.py (same issue already fixed in test_cli_new_session.py).	2026-03-20 04:35:17 -07:00
Teknium	b19f5133c3	Merge pull request #2118 from NousResearch/hermes/hermes-e83093f0 feat: show reasoning/thinking blocks when show_reasoning is enabled	2026-03-20 04:35:12 -07:00
Test	b1832faaae	feat: show reasoning/thinking blocks when show_reasoning is enabled - Add <thinking> tag to streaming filter's tag list - When show_reasoning is on, route XML reasoning content to the reasoning display box instead of silently discarding it - Expand _strip_think_blocks to handle all tag variants: <think>, <thinking>, <THINKING>, <reasoning>, <REASONING_SCRATCHPAD>	2026-03-19 19:44:31 -07:00
InB4DevOps	fe331ed9bd	fix: Reset token counters on new session for accurate usage display (#2099 )	2026-03-20 01:21:25 +01:00
Teknium	4c0c7f4c6e	fix: /model command — bare provider names, custom endpoint display Two issues with /model preventing proper provider switching: 1. Bare provider names not detected: typing '/model nous' treated 'nous' as a model name instead of triggering a provider switch. Fixed by adding step 0 in detect_provider_for_model() that checks if the input matches a known provider name/alias (excluding 'custom'/'openrouter' which need explicit model names) and returns that provider's default model. 2. Custom endpoint details hidden: /model (no args) showed '[custom]' with just a usage hint but no endpoint URL or model name. Now displays the configured base_url for custom providers in both CLI and gateway. Note: config base_url and OPENAI_BASE_URL are intentionally NOT cleared on provider switch — dedicated provider paths (nous, anthropic, codex) have their own credential resolution that ignores these, and clearing them would destroy the user's custom endpoint config, preventing switching back. Co-authored-by: Test <test@test.com>	2026-03-19 12:06:48 -07:00
StefanIsMe	04b6ecadc4	feat(cli): Tab now accepts auto-suggestions (ghost text) Previously, Tab only handled dropdown completions. Users seeing gray ghost text from history-based suggestions had no way to accept them with Tab - they had to use Right arrow or Ctrl+E. Now Tab follows priority: 1. Completion menu open → accept selected completion 2. Ghost text suggestion available → accept auto-suggestion 3. Otherwise → start completion menu This matches user intuition that Tab should 'complete what I see.'	2026-03-19 10:40:37 -07:00
cmcleay	bb59057d5d	fix: normalize live Chrome CDP endpoints for browser tools	2026-03-19 10:17:03 -07:00
Teknium	d76fa7fc37	fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051 ) * fix: detect context length for custom model endpoints via fuzzy matching + config override Custom model endpoints (non-OpenRouter, non-known-provider) were silently falling back to 2M tokens when the model name didn't exactly match what the endpoint's /v1/models reported. This happened because: 1. Endpoint metadata lookup used exact match only — model name mismatches (e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss 2. Single-model servers (common for local inference) required exact name match even though only one model was loaded 3. No user escape hatch to manually set context length Changes: - Add fuzzy matching for endpoint model metadata: single-model servers use the only available model regardless of name; multi-model servers try substring matching in both directions - Add model.context_length config override (highest priority) so users can explicitly set their model's context length in config.yaml - Log an informative message when falling back to 2M probe, telling users about the config override option - Thread config_context_length through ContextCompressor and AIAgent init Tests: 6 new tests covering fuzzy match, single-model fallback, config override (including zero/None edge cases). * fix: auto-detect local model name and context length for local servers Cherry-picked from PR #2043 by sudoingX. - Auto-detect model name from local server's /v1/models when only one model is loaded (no manual model name config needed) - Add n_ctx_train and n_ctx to context length detection keys for llama.cpp - Query llama.cpp /props endpoint for actual allocated context (not just training context from GGUF metadata) - Strip .gguf suffix from display in banner and status bar - _auto_detect_local_model() in runtime_provider.py for CLI init Co-authored-by: sudo <sudoingx@users.noreply.github.com> * fix: revert accidental summary_target_tokens change + add docs for context_length config - Revert summary_target_tokens from 2500 back to 500 (accidental change during patching) - Add 'Context Length Detection' section to Custom & Self-Hosted docs explaining model.context_length config override --------- Co-authored-by: Test <test@test.com> Co-authored-by: sudo <sudoingx@users.noreply.github.com>	2026-03-19 06:01:16 -07:00
Test	e7844e9c8d	Merge origin/main, resolve conflicts (self._base_url_lower)	2026-03-18 04:09:00 -07:00
Test	c1750bb32d	feat(cli): add /statusbar command to toggle context bar Adds /statusbar (alias /sb) to show/hide the bottom status bar that displays model name, context usage, and session duration. Uses ConditionalContainer so the bar takes zero space when hidden rather than leaving a blank line.	2026-03-18 03:49:49 -07:00
Test	8422196e89	Merge PR #1879 : feat: integrate GitHub Copilot providers	2026-03-18 03:18:33 -07:00
Teknium	24ac577046	fix: respect model.default from config.yaml for openai-codex provider (#1896 ) When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the provider was openai-codex, _normalize_model_for_provider() would replace it with the latest available codex model because _model_is_default only checked the CLI argument, not the config value. Now _model_is_default is False when config.yaml has a model that differs from the global fallback (anthropic/claude-opus-4.6), so the user's explicit config choice is preserved. Fixes #1887 Co-authored-by: Test <test@test.com>	2026-03-18 02:50:31 -07:00
Test	a8132d1252	fix: respect model.default from config.yaml for openai-codex provider When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the provider was openai-codex, _normalize_model_for_provider() would replace it with the latest available codex model because _model_is_default only checked the CLI argument, not the config value. Now _model_is_default is False when config.yaml has a model that differs from the global fallback (anthropic/claude-opus-4.6), so the user's explicit config choice is preserved. Fixes #1887	2026-03-18 02:24:41 -07:00
max	0c392e7a87	feat: integrate GitHub Copilot providers across Hermes Add first-class GitHub Copilot and Copilot ACP provider support across model selection, runtime provider resolution, CLI sessions, delegated subagents, cron jobs, and the Telegram gateway. This also normalizes Copilot model catalogs and API modes, introduces a Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by resolving Homebrew-installed gh binaries under launchd. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-17 23:40:22 -07:00

1 2 3 4 5 ...

543 Commits