hermes-agent

Author	SHA1	Message	Date
Teknium	816a3ef6f1	Merge pull request #745 from NousResearch/hermes/hermes-f8d56335 feat: browser console tool, annotated screenshots, auto-recording, and dogfood QA skill	2026-03-08 21:29:52 -07:00
teknium1	a8bf414f4a	feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill New browser capabilities and a built-in skill for agent-driven web QA. ## New tool: browser_console Returns console messages (log/warn/error/info) AND uncaught JavaScript exceptions in a single call. Uses agent-browser's 'console' and 'errors' commands through the existing session plumbing. Supports --clear to reset buffers. Verified working in both local and Browserbase cloud modes. ## Enhanced tool: browser_vision(annotate=True) New boolean parameter on browser_vision. When true, agent-browser overlays numbered [N] labels on interactive elements; each [N] maps to ref @eN. Annotation data (element name, role, bounding box) returned alongside the vision analysis. Useful for QA reports and spatial reasoning. ## Config: browser.record_sessions Auto-record browser sessions as WebM video files when enabled: - Starts recording on first browser_navigate - Stops and saves on browser_close - Saves to ~/.hermes/browser_recordings/ - Works in both local and cloud modes (verified) - Disabled by default ## Built-in skill: dogfood Systematic exploratory QA testing for web applications. Teaches the agent a 5-phase workflow: 1. Plan: accept URL, create output dirs, set scope 2. Explore: systematic crawl with annotated screenshots 3. Collect Evidence: screenshots, console errors, JS exceptions 4. Categorize: severity (Critical/High/Medium/Low) and category (Functional/Visual/Accessibility/Console/UX/Content) 5. Report: structured markdown with per-issue evidence Includes: - skills/dogfood/SKILL.md: full workflow instructions - skills/dogfood/references/issue-taxonomy.md: severity/category defs - skills/dogfood/templates/dogfood-report-template.md: report template ## Tests 21 new tests covering: - browser_console message/error parsing, clear flag, empty/failed states - browser_console schema registration - browser_vision annotate schema and flag passing - record_sessions config defaults and recording lifecycle - Dogfood skill file existence and content validation Addresses #315.	2026-03-08 21:28:12 -07:00
Teknium	315f3ea429	Merge pull request #740 from NousResearch/hermes/hermes-3cd7c62d feat: simple fallback model for provider resilience (#737)	2026-03-08 21:16:58 -07:00
teknium1	b7d6eae64c	fix: Signal adapter parity pass — integration gaps, clawdbot features, env var simplification Integration gaps fixed (7 files missing Signal): - cron/scheduler.py: Signal in platform_map (cron delivery was broken) - agent/prompt_builder.py: PLATFORM_HINTS for Signal (agent knows it's on Signal) - toolsets.py: hermes-signal toolset + added to hermes-gateway composite - hermes_cli/status.py: Signal + Slack in platform status display - tools/send_message_tool.py: Signal example in target description - tools/cronjob_tools.py: Signal in delivery option docs + schema - gateway/channel_directory.py: Signal in session-based channel discovery Clawdbot parity features added to signal.py: - Self-message filtering: prevents reply loops by checking sender != account - SyncMessage filtering: ignores sync envelopes (sent transcripts, read receipts) - Edit message support: reads dataMessage from editMessage envelope - Mention rendering: replaces \uFFFC placeholders with @identifier text - Jitter in SSE reconnection backoff (20% randomization, prevents thundering herd) Env var simplification (7 → 4): - Removed SIGNAL_DM_POLICY (DM auth follows standard platform pattern via SIGNAL_ALLOWED_USERS + DM pairing, same as Telegram/Discord) - Removed SIGNAL_GROUP_POLICY (derived from SIGNAL_GROUP_ALLOWED_USERS: not set = disabled, set with IDs = allowlist, set with * = open) - Removed SIGNAL_DEBUG (was setting root logger, removed entirely) - Remaining: SIGNAL_HTTP_URL, SIGNAL_ACCOUNT (required), SIGNAL_ALLOWED_USERS, SIGNAL_GROUP_ALLOWED_USERS (optional) Updated all docs (website, AGENTS.md, signal.md) to match.	2026-03-08 21:00:21 -07:00
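The group-policy derivation described in this commit (unset means disabled, a list of IDs means allowlist, `*` means open) can be sketched roughly as below. The function name and the comma-separated value format are assumptions made for illustration; the adapter's actual parsing may differ.

```python
from typing import Optional, Set, Tuple


def derive_group_policy(raw: Optional[str]) -> Tuple[str, Set[str]]:
    """Hypothetical helper: map SIGNAL_GROUP_ALLOWED_USERS to a policy name."""
    if raw is None:
        return "disabled", set()
    # Assumption: the variable holds a comma-separated list of IDs.
    ids = {part.strip() for part in raw.split(",") if part.strip()}
    if not ids:
        return "disabled", set()
    if "*" in ids:  # set membership, so "*" may appear next to other IDs
        return "open", ids
    return "allowlist", ids
```

Returning the parsed set alongside the policy name lets a caller reuse it for the membership check without parsing the variable a second time.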
teknium1	b3765c28d0	fix: restrict fallback providers to actual hermes providers Remove hallucinated providers (openai, deepseek, together, groq, fireworks, mistral, gemini, nous) from the fallback provider map. These don't exist in hermes-agent's provider system. The real supported providers for fallback are: openrouter (OPENROUTER_API_KEY) zai (ZAI_API_KEY) kimi-coding (KIMI_API_KEY) minimax (MINIMAX_API_KEY) minimax-cn (MINIMAX_CN_API_KEY) For any other OpenAI-compatible endpoint, users can use the base_url + api_key_env overrides in the config. Also adds Kimi User-Agent header for kimi fallback (matching the main provider system).	2026-03-08 20:49:55 -07:00
teknium1	161436cfdd	feat: simple fallback model for provider resilience When the primary model/provider fails after retries (rate limit, overload, auth errors, connection failures), Hermes automatically switches to a configured fallback model for the remainder of the session. Config (in ~/.hermes/config.yaml): fallback_model: provider: openrouter model: anthropic/claude-sonnet-4 Supports all major providers: OpenRouter, OpenAI, Nous, DeepSeek, Together, Groq, Fireworks, Mistral, Gemini — plus custom endpoints via base_url and api_key_env overrides. Design principles: - Dead simple: one fallback model, not a chain - One-shot: switches once, doesn't ping-pong back - Zero new dependencies: uses existing OpenAI client - Minimal code: ~100 lines in run_agent.py, ~5 lines in cli.py/gateway - Three trigger points: max retries exhausted, non-retryable client errors, and invalid response exhaustion Does NOT trigger on context overflow or payload-too-large errors (those are handled by the existing compression system). Addresses #737. 25 new tests, 2492 total passing.	2026-03-08 20:22:33 -07:00
teknium1	24f549a692	feat: add Signal messenger gateway platform (#405) Complete Signal adapter using signal-cli daemon HTTP API. Based on PR #268 by ibhagwan, rebuilt on current main with bug fixes. Architecture: - SSE streaming for inbound messages with exponential backoff (2s→60s) - JSON-RPC 2.0 for outbound (send, typing, attachments, contacts) - Health monitor detects stale SSE connections (120s threshold) - Phone number redaction in all logs and global redact.py Features: - DM and group message support with separate access policies - DM policies: pairing (default), allowlist, open - Group policies: disabled (default), allowlist, open - Attachment download with magic-byte type detection - Typing indicators (8s refresh interval) - 100MB attachment size limit, 8000 char message limit - E.164 phone + UUID allowlist support Integration: - Platform.SIGNAL enum in gateway/config.py - Signal in _is_user_authorized() allowlist maps (gateway/run.py) - Adapter factory in _create_adapter() (gateway/run.py) - user_id_alt/chat_id_alt fields in SessionSource for UUIDs - send_message tool support via httpx JSON-RPC (not aiohttp) - Interactive setup wizard in 'hermes gateway setup' - Connectivity testing during setup (pings /api/v1/check) - signal-cli detection and install guidance Bug fixes from PR #268: - Timestamp reads from envelope_data (not outer wrapper) - Uses httpx consistently (not aiohttp in send_message tool) - SIGNAL_DEBUG scoped to signal logger (not root) - extract_images regex NOT modified (preserves group numbering) - pairing.py NOT modified (no cross-platform side effects) - No dual authorization (adapter defers to run.py for user auth) - Wildcard uses set membership ('*' in set, not list equality) - .zip default for PK magic bytes (not .docx) No new Python dependencies: uses httpx (already core). External requirement: signal-cli daemon (user-installed). Tests: 30 new tests covering config, init, helpers, session source, phone redaction, authorization, and send_message integration. Co-authored-by: ibhagwan <ibhagwan@users.noreply.github.com>	2026-03-08 20:20:35 -07:00
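The SSE reconnection schedule in this commit (exponential backoff from 2s up to 60s), combined with the 20% jitter added in the later parity pass (b7d6eae64c), might look like the sketch below. The function name, the doubling factor, and the symmetric plus/minus interpretation of "20% randomization" are assumptions, not the adapter's confirmed behavior.

```python
import random
from typing import Callable


def reconnect_delay(
    attempt: int,
    base: float = 2.0,
    cap: float = 60.0,
    jitter: float = 0.2,
    rng: Callable[[], float] = random.random,
) -> float:
    """Hypothetical helper: seconds to wait before reconnect number `attempt`."""
    # Double the delay per attempt; clamp the exponent so the product stays small.
    delay = min(cap, base * 2 ** min(attempt, 16))
    # Assumption: jitter spreads the delay symmetrically by +/- `jitter`.
    offset = (rng() * 2 - 1) * jitter
    return delay * (1 + offset)
```

Passing `rng` as a parameter keeps the function deterministic under test; randomizing the delay keeps many clients from reconnecting at the same instant after an outage.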
Teknium	7a8778ac73	Merge pull request #732 from NousResearch/hermes/hermes-2cb83eed docs: comprehensive AGENTS.md audit and corrections	2026-03-08 20:10:32 -07:00
teknium1	763c6d104d	fix: unify gateway session hygiene with agent compression config The gateway had a SEPARATE compression system ('session hygiene') with hardcoded thresholds (100k tokens / 200 messages) that were completely disconnected from the model's context length and the user's compression config in config.yaml. This caused premature auto-compression on Telegram/Discord — triggering at ~60k tokens (from the 200-message threshold) or inconsistent token counts. Changes: - Gateway hygiene now reads model name from config.yaml and uses get_model_context_length() to derive the actual context limit - Compression threshold comes from compression.threshold in config.yaml (default 0.85), same as the agent's ContextCompressor - Removed the message-count-based trigger (was redundant and caused false positives in tool-heavy sessions) - Removed the undocumented session_hygiene config section — the standard compression.* config now controls everything - Env var overrides (CONTEXT_COMPRESSION_THRESHOLD, CONTEXT_COMPRESSION_ENABLED) are respected - Warn threshold is now 95% of model context (was hardcoded 200k) - Updated tests to verify model-aware thresholds, scaling across models, and that message count alone no longer triggers compression For claude-opus-4.6 (200k context) at 85% threshold: gateway hygiene now triggers at 170k tokens instead of the old 100k.	2026-03-08 20:08:02 -07:00
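The threshold arithmetic in this commit can be checked with a short sketch. The function names are hypothetical; only the 0.85 default, the 95% warn ratio, and the 200k example come from the commit message.

```python
def hygiene_trigger_tokens(context_length: int, threshold: float = 0.85) -> int:
    """Hypothetical helper: token count at which gateway hygiene compresses."""
    return round(context_length * threshold)


def hygiene_warn_tokens(context_length: int, warn_ratio: float = 0.95) -> int:
    """Hypothetical helper: token count at which a warning is emitted."""
    return round(context_length * warn_ratio)


# A 200k-context model at the default threshold compresses at 170k tokens.
trigger = hygiene_trigger_tokens(200_000)
```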
teknium1	2d1a1c1c47	refactor: remove redundant 'openai' auxiliary provider, clean up docs The 'openai' provider was redundant — using OPENAI_BASE_URL + OPENAI_API_KEY with provider: 'main' already covers direct OpenAI API. Provider options are now: auto, openrouter, nous, codex, main. - Removed _try_openai(), _OPENAI_AUX_MODEL, _OPENAI_BASE_URL - Replaced openai tests with codex provider tests - Updated all docs to remove 'openai' option and clarify 'main' - 'main' description now explicitly mentions it works with OpenAI API, local models, and any OpenAI-compatible endpoint Tests: 2467 passed.	2026-03-08 18:50:26 -07:00
teknium1	71e81728ac	feat: Codex OAuth vision support + multimodal content adapter The Codex Responses API (chatgpt.com/backend-api/codex) supports vision via gpt-5.3-codex. This was verified with real API calls using image analysis. Changes to _CodexCompletionsAdapter: - Added _convert_content_for_responses() to translate chat.completions multimodal format to Responses API format: - {type: 'text'} → {type: 'input_text'} - {type: 'image_url', image_url: {url: '...'}} → {type: 'input_image', image_url: '...'} - Fixed: removed 'stream' from resp_kwargs (responses.stream() handles it) - Fixed: removed max_output_tokens and temperature (Codex endpoint rejects them) Provider changes: - Added 'codex' as explicit auxiliary provider option - Vision auto-fallback now includes Codex (OpenRouter → Nous → Codex) since gpt-5.3-codex supports multimodal input - Updated docs with Codex OAuth examples Tested with real Codex OAuth token + ~/.hermes/image2.png — confirmed working end-to-end through the full adapter pipeline. Tests: 2459 passed.	2026-03-08 18:44:33 -07:00
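The content translation listed in this commit can be illustrated with a minimal sketch. It follows the two mappings named in the message; passing plain strings and unknown part types through unchanged is an assumption, and the function is not the actual `_convert_content_for_responses()` implementation.

```python
from typing import Any, Dict, List, Union

Content = Union[str, List[Dict[str, Any]]]


def convert_content_sketch(content: Content) -> Content:
    """Translate chat.completions content parts to Responses-style parts."""
    if isinstance(content, str):
        return content  # assumption: plain strings need no conversion
    converted: List[Dict[str, Any]] = []
    for part in content:
        kind = part.get("type")
        if kind == "text":
            converted.append({"type": "input_text", "text": part.get("text", "")})
        elif kind == "image_url":
            # chat.completions nests the URL; the target format flattens it.
            url = part.get("image_url", {}).get("url", "")
            converted.append({"type": "input_image", "image_url": url})
        else:
            converted.append(part)  # assumption: leave unknown parts untouched
    return converted
```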
Teknium	ebe60646db	Merge pull request #735 from NousResearch/hermes/hermes-f8d56335 fix: allow non-codex-suffixed models (e.g. gpt-5.4) with OpenAI Codex provider	2026-03-08 18:30:27 -07:00
teknium1	f996d7950b	fix: trust user-selected models with OpenAI Codex provider The Codex model normalization was rejecting any model without 'codex' in its name, forcing a fallback to gpt-5.3-codex. This blocked models like gpt-5.4 that the Codex API actually supports. The fix simplifies _normalize_model_for_provider() to two operations: 1. Strip provider prefixes (API needs bare slugs) 2. Replace the untouched default model with a Codex-compatible one If the user explicitly chose a model — any model — we trust them and let the API be the judge. No allowlists, no slug checks. Also removes the 'codex not in slug' filter from _read_cache_models() so the local cache preserves all API-available models. Inspired by OpenClaw's approach which explicitly lists non-codex models (gpt-5.4, gpt-5.2) as valid Codex models.	2026-03-08 18:29:09 -07:00
teknium1	ae4a674c84	feat: add 'openai' as auxiliary provider option Users can now set provider: "openai" for auxiliary tasks (vision, web extract, compression) to use OpenAI's API directly with their OPENAI_API_KEY. This hits api.openai.com/v1 with gpt-4o-mini as the default model — supports vision since GPT-4o handles image input. Provider options are now: auto, openrouter, nous, openai, main. Changes: - agent/auxiliary_client.py: added _try_openai(), "openai" case in _resolve_forced_provider(), updated auxiliary_max_tokens_param() to use max_completion_tokens for OpenAI - Updated docs: cli-config.yaml.example, AGENTS.md, and user-facing configuration.md with Common Setups section showing OpenAI, OpenRouter, and local model examples - 3 new tests for OpenAI provider resolution Tests: 2459 passed (was 2429).	2026-03-08 18:25:30 -07:00
teknium1	5ae0b731d0	fix: harden auxiliary model config — gateway bridge, vision safety, tests Improvements on top of PR #606 (auxiliary model configuration): 1. Gateway bridge: Added auxiliary.* and compression.summary_provider config bridging to gateway/run.py so config.yaml settings work from messaging platforms (not just CLI). Matches the pattern in cli.py. 2. Vision auto-fallback safety: In auto mode, vision now only tries OpenRouter + Nous Portal (known multimodal-capable providers). Custom endpoints, Codex, and API-key providers are skipped to avoid confusing errors from providers that don't support vision input. Explicit provider override (AUXILIARY_VISION_PROVIDER=main) still allows using any provider. 3. Comprehensive tests (46 new): - _get_auxiliary_provider env var resolution (8 tests) - _resolve_forced_provider with all provider types (8 tests) - Per-task provider routing integration (4 tests) - Vision auto-fallback safety (7 tests) - Config bridging logic (11 tests) - Gateway/CLI bridge parity (2 tests) - Vision model override via env var (2 tests) - DEFAULT_CONFIG shape validation (4 tests) 4. Docs: Added auxiliary_client.py to AGENTS.md project structure. Updated module docstring with separate text/vision resolution chains. Tests: 2429 passed (was 2383).	2026-03-08 18:06:47 -07:00
teknium1	d9f373654b	feat: enhance auxiliary model configuration and environment variable handling - Added support for auxiliary model overrides in the configuration, allowing users to specify providers and models for vision and web extraction tasks. - Updated the CLI configuration example to include new auxiliary model settings. - Enhanced the environment variable mapping in the CLI to accommodate auxiliary model configurations. - Improved the resolution logic for auxiliary clients to support task-specific provider overrides. - Updated relevant documentation and comments for clarity on the new features and their usage.	2026-03-08 18:06:47 -07:00
Teknium	0efbb137e8	Merge pull request #734 from NousResearch/hermes/hermes-f8d56335 feat: display previous messages when resuming a session in CLI	2026-03-08 18:06:00 -07:00
0xbyt4	d8df91dfa8	fix: resolve merge conflict with main in clipboard.py	2026-03-09 03:50:29 +03:00
teknium1	f88343a6da	Merge PR #733: feat: interactive session browser with search filtering (#718)	2026-03-08 17:47:42 -07:00
teknium1	491605cfea	feat: add high-value tool result hints for patch and search_files (#722 ) Add contextual [Hint: ...] suffixes to tool results where they save real iterations: - patch (no match): suggests read_file/search_files to verify content before retrying — addresses the common pattern where the agent retries with stale old_string instead of re-reading the file. - search_files (truncated): provides explicit next offset and suggests narrowing the search — clearer than relying on total_count inference. Other hints proposed in #722 (terminal, web_search, web_extract, browser_snapshot, search zero-results, search content-matches) were evaluated and found to be low-value: either already covered by existing mechanisms (read_file pagination, similar-files, schema descriptions) or guidance the agent already follows from its own reasoning. 5 new tests covering hint presence/absence for both tools.	2026-03-08 17:46:28 -07:00
teknium1	3aded1d4e5	feat: display previous messages when resuming a session in CLI When resuming a session via --continue or --resume, show a compact recap of the previous conversation inside a Rich panel before the input prompt. This gives users immediate visual context about what was discussed. Changes: - Add _preload_resumed_session() to load session history early (in run(), before banner) so _init_agent() doesn't need a separate DB round-trip - Add _display_resumed_history() that renders a formatted recap panel: * User messages shown with gold bullet (truncated at 300 chars) * Assistant responses shown with green diamond (truncated at 200 chars / 3 lines) * Tool calls collapsed to count + tool names * System messages and tool results hidden * <REASONING_SCRATCHPAD> blocks stripped from display * Pure-reasoning messages (no visible output) skipped entirely * Capped at last 10 exchanges with 'N earlier messages' indicator * Dim/muted styling distinguishes recap from active conversation - Add display.resume_display config option: 'full' (default) or 'minimal' - Store resume_display as instance variable (like compact) for testability - 27 new tests covering all display scenarios, config, and edge cases Closes #719	2026-03-08 17:45:45 -07:00
teknium1	4f0402ed3a	chore: remove all NOUS_API_KEY references NOUS_API_KEY is unused — vision tools use OPENROUTER_API_KEY or Nous Portal OAuth (auth.json), and MoA tools use OPENROUTER_API_KEY. Removed from: - hermes_cli/config.py: api_keys allowlist for config set routing - .env.example: example env file entry and comment - tests/hermes_cli/test_set_config_value.py: parametrize test data - tests/integration/test_web_tools.py: updated comments and log messages to reference 'auxiliary LLM provider' instead of NOUS_API_KEY No HECATE references found in codebase (already cleaned up).	2026-03-08 17:45:38 -07:00
teknium1	ecac6321c4	feat: interactive session browser with search filtering (#718 ) Add `hermes sessions browse` — a curses-based interactive session picker with live type-to-search filtering, arrow key navigation, and seamless session resume via Enter. Features: - Arrow keys to navigate, Enter to select and resume, Esc/q to quit - Type characters to live-filter sessions by title, preview, source, or ID - Backspace to edit filter, first Esc clears filter, second Esc exits - Adaptive column layout (title/preview, last active, source, ID) - Scrolling support for long session lists - --source flag to filter by platform (cli, telegram, discord, etc.) - --limit flag to control how many sessions to load (default: 50) - Windows fallback: numbered list with input prompt - After selection, seamlessly execs into `hermes --resume <id>` Design decisions: - Separate subcommand (not a flag on -c) — preserves `hermes -c` as-is for instant most-recent-session resume - Uses curses (not simple_term_menu) per Known Pitfalls to avoid the arrow-key ghost-duplication rendering bug in tmux/iTerm - Follows existing curses pattern from hermes_cli/tools_config.py Also fixes: removed redundant `import os` inside cmd_sessions stats block that shadowed the module-level import (would cause UnboundLocalError if browse action was taken in the same function). Tests: 33 new tests covering curses picker, fallback mode, filtering, navigation, edge cases, and argument parser registration.	2026-03-08 17:42:50 -07:00
teknium1	97b1c76b14	test: add regression test for #712 (setup wizard codex import) Verifies that setup.py imports the correct function name (get_codex_model_ids) from codex_models.py. This would have caught the ImportError bug before it reached users.	2026-03-08 17:32:52 -07:00
teknium1	c0520223fd	fix: clipboard BMP conversion file loss and broken test Source code (hermes_cli/clipboard.py): - _convert_to_png() lost the file when both Pillow and ImageMagick were unavailable: path.rename(tmp) moved the file to .bmp, then subprocess.run raised FileNotFoundError, but the file was never renamed back. The final fallback 'return path.exists()' returned False. - Fix: restore the original file in both except handlers by renaming tmp back to path when the original is missing. Test (tests/tools/test_clipboard.py): - test_file_still_usable_when_no_converter expected 'from PIL import Image' to raise an Exception, but Pillow is installed so pytest.raises fired 'DID NOT RAISE'. The test also never called _convert_to_png(). - Fix: properly mock PIL unavailability via patch.dict(sys.modules), actually call _convert_to_png(), and assert the correct result.	2026-03-08 17:22:27 -07:00
teknium1	2e73a9e893	Merge PR #704: fix: initialize Skills Hub before listing skills Authored by PeterFile. Fixes #703.	2026-03-08 17:10:54 -07:00
teknium1	26bb56b775	feat: add /resume command to gateway for switching to named sessions Messaging users can now switch back to previously-named sessions: - /resume My Project — resolves the title (with auto-lineage) and restores that session's conversation history - /resume (no args) — lists recent titled sessions to choose from Adds SessionStore.switch_session() which ends the current session and points the session entry at the target session ID so the old transcript is loaded on the next message. Running agents are cleared on switch. Completes the session naming feature from PR #720 for gateway users. 8 new tests covering: name resolution, lineage auto-latest, already-on- session check, nonexistent names, agent cleanup, no-DB fallback, and listing titled sessions.	2026-03-08 17:09:00 -07:00
teknium1	95b1130485	fix: normalize incompatible models when provider resolves to Codex When _ensure_runtime_credentials() resolves the provider to openai-codex, check if the active model is Codex-compatible. If not (e.g. the default anthropic/claude-opus-4.6), swap it for the best available Codex model. Also strips provider prefixes the Codex API rejects (openai/gpt-5.3-codex → gpt-5.3-codex). Adds _model_is_default flag so warnings are only shown when the user explicitly chose an incompatible model (not when it's the config default). Fixes #651. Co-inspired-by: stablegenius49 (PR #661) Co-inspired-by: teyrebaz33 (PR #696)	2026-03-08 16:48:56 -07:00
teknium1	3fb8938cd3	fix: search_files now reports error for non-existent paths instead of silent empty results Previously, search_files would silently return 0 results when the search path didn't exist (e.g., /root/.hermes/... when HOME is /home/user). The path was passed to rg/grep/find which would fail silently, and the empty stdout was parsed as 'no matches found'. Changes: - Add path existence check at the top of search() using test -e. Returns SearchResult with a clear error message when path doesn't exist. - Add exit code 2 checks in _search_with_rg() and _search_with_grep() as secondary safety net for other error types (bad regex, permissions). - Add 4 new tests covering: nonexistent path (content mode), nonexistent path (files mode), existing path proceeds normally, rg error exit code. Tests: 37 → 41 in test_file_operations.py, full suite 2330 passed.	2026-03-08 16:47:20 -07:00
teknium1	34b4fe495e	fix: add title validation — sanitize, length limit, control char stripping - Add SessionDB.sanitize_title() static method: - Strips ASCII control chars (null, bell, ESC, etc.) except whitespace - Strips problematic Unicode controls (zero-width, RTL override, BOM) - Collapses whitespace runs, strips edges - Normalizes empty/whitespace-only to None - Enforces 100 char max length (raises ValueError) - set_session_title() now calls sanitize_title() internally, so all call sites (CLI, gateway, auto-lineage) are protected - CLI /title handler sanitizes early to show correct feedback - Gateway /title handler sanitizes early to show correct feedback - 24 new tests: sanitize_title (17 cases covering control chars, zero-width, RTL, BOM, emoji, CJK, length, integration), gateway validation (too long, control chars, only-control-chars)	2026-03-08 15:54:51 -07:00
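The sanitization rules listed in this commit could be approximated as below. This is a sketch under assumptions: the exact character ranges, the error message, and the order of steps are illustrative and not taken from `SessionDB.sanitize_title()` itself.

```python
import re
from typing import Optional

MAX_TITLE_LENGTH = 100  # limit stated in the commit message

# ASCII control characters except tab, newline, and carriage return.
_ASCII_CONTROLS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f]")
# Assumed set: zero-width characters, directional marks and overrides, BOM.
_UNICODE_CONTROLS = re.compile(r"[\u200b-\u200f\u202a-\u202e\u2060\ufeff]")


def sanitize_title_sketch(title: Optional[str]) -> Optional[str]:
    """Return a cleaned title, None if nothing is left, or raise if too long."""
    if title is None:
        return None
    cleaned = _ASCII_CONTROLS.sub("", title)
    cleaned = _UNICODE_CONTROLS.sub("", cleaned)
    cleaned = re.sub(r"\s+", " ", cleaned).strip()  # collapse whitespace runs
    if not cleaned:
        return None
    if len(cleaned) > MAX_TITLE_LENGTH:
        raise ValueError(f"title exceeds {MAX_TITLE_LENGTH} characters")
    return cleaned
```

Checking the length after stripping means invisible characters do not count against the limit.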
teknium1	4fdd6c0dac	fix: harden session title system + add /title to gateway - Empty string titles normalized to None (prevents uncaught IntegrityError when two sessions both get empty-string titles via the unique index) - Escape SQL LIKE wildcards (%, _) in resolve_session_by_title and get_next_title_in_lineage to prevent false matches on titles like 'test_project' matching 'testXproject #2' - Optimize list_sessions_rich from N+2 queries to a single query with correlated subqueries (preview + last_active computed in SQL) - Add /title slash command to gateway (Telegram, Discord, Slack, WhatsApp) with set and show modes, uniqueness conflict handling - Add /title to gateway /help text and _known_commands - 12 new tests: empty string normalization, multi-empty-title safety, SQL wildcard edge cases, gateway /title set/show/conflict/cross-platform	2026-03-08 15:48:09 -07:00
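The LIKE-wildcard problem fixed here is easy to reproduce with the standard `sqlite3` module. The sketch below assumes a single-column `sessions` table and hypothetical helper names; it only demonstrates the escaping technique, not the project's actual queries.

```python
import sqlite3
from typing import List


def escape_like(text: str, escape: str = "\\") -> str:
    """Escape LIKE wildcards so % and _ match literally."""
    return (
        text.replace(escape, escape + escape)
        .replace("%", escape + "%")
        .replace("_", escape + "_")
    )


def find_continuations(conn: sqlite3.Connection, base_title: str) -> List[str]:
    """Hypothetical query: titles of the form '<base_title> #N'."""
    pattern = escape_like(base_title) + " #%"
    rows = conn.execute(
        "SELECT title FROM sessions WHERE title LIKE ? ESCAPE '\\'",
        (pattern,),
    ).fetchall()
    return [row[0] for row in rows]
```

Without the escape step, the `_` in `test_project` matches any single character, so `testXproject #2` is returned as a false match.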
teknium1	60b6abefd9	feat: session naming with unique titles, auto-lineage, rich listing, resume by name - Schema v4: unique title index, migration from v2/v3 - set/get/resolve session titles with uniqueness enforcement - Auto-lineage: context compression auto-numbers titles (Task -> Task #2 -> Task #3) - resolve_session_by_title: auto-latest finds most recent continuation - list_sessions_rich: preview (first 60 chars) + last_active timestamp - CLI: -c accepts optional name arg (hermes -c 'my project') - CLI: /title command with deferred mode (set before session exists) - CLI: sessions list shows Title, Preview, Last Active, ID - 27 new tests (1844 total passing)	2026-03-08 15:20:29 -07:00
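The auto-lineage numbering (Task -> Task #2 -> Task #3) can be sketched as a pure function over the set of existing titles. The name `next_lineage_title` and the in-memory set are assumptions for illustration; the real implementation works against the session database.

```python
import re
from typing import Iterable


def next_lineage_title(title: str, existing: Iterable[str]) -> str:
    """Hypothetical helper: next free title in a lineage such as 'Task #3'."""
    base = re.sub(r" #\d+$", "", title)  # drop an existing ' #N' suffix
    numbered = re.compile(re.escape(base) + r" #(\d+)")
    highest = 0
    for candidate in existing:
        if candidate == base:
            highest = max(highest, 1)  # the unnumbered title counts as #1
        else:
            match = numbered.fullmatch(candidate)
            if match:
                highest = max(highest, int(match.group(1)))
    return base if highest == 0 else f"{base} #{highest + 1}"
```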
0xbyt4	0c3253a485	fix: mock asyncio.run in mirror test to prevent event loop destruction asyncio.run() closes the event loop after execution, which breaks subsequent tests using asyncio.get_event_loop() (test_send_image_file).	2026-03-09 00:20:19 +03:00
0xbyt4	d0f84c0964	fix: log exceptions instead of silently swallowing in cron scheduler Two 'except Exception: pass' blocks silently hide failures: - mirror_to_session failure: user's message never gets mirrored, no trace - config.yaml parse failure: wrong model used silently Replace with logger.warning so failures are visible in logs.	2026-03-09 00:06:34 +03:00
0xbyt4	67421ed74f	fix: update test_non_empty_has_markers to match todo filtering behavior Completed/cancelled items are now filtered from format_for_injection() output. Update the existing test to verify active items appear and completed items are excluded.	2026-03-08 23:07:38 +03:00
0xbyt4	e2fe1373f3	fix: escalate read/search blocking, track search loops, filter completed todos - Block file reads after 3+ re-reads of same region (no content returned) - Track search_files calls and block repeated identical searches - Filter completed/cancelled todos from post-compression injection to prevent agent from re-doing finished work - Add 10 new tests covering all three fixes	2026-03-08 23:01:21 +03:00
0xbyt4	9eee529a7f	fix: detect and warn on file re-read loops after context compression When context compression summarizes conversation history, the agent loses track of which files it already read and re-reads them in a loop. Users report the agent reading the same files endlessly without writing. Root cause: context compression is lossy — file contents and read history are lost in the summary. After compression, the model thinks it hasn't examined the files yet and reads them again. Fix (two-part): 1. Track file reads per task in file_tools.py. When the same file region is read again, include a _warning in the response telling the model to stop re-reading and use existing information. 2. After context compression, inject a structured message listing all files already read in the session with explicit "do NOT re-read" instruction, preserving read history across compression boundaries. Adds 16 tests covering warning detection, task isolation, summary accuracy, tracker cleanup, and compression history injection.	2026-03-08 20:44:42 +03:00
Verne	333e4abe30	fix: Initialize Skills Hub on list Call ensure_hub_dirs() at the start of hermes skills list so the Skills Hub directory structure is created before reading hub metadata. Add a regression test covering the empty-home path where doctor recommends running the list command. Refs: #703	2026-03-09 01:43:59 +08:00
teknium1	cd77c7100c	Merge PR #648: test: add regression coverage for compressor tool-call boundaries Authored by intertwine. Related to #647.	2026-03-08 06:46:50 -07:00
teknium1	cf810c2950	fix: pre-process CLI clipboard images through vision tool instead of raw embedding Images pasted in the CLI were embedded as raw base64 image_url content parts in the conversation history, which only works with vision-capable models. If the main model (e.g. Nous API) doesn't support vision, this breaks the request and poisons all subsequent messages. Now the CLI uses the same approach as the messaging gateway: images are pre-processed through the auxiliary vision model (Gemini Flash via OpenRouter or Nous Portal) and converted to text descriptions. The local file path is included so the agent can re-examine via vision_analyze if needed. Works with any model. Fixes #638.	2026-03-08 06:22:00 -07:00
teknium1	a23bcb81ce	fix: improve /model user feedback + update docs User messaging improvements: - Rejection: '(>_<) Error: not a valid model' instead of '(^_^) Warning: Error:' - Rejection: shows 'Model unchanged' + tip about /model and /provider - Session-only: explains 'this session only' with reason and 'will revert on restart' - Saved: clear '(saved to config)' confirmation Docs updated: - cli-commands.md, cli.md, messaging/index.md: /model now shows provider:model syntax, /provider command added to tables Test fixes: deduplicated test names, assertions match new messages.	2026-03-08 06:13:12 -07:00
stablegenius49	d07d867718	Fix empty tool selection persistence	2026-03-08 06:11:18 -07:00
teknium1	666f2dd486	feat: /provider command + fix gateway bugs + harden parse_model_input /provider command (CLI + gateway): Shows all providers with auth status (✓/✗), aliases, and active marker. Users can now discover what provider names work with provider:model syntax. Gateway bugs fixed: - Config was saved even when validation.persist=False (told user 'session only' but actually persisted the unvalidated model) - HERMES_INFERENCE_PROVIDER env var not set on provider switch, causing the switch to be silently overridden if that env var was already set parse_model_input hardened: - Colon only treated as provider delimiter if left side is a recognized provider name or alias. 'anthropic/claude-3.5-sonnet:beta' now passes through as a model name instead of trying provider='anthropic/claude-3.5-sonnet'. - HTTP URLs, random colons no longer misinterpreted. 56 tests passing across model validation, CLI commands, and integration.	2026-03-08 06:09:36 -07:00
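The hardened colon handling described in this commit can be sketched as follows. The provider and alias tables are small illustrative subsets drawn from names that appear in this log, and `split_provider_model` is a hypothetical stand-in for `parse_model_input`, whose real signature is not shown here.

```python
from typing import Tuple

# Illustrative subsets only; the real tables live in the project.
KNOWN_PROVIDERS = {"openrouter", "nous", "zai", "kimi-coding"}
PROVIDER_ALIASES = {"glm": "zai", "kimi": "kimi-coding"}


def split_provider_model(raw: str, current_provider: str) -> Tuple[str, str]:
    """Treat 'left:right' as provider:model only if left is a known provider."""
    text = raw.strip()
    left, sep, right = text.partition(":")
    if sep and right.strip():
        key = left.strip().lower()
        provider = PROVIDER_ALIASES.get(key, key)
        if provider in KNOWN_PROVIDERS:
            return provider, right.strip()
    # Otherwise the colon belongs to the model name (or a URL): pass through.
    return current_provider, text
```

With this rule, `anthropic/claude-3.5-sonnet:beta` stays a model name on the current provider, while `glm:glm-5` resolves through the alias to `zai`.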
teknium1	66d3e6a0c2	feat: provider switching via /model + enhanced model display Add provider:model syntax to /model command for runtime provider switching: /model zai:glm-5 → switch to Z.AI provider with glm-5 /model nous:hermes-3 → switch to Nous Portal with hermes-3 /model openrouter:anthropic/claude-sonnet-4.5 → explicit OpenRouter When switching providers, credentials are resolved via resolve_runtime_provider and validated before committing. Both model and provider are saved to config. Provider aliases work (glm: → zai, kimi: → kimi-coding, etc.). Enhanced /model (no args) display now shows: - Current model and provider - Curated model list for the current provider with ← marker - Usage examples including provider:model syntax 39 tests covering parse_model_input, curated_models_for_provider, provider switching (success + credential failure), and display output.	2026-03-08 05:45:59 -07:00
teknium1	4a09ae2985	chore: remove dead module stubs from test_cli_init.py The 200 lines of prompt_toolkit/rich/fire stubs added in PR #650 were guarded by 'if module in sys.modules: return' and never activated since those dependencies are always installed. Removed to keep the test file lean. Also removed unused MagicMock and pytest imports.	2026-03-08 05:35:02 -07:00
teknium1	8c734f2f27	fix: remove OpenRouter '/' format enforcement — let API probe be the authority Not all providers require 'provider/model' format. Removing the rigid format check lets the live API probe handle all validation uniformly. If someone types 'gpt-5.4' on OpenRouter, the probe won't find it and will suggest 'openai/gpt-5.4' — better UX than a format rejection.	2026-03-08 05:31:41 -07:00
teknium1	245d174359	feat: validate /model against live API instead of hardcoded lists Replace the static catalog-based model validation with a live API probe. The /model command now hits the provider's /models endpoint to check if the requested model actually exists: - Model found in API → accepted + saved to config - Model NOT found in API → rejected with 'Error: not a valid model' and fuzzy-match suggestions from the live model list - API unreachable → graceful fallback to hardcoded catalog (session-only for unrecognized models) - Format errors (empty, spaces, missing '/') still caught instantly without a network call The API probe takes ~0.2s for OpenRouter (346 models) and works with any OpenAI-compatible endpoint (Ollama, vLLM, custom, etc.). 32 tests covering all paths: format checks, API found, API not found, API unreachable fallback, CLI integration.	2026-03-08 05:22:20 -07:00
stablegenius49	77f47768dd	fix: improve /history message display	2026-03-08 05:08:57 -07:00
teknium1	90fa9e54ca	fix: guard validate_requested_model + expand test coverage (PR #649 follow-up) - Wrap validate_requested_model in try/except so /model doesn't crash if validation itself fails (falls back to old accept+save behavior) - Remove unnecessary sys.path.insert from both test files - Expand test_model_validation.py: 4 → 23 tests covering normalize_provider, provider_model_ids, empty/whitespace/spaces rejection, OpenRouter format validation, custom endpoints, nous provider, provider aliases, unknown providers, fuzzy suggestions - Expand test_cli_model_command.py: 2 → 5 tests adding known-model save, validation crash fallback, and /model with no argument	2026-03-08 04:47:35 -07:00
stablegenius49	9d3a44e0e8	fix: validate /model values before saving	2026-03-08 04:47:35 -07:00
Teknium	b8120df860	Revert "feat: skill prerequisites — hide skills with unmet runtime dependencies"	2026-03-08 03:58:13 -07:00
teknium1	0df7df52f3	test: expand slash command autocomplete coverage (PR #645 follow-up) - Fix failing test: use display_text/display_meta_text instead of str() on prompt_toolkit FormattedText objects - Add regression guard: EXPECTED_COMMANDS set ensures no command silently disappears from the shared dict - Add edge case tests: non-slash input, empty input, partial vs exact match trailing space, builtin display_meta content - Add skill provider tests: None provider, exception swallowing, description truncation at 50 chars, missing description fallback, exact-match trailing space on skill commands - Total: 15 tests (up from 4)	2026-03-08 03:53:22 -07:00
stablegenius49	bfa27d0a68	fix(cli): unify slash command autocomplete registry	2026-03-08 03:53:22 -07:00
teknium1	5a20c486e3	Merge PR #659: feat: skill prerequisites — hide skills with unmet runtime dependencies Authored by kshitijk4poor. Fixes #630.	2026-03-08 03:12:35 -07:00
kshitij	f210510276	feat: add prerequisites field to skill spec — hide skills with unmet dependencies Skills can now declare runtime prerequisites (env vars, CLI binaries) via YAML frontmatter. Skills with unmet prerequisites are excluded from the system prompt so the agent never claims capabilities it can't deliver, and skill_view() warns the agent about what's missing. Three layers of defense: - build_skills_system_prompt() filters out unavailable skills - _find_all_skills() flags unmet prerequisites in metadata - skill_view() returns prerequisites_warning with actionable details Tagged 12 bundled skills that have hard runtime dependencies: gif-search (TENOR_API_KEY), notion (NOTION_API_KEY), himalaya, imessage, apple-notes, apple-reminders, openhue, duckduckgo-search, codebase-inspection, blogwatcher, songsee, mcporter. Closes #658 Fixes #630	2026-03-08 13:19:32 +05:30
teknium1	19b6f81ee7	fix: allow Anthropic API URLs as custom OpenAI-compatible endpoints Removed the hard block on base_url containing 'api.anthropic.com'. Anthropic now offers an OpenAI-compatible /chat/completions endpoint, so blocking their URL prevents legitimate use. If the endpoint isn't compatible, the API call will fail with a proper error anyway. Removed from: run_agent.py, mini_swe_runner.py Updated test to verify Anthropic URLs are accepted.	2026-03-07 23:36:35 -08:00
teknium1	b8c3bc7841	feat: browser screenshot sharing via MEDIA: on all messaging platforms browser_vision now saves screenshots persistently to ~/.hermes/browser_screenshots/ and returns the screenshot_path in its JSON response. The model can include MEDIA:<path> in its response to share screenshots as native photos. Changes: - browser_tool.py: Save screenshots persistently, return screenshot_path, auto-cleanup files older than 24 hours, mkdir moved inside try/except - telegram.py: Add send_image_file() — sends local images via bot.send_photo() - discord.py: Add send_image_file() — sends local images via discord.File - slack.py: Add send_image_file() — sends local images via files_upload_v2() (WhatsApp already had send_image_file — no changes needed) - prompt_builder.py: Updated Telegram hint to list image extensions, added Discord and Slack MEDIA: platform hints - browser.md: Document screenshot sharing and 24h cleanup - send_file_integration_map.md: Updated to reflect send_image_file is now implemented on Telegram/Discord/Slack - test_send_image_file.py: 19 tests covering MEDIA: .png extraction, send_image_file on all platforms, and screenshot cleanup Partially addresses #466 (Phase 0: platform adapter gaps for send_image_file).	2026-03-07 22:57:05 -08:00
teknium1	dfd37a4b31	Merge PR #635: fix: add Kimi Code API support (api.kimi.com/coding/v1) Authored by christomitov. Auto-detects sk-kimi- key prefix and routes to api.kimi.com/coding/v1. Adds User-Agent header for Kimi Code API compatibility. Legacy Moonshot keys continue to work unchanged.	2026-03-07 21:45:27 -08:00
teknium1	4be783446a	fix: wire worktree flag into hermes CLI entry point + docs + tests Critical fixes: - Add --worktree/-w to hermes_cli/main.py argparse (both chat subcommand and top-level parser) so 'hermes -w' works via the actual CLI entry point, not just 'python cli.py -w' - Pass worktree flag through cmd_chat() kwargs to cli_main() - Handle worktree attr in bare 'hermes' and --resume/--continue paths Bug fixes in cli.py: - Skip worktree creation for --list-tools/--list-toolsets (wasteful) - Wrap git worktree subprocess.run in try/except (crash on timeout) - Add stale worktree pruning on startup (_prune_stale_worktrees): removes clean worktrees older than 24h left by crashed/killed sessions Documentation updates: - AGENTS.md: add --worktree to CLI commands table - cli-config.yaml.example: add worktree config section - website/docs/reference/cli-commands.md: add to core commands - website/docs/user-guide/cli.md: add usage examples - website/docs/user-guide/configuration.md: add config docs Test improvements (17 → 31 tests): - Stale worktree pruning (prune old clean, keep recent, keep dirty) - Directory symlink via .worktreeinclude - Edge cases (no commits, not a repo, pre-existing .worktrees/) - CLI flag/config OR logic - TERMINAL_CWD integration - System prompt injection format	2026-03-07 21:05:40 -08:00
teknium1	8d719b180a	feat: git worktree isolation for parallel CLI sessions (--worktree / -w) Add a --worktree (-w) flag to the hermes CLI that creates an isolated git worktree for the session. This allows running multiple hermes-agent instances concurrently on the same repo without file collisions. How it works: - On startup with -w: detects git repo, creates .worktrees/<session>/ with its own branch (hermes/<session-id>), sets TERMINAL_CWD to it - Each agent works in complete isolation — independent HEAD, index, and working tree, shared git object store - On exit: auto-removes worktree and branch if clean, warns and keeps if there are uncommitted changes - .worktreeinclude file support: list gitignored files (.env, .venv/) to auto-copy/symlink into new worktrees - .worktrees/ is auto-added to .gitignore - Agent gets a system prompt note about the worktree context - Config support: set worktree: true in config.yaml to always enable Usage: hermes -w # Interactive mode in worktree hermes -w -q "Fix issue #123" # Single query in worktree # Or in config.yaml: worktree: true Includes 17 tests covering: repo detection, worktree creation, independence verification, cleanup (clean/dirty), .worktreeinclude, .gitignore management, and 10 concurrent worktrees. Closes #652	2026-03-07 20:51:08 -08:00
teknium1	c5a9d1ef9d	Merge branch 'main' into pr-635	2026-03-07 20:36:42 -08:00
teknium1	c7b6f423c7	feat: auto-compress pathologically large gateway sessions (#628) Long-lived gateway sessions can accumulate enough history that every new message rehydrates an oversized transcript, causing repeated truncation failures (finish_reason=length). Add a session hygiene check in _handle_message that runs right after loading the transcript and before invoking the agent: 1. Estimate message count and rough token count of the transcript 2. If above configurable thresholds (default: 200 msgs or 100K tokens), auto-compress the transcript proactively 3. Notify the user about the compression with before/after stats 4. If still above warn threshold (default: 200K tokens) after compression, suggest /reset 5. If compression fails on a dangerously large session, warn the user to use /compress or /reset manually Thresholds are configurable via config.yaml: session_hygiene: auto_compress_tokens: 100000 auto_compress_messages: 200 warn_tokens: 200000 This complements the agent's existing preflight compression (which runs inside run_conversation) by catching pathological sessions at the gateway layer before the agent is even created. Includes 12 tests for threshold detection and token estimation.	2026-03-07 20:09:48 -08:00
Bryan Young	fcde9be10d	fix: keep tool-call output runs intact during compression	2026-03-08 03:13:14 +00:00
Christo Mitov	4447e7d71a	fix: add Kimi Code API support (api.kimi.com/coding/v1) Kimi Code (platform.kimi.ai) issues API keys prefixed sk-kimi- that require: 1. A different base URL: api.kimi.com/coding/v1 (not api.moonshot.ai/v1) 2. A User-Agent header identifying a recognized coding agent Without this fix, sk-kimi- keys fail with 401 (wrong endpoint) or 403 ('only available for Coding Agents') errors. Changes: - Auto-detect sk-kimi- key prefix and route to api.kimi.com/coding/v1 - Send User-Agent: KimiCLI/1.0 header for Kimi Code endpoints - Legacy Moonshot keys (api.moonshot.ai) continue to work unchanged - KIMI_BASE_URL env var override still takes priority over auto-detection - Updated .env.example with correct docs and all endpoint options - Fixed doctor.py health check for Kimi Code keys Reference: https://github.com/MoonshotAI/kimi-cli (platforms.py)	2026-03-07 21:00:12 -05:00
teknium1	faab73ad58	Merge PR #573: fix(doctor): detect OpenAI custom endpoint env settings Authored by stablegenius49. Fixes #572.	2026-03-07 16:16:08 -08:00
vincent	86eed141af	fix: rebuild compressed payload before retry	2026-03-07 18:55:01 -05:00
teknium1	24f6a193e7	fix: remove stale 'model' assertion from delegate_task schema test The 'model' property was removed from DELEGATE_TASK_SCHEMA but the test still asserted its presence, causing CI to fail.	2026-03-07 11:29:55 -08:00
teknium1	d80c30cc92	feat(gateway): proactive async memory flush on session expiry Previously, when a session expired (idle/daily reset), the memory flush ran synchronously inside get_or_create_session — blocking the user's message for 10-60s while an LLM call saved memories. Now a background watcher task (_session_expiry_watcher) runs every 5 min, detects expired sessions, and flushes memories proactively in a thread pool. By the time the user sends their next message, memories are already saved and the response is immediate. Changes: - Add _is_session_expired(entry) to SessionStore — works from entry alone without needing a SessionSource - Add _pre_flushed_sessions set to track already-flushed sessions - Remove sync _on_auto_reset callback from get_or_create_session - Refactor flush into _flush_memories_for_session (sync worker) + _async_flush_memories (thread pool wrapper) - Add _session_expiry_watcher background task, started in start() - Simplify /reset command to use shared fire-and-forget flush - Add 10 tests for expiry detection, callback removal, tracking	2026-03-07 11:27:50 -08:00
teknium1	b84f9e410c	feat: default reasoning effort from xhigh to medium Reduces token usage and latency for most tasks by defaulting to medium reasoning effort instead of xhigh. Users can still override via config or CLI flag. Updates code, tests, example config, and docs.	2026-03-07 10:14:19 -08:00
0xbyt4	ee7d8c56c7	fix: prevent data loss in clipboard PNG conversion when ImageMagick fails _convert_to_png() renamed the original file to .bmp before calling ImageMagick convert, then unconditionally deleted the .bmp regardless of whether convert succeeded. If convert failed, both files were gone. - Only delete .bmp after confirmed successful conversion - Restore original file on convert failure, timeout, or missing binary - Add 3 tests covering failure, not-installed, and timeout scenarios	2026-03-07 20:02:12 +03:00
0xbyt4	451a007fb1	fix(tests): isolate max_turns tests from CI env and update default to 90 _make_cli() did not clear HERMES_MAX_ITERATIONS env var, so tests failed in CI where the var was set externally. Also, default max_turns changed from 60 to 90 in `0a82396` but tests were not updated. - Clear HERMES_MAX_ITERATIONS in _make_cli() for proper isolation - Add env_overrides parameter for tests that need specific env values - Update hardcoded 60 assertions to 90 to match new default - Simplify test_env_var_max_turns using env_overrides	2026-03-07 19:43:20 +03:00
0xbyt4	5cdcb9e26f	fix: strip MarkdownV2 italic markers in Telegram plaintext fallback When MarkdownV2 parsing fails, _strip_mdv2() removes escape backslashes and bold markers (*text*) but missed italic markers (_text_). Users saw raw underscores around italic text in the plaintext fallback. - Add regex to strip _text_ italic markers in _strip_mdv2() - Use word boundary lookaround to preserve snake_case identifiers - Add tests for _strip_mdv2 covering italic, bold, snake_case, and edge cases	2026-03-07 18:55:25 +03:00
teknium1	f668e9fc75	feat: platform-conditional skill loading + Apple/macOS skills Add a 'platforms' field to SKILL.md frontmatter that restricts skills to specific operating systems. Skills with platforms: [macos] only appear in the system prompt, skills_list(), and slash commands on macOS. Skills without the field load everywhere (backward compatible). Implementation: - skill_matches_platform() in tools/skills_tool.py — core filter - Wired into all 3 discovery paths: prompt_builder.py, skills_tool.py, skill_commands.py - 28 new tests across 3 test files New bundled Apple/macOS skills (all platforms: [macos]): - imessage — Send/receive iMessages via imsg CLI - apple-reminders — Manage Reminders via remindctl CLI - apple-notes — Manage Notes via memo CLI - findmy — Track devices/AirTags via AppleScript + screen capture Docs updated: CONTRIBUTING.md, AGENTS.md, creating-skills.md, skills.md (user guide)	2026-03-07 00:47:54 -08:00
teknium1	69a36a3361	Merge PR #309: fix(timezone): timezone-aware now() for prompt, cron, and execute_code Authored by areu01or00. Adds timezone support via hermes_time.now() helper with IANA timezone resolution (HERMES_TIMEZONE env → config.yaml → server-local). Updates system prompt timestamp, cron scheduling, and execute_code sandbox TZ injection. Includes config migration (v4→v5) and comprehensive test coverage.	2026-03-07 00:04:41 -08:00
stablegenius49	5609117882	fix(doctor): recognize OPENAI_API_KEY custom endpoint config	2026-03-06 19:47:09 -08:00
Tyler	53b4b7651a	Add official OpenClaw migration skill for Hermes Agent Introduces a new OpenClaw-to-Hermes migration skill with a Python helper script that handles importing SOUL.md, memories, user profiles, messaging settings, command allowlists, skills, TTS assets, and workspace instructions. Supports two migration presets (user-data / full), three skill conflict modes (skip / overwrite / rename), overflow file export for entries that exceed character limits, and granular include/exclude option filtering. Includes detailed SKILL.md agent instructions covering the clarify-tool interaction protocol, decision-to-command mapping, post-run reporting rules, and path resolution guidance. Adds dynamic panel width calculation to CLI clarify/approval widgets so panels adapt to content and terminal size. Includes 7 new tests covering presets, include/exclude, conflict modes, overflow exports, and skills_guard integration.	2026-03-06 18:57:12 -08:00
teknium1	388dd4789c	feat: add z.ai/GLM, Kimi/Moonshot, MiniMax as first-class providers Adds 4 new direct API-key providers (zai, kimi-coding, minimax, minimax-cn) to the inference provider system. All use standard OpenAI-compatible chat/completions endpoints with Bearer token auth. Core changes: - auth.py: Extended ProviderConfig with api_key_env_vars and base_url_env_var fields. Added providers to PROVIDER_REGISTRY. Added provider aliases (glm, z-ai, zhipu, kimi, moonshot). Added auto-detection of API-key providers in resolve_provider(). Added resolve_api_key_provider_credentials() and get_api_key_provider_status() helpers. - runtime_provider.py: Added generic API-key provider branch in resolve_runtime_provider() — any provider with auth_type='api_key' is automatically handled. - main.py: Added providers to hermes model menu with generic _model_flow_api_key_provider() flow. Updated _has_any_provider_configured() to check all provider env vars. Updated argparse --provider choices. - setup.py: Added providers to setup wizard with API key prompts and curated model lists. - config.py: Added env vars (GLM_API_KEY, KIMI_API_KEY, MINIMAX_API_KEY, etc.) to OPTIONAL_ENV_VARS. - status.py: Added API key display and provider status section. - doctor.py: Added connectivity checks for each provider endpoint. - cli.py: Updated provider docstrings. Docs: Updated README.md, .env.example, cli-config.yaml.example, cli-commands.md, environment-variables.md, configuration.md. Tests: 50 new tests covering registry, aliases, resolution, auto-detection, credential resolution, and runtime provider dispatch. Inspired by PR #33 (numman-ali) which proposed a provider registry approach. Credit to tars90percent (PR #473) and manuelschipper (PR #420) for related provider improvements merged earlier in this changeset.	2026-03-06 18:55:18 -08:00
Robin Fernandes	bc091eb7ef	fix: implement Nous credential refresh on 401 error for retry logic	2026-03-07 13:34:23 +11:00
0xbyt4	33cfe1515d	fix: sanitize FTS5 queries and close mirror DB connections Two bugs fixed: 1. search_messages() crashes with OperationalError when user queries contain FTS5 special characters (+, ", (, {, dangling AND/OR, etc). Added _sanitize_fts5_query() to strip dangerous operators and a fallback try-except for edge cases. 2. _append_to_sqlite() in mirror.py creates a new SessionDB per call but never closes it, leaking SQLite connections. Added finally block to ensure db.close() is always called.	2026-03-07 04:24:45 +03:00
teknium1	94053d75a6	fix: custom endpoint no longer leaks OPENROUTER_API_KEY (#560) API key selection is now base_url-aware: when the resolved base_url targets OpenRouter, OPENROUTER_API_KEY takes priority (preserving the #289 fix). When hitting any other endpoint (Z.ai, vLLM, custom, etc.), OPENAI_API_KEY takes priority so the OpenRouter key doesn't leak. Applied in both the runtime provider resolver (the real code path) and the CLI initial default (for consistency). Fixes #560.	2026-03-06 17:16:14 -08:00
teknium1	2a68099675	fix(tests): isolate tests from user ~/.hermes/ config and SOUL.md _make_cli() now patches CLI_CONFIG with clean defaults so test_cli_init tests don't depend on the developer's local config.yaml. test_empty_dir_returns_empty now mocks Path.home() so it doesn't pick up a global SOUL.md. Credit to teyrebaz33 for identifying and fixing these in PR #557. Fixes #555.	2026-03-06 17:10:35 -08:00
0xbyt4	3b43f7267a	fix: count actual tool calls instead of tool-related messages tool_call_count was inaccurate in two ways: 1. Under-counting: an assistant message with N parallel tool calls (e.g. "kill the light and shut off the fan" = 2 ha_call_service) only incremented tool_call_count by 1 instead of N. 2. Over-counting: tool response messages (role=tool) also incremented tool_call_count, double-counting every tool interaction. Combined: 2 parallel tool calls produced tool_call_count=3 (1 from assistant + 2 from tool responses) instead of the correct value of 2. Fix: only count from assistant messages with tool_calls, incrementing by len(tool_calls) to handle parallel calls correctly. Tool response messages no longer affect tool_call_count. This impacts /insights and /usage accuracy for sessions with tool use.	2026-03-07 04:07:52 +03:00
0xbyt4	211b55815e	fix: prevent data loss in skills sync on copy/update failure Two bugs in sync_skills(): 1. Failed copytree poisons manifest: when shutil.copytree fails (disk full, permission error), the skill is still recorded in the manifest. On the next sync, the skill appears as "in manifest but not on disk" which is interpreted as "user deliberately deleted it" — the skill is never retried. Fix: only write to manifest on successful copy. 2. Failed update destroys user copy: rmtree deletes the existing skill directory before copytree runs. If copytree then fails, the user's skill is gone with no way to recover. Fix: move to .bak before copying, restore from backup if copytree fails. Both bugs are proven by new regression tests that fail on the old code and pass on the fix.	2026-03-07 03:58:32 +03:00
teknium1	4f56e31dc7	fix: track origin hashes in skills manifest to preserve user modifications Upgrade skills_sync manifest to v2 format (name:origin_hash). The origin hash records the MD5 of the bundled skill at the time it was last synced. On update, the user's copy is compared against the origin hash: - User copy == origin hash → unmodified → safe to update from bundled - User copy != origin hash → user customized → skip (preserve changes) v1 manifests (plain names) are auto-migrated: the user's current hash becomes the baseline, so future syncs can detect modifications. Output now shows user-modified skills: ~ whisper (user-modified, skipping) 27 tests covering all scenarios including v1→v2 migration, user modification detection, update after migration, and origin hash tracking. 2009 tests pass.	2026-03-06 16:13:58 -08:00
Teknium	6d3804770c	Merge pull request #552 from NousResearch/feat/insights feat: /insights command — usage analytics, cost estimation & activity patterns	2026-03-06 16:00:28 -08:00
teknium1	ab0f4126cf	fix: restore all removed bundled skills + fix skills sync system - Restored 21 skills removed in commits `757d012` and `740dd92`: accelerate, audiocraft, code-review, faiss, flash-attention, gguf, grpo-rl-training, guidance, llava, nemo-curator, obliteratus, peft, pytorch-fsdp, pytorch-lightning, simpo, slime, stable-diffusion, tensorrt-llm, torchtitan, trl-fine-tuning, whisper - Rewrote sync_skills() with proper update semantics: * New skills (not in manifest): copied to user dir * Existing skills (in manifest + on disk): updated via hash comparison * User-deleted skills (in manifest, not on disk): respected, not re-added * Stale manifest entries (removed from bundled): cleaned from manifest - Added sync_skills() to CLI startup (cmd_chat) and gateway startup (start_gateway) — previously only ran during 'hermes update' - Updated cmd_update output to show new/updated/cleaned counts - Rewrote tests: 20 tests covering manifest CRUD, dir hashing, fresh install, user deletion respect, update detection, stale cleanup, and name collision handling 75 bundled skills total. 2002 tests pass.	2026-03-06 15:57:30 -08:00
unmodeled-tyler	1755a9e38a	Design agent migration skill for Hermes Agent from OpenClaw | Run successful dry tests with reports	2026-03-06 15:12:45 -08:00
teknium1	585f8528b2	fix: deep review — prefix matching, tool_calls extraction, query perf, serialization Issues found and fixed during deep code path review: 1. CRITICAL: Prefix matching returned wrong prices for dated model names - 'gpt-4o-mini-2024-07-18' matched gpt-4o ($2.50) instead of gpt-4o-mini ($0.15) - Same for o3-mini→o3 (9x), gpt-4.1-mini→gpt-4.1 (5x), gpt-4.1-nano→gpt-4.1 (20x) - Fix: use longest-match-wins strategy instead of first-match - Removed dangerous key.startswith(bare) reverse matching 2. CRITICAL: Top Tools section was empty for CLI sessions - run_agent.py doesn't set tool_name on tool response messages (pre-existing) - Insights now also extracts tool names from tool_calls JSON on assistant messages, which IS populated for all sessions - Uses max() merge strategy to avoid double-counting between sources 3. SELECT * replaced with explicit column list - Skips system_prompt and model_config blobs (can be thousands of chars) - Reduces memory and I/O for large session counts 4. Sets in overview dict converted to sorted lists - models_with_pricing / models_without_pricing were Python sets - Sets aren't JSON-serializable — would crash json.dumps() 5. Negative duration guard - end > start check prevents negative durations from clock drift 6. Model breakdown sort fallback - When all tokens are 0, now sorts by session count instead of arbitrary order 7. Removed unused timedelta import Added 6 new tests: dated model pricing (4), tool_calls JSON extraction, JSON serialization safety. Total: 69 tests.	2026-03-06 14:50:57 -08:00
teknium1	75f523f5c0	fix: unknown/custom models get zero cost instead of fake estimates Custom OAI endpoints, self-hosted models, and local inference should NOT show fabricated cost estimates. Changed default pricing from $3/$12 per million tokens to $0/$0 for unrecognized models. - Added _has_known_pricing() to distinguish commercial vs custom models - Models with known pricing show $ amounts; unknown models show 'N/A' - Overview shows asterisk + note when some models lack pricing data - Gateway format adds '(excludes custom/self-hosted models)' note - Added 7 new tests for custom model cost handling	2026-03-06 14:18:19 -08:00
teknium1	b52b37ae64	feat: add /insights command with usage analytics and cost estimation Inspired by Claude Code's /insights, adapted for Hermes Agent's multi-platform architecture. Analyzes session history from state.db to produce comprehensive usage insights. Features: - Overview stats: sessions, messages, tokens, estimated cost, active time - Model breakdown: per-model sessions, tokens, and cost estimation - Platform breakdown: CLI vs Telegram vs Discord etc. (unique to Hermes) - Tool usage ranking: most-used tools with percentages - Activity patterns: day-of-week chart, peak hours, streaks - Notable sessions: longest, most messages, most tokens, most tool calls - Cost estimation: real pricing data for 25+ models (OpenAI, Anthropic, DeepSeek, Google, Meta) with fuzzy model name matching - Configurable time window: --days flag (default 30) - Source filtering: --source flag to filter by platform Three entry points: - /insights slash command in CLI (supports --days and --source flags) - /insights slash command in gateway (compact markdown format) - hermes insights CLI subcommand (standalone) Includes 56 tests covering pricing helpers, format helpers, empty DB, populated DB with multi-platform data, filtering, formatting, and edge cases.	2026-03-06 14:04:59 -08:00
teknium1	d63b363cde	refactor: extract atomic_json_write helper, add 24 checkpoint tests Extract the duplicated temp-file + fsync + os.replace pattern from batch_runner.py (1 instance) and process_registry.py (2 instances) into a shared utils.atomic_json_write() function. Add 12 tests for atomic_json_write covering: valid JSON, parent dir creation, overwrite, crash safety (original preserved on error), no temp file leaks, string paths, unicode, custom indent, concurrent writes. Add 12 tests for batch_runner checkpoint behavior covering: _save_checkpoint (valid JSON, last_updated, overwrite, lock/no-lock, parent dirs, no temp leaks), _load_checkpoint (missing file, existing data, corrupt JSON), and resume logic (preserves prior progress, different run_name starts fresh).	2026-03-06 05:50:12 -08:00
teknium1	4a63737227	Merge PR #433: fix(whatsapp): replace Linux-only fuser with cross-platform port cleanup Authored by Farukest. Fixes #432. Extracts _kill_port_process() helper that uses netstat+taskkill on Windows and fuser on Linux. Previously, fuser calls were inline with bare except-pass, so on Windows orphaned bridge processes were never cleaned up — causing 'address already in use' errors on reconnect. Includes 5 tests covering both platforms, port matching edge cases, and exception suppression.	2026-03-06 04:52:25 -08:00
teknium1	3e93db16bd	Merge PR #436: fix: use _max_tokens_param in max-iterations retry path Authored by Farukest. Fixes #435. The retry summary in _handle_max_iterations() hardcoded max_tokens instead of using _max_tokens_param(), which returns max_completion_tokens for direct OpenAI API (required by gpt-4o, o-series). The first attempt already used _max_tokens_param correctly — only the retry path was wrong. Includes 4 tests for _max_tokens_param provider detection.	2026-03-06 04:46:24 -08:00
teknium1	c30967806c	test: add 26 tests for set_config_value secret routing Verifies explicit allowlist keys, catch-all _API_KEY/_TOKEN patterns, case insensitivity, TERMINAL_SSH prefix, and config.yaml routing for non-secret keys. Covers the fix from PR #469.	2026-03-06 04:26:18 -08:00
teknium1	b89eb29174	fix: correct mock tool name 'search' → 'search_files' in test_code_execution The mock handler checked for function_name == 'search' but the RPC sends 'search_files'. Any test exercising search_files through the mock would get 'Unknown tool' instead of the canned response.	2026-03-06 03:53:43 -08:00
teknium1	3982fcf095	fix: sync execute_code sandbox stubs with real tool schemas The _TOOL_STUBS dict in code_execution_tool.py was out of sync with the actual tool schemas, causing TypeErrors when the LLM used parameters it sees in its system prompt but the sandbox stubs didn't accept: search_files: - Added missing params: context, offset, output_mode - Fixed target default: 'grep' → 'content' (old value was obsolete) patch: - Added missing params: mode, patch (V4A multi-file patch support) Also added 4 drift-detection tests (TestStubSchemaDrift) that will catch future divergence between stubs and real schemas: - test_stubs_cover_all_schema_params: every schema param in stub - test_stubs_pass_all_params_to_rpc: every stub param sent over RPC - test_search_files_target_uses_current_values: no obsolete values - test_generated_module_accepts_all_params: generated code compiles All 28 tests pass.	2026-03-06 03:40:06 -08:00
teknium1	39299e2de4	Merge PR #451: feat: Add Daytona environment backend Authored by rovle. Adds Daytona as the sixth terminal execution backend with cloud sandboxes, persistent workspaces, and full CLI/gateway integration. Includes 24 unit tests and 8 integration tests.	2026-03-06 03:32:40 -08:00
teknium1	efec4fcaab	feat(execute_code): add json_parse, shell_quote, retry helpers to sandbox The execute_code sandbox generates a hermes_tools.py stub module for LLM scripts. Three common failure modes keep tripping up scripts: 1. json.loads(strict=True) rejects control chars in terminal() output (e.g., GitHub issue bodies with literal tabs/newlines) 2. Shell backtick/quote interpretation when interpolating dynamic content into terminal() commands (markdown with backticks gets eaten by bash) 3. No retry logic for transient network failures (API timeouts, rate limits) Adds three convenience helpers to the generated hermes_tools module: - json_parse(text) — json.loads with strict=False for tolerant parsing - shell_quote(s) — shlex.quote() for safe shell interpolation - retry(fn, max_attempts=3, delay=2) — exponential backoff wrapper Also updates the EXECUTE_CODE_SCHEMA description to document these helpers so LLMs know they're available without importing anything extra. Includes 7 new tests (unit + integration) covering all three helpers.	2026-03-06 01:52:46 -08:00
teknium1	2317d115cd	fix: clipboard image paste on WSL2, Wayland, and VSCode terminal The original implementation only supported xclip (X11), which silently fails on WSL2 (can't access Windows clipboard for images), Wayland desktops (xclip is X11-only), and VSCode terminal on WSL2. Clipboard backend changes (hermes_cli/clipboard.py): - WSL2: detect via /proc/version, use powershell.exe with .NET System.Windows.Forms.Clipboard to extract images as base64 PNG - Wayland: use wl-paste with MIME type detection, auto-convert BMP to PNG for WSLg environments (via Pillow or ImageMagick) - Dispatch order: WSL → Wayland → X11 (xclip), with fallthrough - New has_clipboard_image() for lightweight clipboard checks - Cache WSL detection result per-process CLI changes (cli.py): - /paste command: explicit clipboard image check for terminals where BracketedPaste doesn't fire (image-only clipboard in VSCode/WinTerm) - Ctrl+V keybinding: fallback for Linux terminals where Ctrl+V sends raw byte instead of triggering bracketed paste Tests: 80 tests (up from 37) covering WSL, Wayland, X11 dispatch, BMP conversion, has_clipboard_image, and /paste command.	2026-03-05 20:22:44 -08:00
teknium1	8253b54be9	test: strengthen assertions in skill_manager + memory_tool (batch 3) test_skill_manager_tool.py (20 weak → 0): - Validation error messages verified against exact strings - Name validation: checks specific invalid name echoed in error - Frontmatter validation: exact error text for missing fields, unclosed markers, empty content, invalid YAML - File path validation: traversal, disallowed dirs, root-level test_memory_tool.py (13 weak → 0): - Security scan tests verify both 'Blocked' prefix AND specific threat pattern ID (prompt_injection, exfil_curl, etc.) - Invisible unicode tests verify exact codepoint strings - Snapshot test verifies type, header, content, and isolation	2026-03-05 18:51:43 -08:00
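
Note on commit efec4fcaab: the three sandbox helpers it describes could look roughly like the sketch below. This is an illustration based only on the commit message, not the code in the generated hermes_tools module. The names and the defaults (max_attempts=3, delay=2) come from the message; the doubling backoff schedule, the injectable `sleep` argument, and the ValueError guard are assumptions made here so the example is self-contained and testable.

```python
import json
import shlex
import time


def json_parse(text):
    """Parse JSON with strict=False so literal control characters
    (tabs, newlines) inside strings are accepted."""
    return json.loads(text, strict=False)


def shell_quote(s):
    """Quote a string so it can be interpolated safely into a shell command."""
    return shlex.quote(s)


def retry(fn, max_attempts=3, delay=2, sleep=time.sleep):
    """Call fn() until it succeeds or max_attempts is exhausted.

    Waits delay, 2*delay, 4*delay, ... seconds between attempts
    (assumed schedule). `sleep` is a hypothetical hook for tests.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            # Re-raise the last failure once attempts run out.
            if attempt == max_attempts - 1:
                raise
            sleep(delay * (2 ** attempt))


if __name__ == "__main__":
    # A JSON string containing a literal tab, which strict parsing rejects.
    print(json_parse('{"body": "col1\tcol2"}'))
    # Backticks survive quoting instead of being run by the shell.
    print("echo " + shell_quote("use `ls` here"))
```

Taking `sleep` as a parameter keeps the backoff logic checkable without real waiting; a caller that does not need this can ignore the argument.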


308 Commits