hermes-agent

Author	SHA1	Message	Date
helix4u	bd7e272c1f	fix(slack): per-thread sessions for DMs by default Each top-level Slack DM now gets its own Hermes session, matching the per-thread behavior channels already have. Previously all top-level DM messages shared one continuous session because thread_ts was None, causing context to accumulate across unrelated conversations. The behavior is controlled by platforms.slack.extra.dm_top_level_threads_as_sessions in config.yaml (default: true). Set to false to restore legacy behavior. Based on PR #10789 by helix4u. Changes from original: - Default flipped to true (was opt-in, now opt-out) - Removed env var fallback (config.yaml only per project policy) - Tests updated to cover both default and opt-out paths	2026-04-16 04:22:33 -07:00
LeonSGP43	daef0519e9	fix(google-workspace): normalize authorized user token writes	2026-04-16 04:22:16 -07:00
Teknium	f726b9b843	fix(browser): runtime fallback to local Chromium when cloud provider fails Wraps provider.create_session() in _get_session_info() with try/except to catch cloud provider runtime failures (timeouts, auth errors, rate limits, invalid responses). Falls back to _create_local_session() so browser automation continues working when cloud APIs are down. Marks fallback sessions with fallback_from_cloud, fallback_reason, and fallback_provider metadata for observability. If both cloud and local fail, raises RuntimeError with chained context from both errors. Closes #10883 Co-authored-by: konsisumer <konsisumer@users.noreply.github.com>	2026-04-16 04:19:34 -07:00
Teknium	23a42635f0	docs: remove nonexistent CAMOFOX_PROFILE_DIR env var references (#10976) Camofox automatically maps each userId to a persistent Firefox profile on the server side; no CAMOFOX_PROFILE_DIR env var exists. Our docs incorrectly told users to configure this on the server. Removed the fabricated env var from: - browser docs (:::note block) - config.py DEFAULT_CONFIG comment - test docstring	2026-04-16 04:07:11 -07:00
Markus Corazzione	c928ebb1b1	retry transient telegram send failures	2026-04-16 03:47:00 -07:00
Teknium	333cb8251b	fix: improve interrupt responsiveness during concurrent tool execution and follow-up turns (#10935 ) Three targeted fixes for the 'agent stuck on terminal command' report: 1. Concurrent tool wait loop now checks interrupts (run_agent.py) The sequential path checked _interrupt_requested before each tool call, but the concurrent path's wait loop just blocked with 30s timeouts. Now polls every 5s and cancels pending futures on interrupt, giving already-running tools 3s to notice the per-thread interrupt signal. 2. Cancelled concurrent tools get proper interrupt messages (run_agent.py) When a concurrent tool is cancelled or didn't return a result due to interrupt, the tool result message says 'skipped due to user interrupt' instead of a generic error. 3. Typing indicator fires before follow-up turn (gateway/run.py) After an interrupt is acknowledged and the pending message dequeued, the gateway now sends a typing indicator before starting the recursive _run_agent call. This gives the user immediate visual feedback that the system is processing their new message (closing the perceived 'dead air' gap between the interrupt ack and the response). Reported by @_SushantSays.	2026-04-16 02:44:56 -07:00
Peter Berthelsen	9a9b8cd1e4	fix: keep rapid telegram follow-ups from getting cut off	2026-04-16 02:44:00 -07:00
Teknium	e4cd62d07d	fix(tests): resolve remaining CI failures — commit_memory_session, already_sent, timezone leak, session env (#10785 ) Fixes 12 CI test failures: 1. test_cli_new_session (4): _FakeAgent missing commit_memory_session attribute added in the memory provider refactoring. Added MagicMock. 2. test_run_progress_topics (1): already_sent detection only checked stream consumer flags, missing the response_previewed path from interim_assistant_callback. Restructured guard to check both paths. 3. test_timezone (1): HERMES_TIMEZONE leaked into child processes via _SAFE_ENV_PREFIXES matching HERMES_*. The code correctly converts it to TZ but didn't remove the original. Added child_env.pop(). 4. test_session_env (1): contextvars baseline captured from a different context couldn't be restored after clear. Changed assertion to verify the test's value was removed rather than comparing to a fragile baseline. 5. test_discord_slash_commands (5): already fixed on current main.	2026-04-16 02:26:14 -07:00
Teknium	5c397876b9	fix(cli): hint about /v1 suffix when configuring local model endpoints When a user enters a local model server URL (Ollama, vLLM, llama.cpp) without a /v1 suffix during 'hermes model' custom endpoint setup, prompt them to add it. Most OpenAI-compatible local servers require /v1 in the base URL for chat completions to work.	2026-04-16 02:22:09 -07:00
Mibayy	3522a7aa13	feat(ollama): pass think=false to custom providers when reasoning_effort is none When a custom/Ollama provider is used and reasoning_effort is set to 'none' (or enabled: false), inject 'think': false into the request extra_body. Ollama does not recognise the OpenRouter-style 'reasoning' extra_body field, so thinking-capable models (Qwen3, etc.) generate <think> blocks regardless of the reasoning_effort setting. This produces empty-response errors that corrupt session state. The fix adds a provider-specific block in _build_api_kwargs() that sets think=false in extra_body whenever self.provider == 'custom' and reasoning is explicitly disabled. Closes #3191	2026-04-16 02:22:09 -07:00
LeonSGP43	8011aa31ba	fix(agent): continue ollama glm truncation replies	2026-04-16 02:22:09 -07:00
kshitijk4poor	1b61ec470b	feat: add Ollama Cloud as built-in provider Add ollama-cloud as a first-class provider with full parity to existing API-key providers (gemini, zai, minimax, etc.): - PROVIDER_REGISTRY entry with OLLAMA_API_KEY env var - Provider aliases: ollama -> custom (local), ollama_cloud -> ollama-cloud - models.dev integration for accurate context lengths - URL-to-provider mapping (ollama.com -> ollama-cloud) - Passthrough model normalization (preserves Ollama model:tag format) - Default auxiliary model (nemotron-3-nano:30b) - HermesOverlay in providers.py - CLI --provider choices, CANONICAL_PROVIDERS entry - Dynamic model discovery with disk caching (1hr TTL) - 37 provider-specific tests Cherry-picked from PR #6038 by kshitijk4poor. Closes #3926	2026-04-16 02:22:09 -07:00
helix4u	8021a735c2	fix(gateway): preserve notify context in executor threads Gateway executor work now inherits the active session contextvars via copy_context() so background process watchers retain the correct platform/chat/user/session metadata for routing completion events back to the originating chat. Cherry-picked from #10647 by @helix4u with: - Use asyncio.get_running_loop() instead of deprecated get_event_loop() - Strip trailing whitespace - Add *args forwarding test - Add exception propagation test	2026-04-16 02:05:59 -07:00
helix4u	4093982f19	fix: recompute Copilot api_mode after model switch Recomputes GitHub Copilot api_mode from the selected model in the shared /model switch path. Before this change, Copilot could carry a stale codex_responses mode forward from a GPT-5 selection into a later Claude model switch, causing unsupported_api_for_model errors. Cherry-picked from #10533 by @helix4u with: - Comment specificity (Provider-specific → Copilot api_mode override) - Fix pre-existing duplicate opencode-go in set literal - Extract test mock helper to reduce duplication - Add GPT-5 → GPT-5 regression test (keeps codex_responses)	2026-04-16 01:16:14 -07:00
Markus Corazzione	0cf7d570e2	fix(telegram): restore typing indicator and thread routing for forum General topic In Telegram forum-enabled groups, the General topic does not include message_thread_id in incoming messages (it is None). This caused: 1. Messages in General losing thread context, so replies went to the wrong place 2. Typing indicator failing because thread_id=1 was rejected by Telegram Fix: synthesize thread_id="1" for forum groups when message_thread_id is None, then handle it correctly per operation: - send: omit message_thread_id (Telegram rejects thread_id=1 for sends) - typing: pass thread_id=1, retry without it on "thread not found" Also centralizes thread_id extraction into _metadata_thread_id() across all send methods (send, send_voice, send_image, send_document, send_video, send_animation, send_photo), replacing ~10 duplicate patterns. Salvaged from PR #7892 by @corazzione. Closes #7877, closes #7519.	2026-04-15 22:35:19 -07:00
Teknium	36b54afbc4	feat(plugins): add dispatch_tool() to PluginContext (#10763 ) Expands the plugin interface so slash command handlers can dispatch tool calls through the registry with parent agent context wired up automatically. This is the public API for plugins that need to orchestrate tools like delegate_task — they call ctx.dispatch_tool() instead of reaching into framework internals. The parent agent is resolved lazily from _cli_ref when available (CLI mode) and omitted in gateway mode (tools degrade gracefully). Enables the hermes-deliver-plugin pattern where /deliver and /fanout slash commands spawn subagents via delegate_task without touching the agent conversation loop. 7 new tests covering: registry delegation, parent_agent injection from cli_ref, gateway mode (no cli_ref), uninitialized agent, explicit parent_agent override, kwargs forwarding, return value passthrough.	2026-04-15 22:23:01 -07:00
leeyang1990	c5acc6edb6	feat(telegram): add dedicated TELEGRAM_PROXY env var and config.yaml proxy_url support Pass platform_env_var="TELEGRAM_PROXY" to resolve_proxy_url() in both telegram.py (main connect) and telegram_network.py (fallback transport), so a Telegram-specific proxy takes priority over the generic HTTPS_PROXY. Also bridge telegram.proxy_url from config.yaml to the TELEGRAM_PROXY env var (env var takes precedence if both are set), add OPTIONAL_ENV_VARS entry, docs, and tests. Composite salvage of four community PRs: - Core approach (both call sites): #9414 by @leeyang1990 - config.yaml bridging + docs: #6530 by @WhiteWorld - Naming convention: #9074 by @brantzh6 - Earlier proxy work: #7786 by @ten-ltw Closes #9414, closes #9074, closes #7786, closes #6530 Co-authored-by: WhiteWorld <WhiteWorld@users.noreply.github.com> Co-authored-by: brantzh6 <brantzh6@users.noreply.github.com> Co-authored-by: ten-ltw <ten-ltw@users.noreply.github.com>	2026-04-15 22:13:11 -07:00
kshitijk4poor	ff5bf0d6c8	fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation Salvaged from PR #10643 by kshitijk4poor, updated for current main. Root causes fixed: 1. Telegram xdist mock pollution — new tests/gateway/conftest.py with shared mock that runs at collection time (prevents ChatType=None caching) 2. VIRTUAL_ENV env var leak — monkeypatch.delenv in _detect_venv_dir tests 3. Copilot base_url missing — add fallback in _resolve_runtime_from_pool_entry 4. Stale vision model assertion — zai now uses glm-5v-turbo 5. Reasoning item id intentionally stripped — assert 'id' not in (store=False) 6. Context length warning unreachable — pass base_url to AIAgent in test 7. Kimi provider label updated — 'Kimi / Kimi Coding Plan' matches models.py 8. Google Workspace calendar tests — rewritten for current production code, properly mock subprocess on api_module, removed stale +agenda assertions 9. Credential pool auto-seeding — mock _select_pool_entry / _resolve_auto / _import_codex_cli_tokens to prevent real credentials from leaking into tests	2026-04-15 22:05:21 -07:00
Teknium	498b995c13	feat: implement register_command() on plugin context (#10626 ) Complete the half-built plugin slash command system. The dispatch code in cli.py and gateway/run.py already called get_plugin_command_handler() but the registration side was never implemented. Changes: - Add register_command() to PluginContext — stores handler, description, and plugin name; normalizes names; rejects conflicts with built-in commands - Add _plugin_commands dict to PluginManager - Add commands_registered tracking on LoadedPlugin - Add get_plugin_command_handler() and get_plugin_commands() module-level convenience functions - Fix commands.py to use actual plugin description in Telegram bot menu (was hardcoded 'Plugin command') - Add plugin commands to SlashCommandCompleter autocomplete - Show command count in /plugins display - 12 new tests covering registration, conflict detection, normalization, handler dispatch, and introspection Closes #10495	2026-04-15 19:53:11 -07:00
Teknium	cc6e8941db	feat(honcho): context injection overhaul, 5-tool surface, cost safety, session isolation (#10619 ) Salvaged from PR #9884 by erosika. Cherry-picked plugin changes onto current main with minimal core modifications. Plugin changes (plugins/memory/honcho/): - New honcho_reasoning tool (5th tool, splits LLM calls from honcho_context) - Two-layer context injection: base context (summary + representation + card) on contextCadence, dialectic supplement on dialecticCadence - Multi-pass dialectic depth (1-3 passes) with early bail-out on strong signal - Cold/warm prompt selection based on session state - dialecticCadence defaults to 3 (was 1) — ~66% fewer Honcho LLM calls - Session summary injection for conversational continuity - Bidirectional peer targeting on all 5 tools - Correctness fixes: peer param fallback, None guard on set_peer_card, schema validation, signal_sufficient anchored regex, mid->medium level fix Core changes (~20 lines across 3 files): - agent/memory_manager.py: Enhanced sanitize_context() to strip full <memory-context> blocks and system notes (prevents leak from saveMessages) - run_agent.py: gateway_session_key param for stable per-chat Honcho sessions, on_turn_start() call before prefetch_all() for cadence tracking, sanitize_context() on user messages to strip leaked memory blocks - gateway/run.py: skip_memory=True on 2 temp agents (prevents orphan sessions), gateway_session_key threading to main agent Tests: 509 passed (3 skipped — honcho SDK not installed locally) Docs: Updated honcho.md, memory-providers.md, tools-reference.md, SKILL.md Co-authored-by: erosika <erosika@users.noreply.github.com>	2026-04-15 19:12:19 -07:00
Kovyrin Family Claw	00ff9a26cd	Fix Telegram link preview suppression for bot sends	2026-04-15 17:54:43 -07:00
Kovyrin Family Claw	aea3499e56	feat(telegram): add config option to disable link previews	2026-04-15 17:54:43 -07:00
Roque	92a23479c0	fix(model-switch): normalize Unicode dashes from Telegram/iOS input Telegram on iOS auto-converts double hyphens (--) to em dashes (—) or en dashes (–) via autocorrect. This breaks /model flag parsing since parse_model_flags() only recognizes literal '--provider' and '--global'. When the flag isn't parsed, the entire string (e.g. 'glm-5.1 —provider zai') gets treated as the model name and fails with 'Model names cannot contain spaces.' Fix: normalize Unicode dashes (U+2012-U+2015) to '--' when they appear before flag keywords (provider, global), before flag extraction. The existing test suite in test_model_switch_provider_routing.py already covers all four dash variants — this commit adds the code that makes them pass.	2026-04-15 17:54:16 -07:00
Xowiek	21cd3a3fc0	fix(profile): use existing get_active_profile_name() for /profile command Replace inline Path.home() / '.hermes' / 'profiles' detection in both CLI and gateway /profile handlers with the existing get_active_profile_name() from hermes_cli.profiles — which already handles custom-root deployments, standard profiles, and Docker layouts. Fixes /profile incorrectly reporting 'default' when HERMES_HOME points to a custom-root profile path like /opt/data/profiles/coder. Based on PR #10484 by Xowiek.	2026-04-15 17:52:03 -07:00
Xowiek	77435c4f13	fix(gateway): use profile-aware Hermes paths in runtime hints	2026-04-15 17:52:03 -07:00
Teknium	c850a40e4e	fix: gate Matrix adapter path on media_files presence Text-only Matrix sends should continue using the lightweight _send_matrix() HTTP helper (~100ms). Only route through the heavy MatrixAdapter (full sync + E2EE setup) when media files are present. Adds test verifying text-only messages don't take the adapter path.	2026-04-15 17:37:43 -07:00
Teknium	276ed5c399	fix(send_message): deliver Matrix media via adapter Matrix media delivery was silently dropped by send_message because Matrix wasn't wired into the native adapter-backed media path. Only Telegram, Discord, and Weixin had native media support. Adds _send_matrix_via_adapter() which creates a MatrixAdapter instance, connects, sends text + media via the adapter's native upload methods (send_document, send_image_file, send_video, send_voice), then disconnects. Also fixes a stale URL-encoding assertion in test_send_message_missing_platforms that broke after PR #10151 added quote() to room IDs. Cherry-picked from PR #10486 by helix4u.	2026-04-15 17:37:43 -07:00
Greer Guthrie	33ff29dfae	fix(gateway): defer background review notifications until after main reply Background review notifications ("💾 Skill created", "💾 Memory updated") could race ahead of the main assistant reply in chat, making it look like the agent stopped after creating a skill. Gate bg-review notifications behind a threading.Event + pending queue. Register a release callback on the adapter's _post_delivery_callbacks dict so base.py's finally block fires it after the main response is delivered. The queued-message path in _run_agent pops and calls the callback directly to prevent double-fire. Co-authored-by: Hermes Agent <hermes@nousresearch.com> Closes #10541	2026-04-15 17:23:15 -07:00
Brooklyn Nicholson	097702c8a7	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-15 19:11:07 -05:00
Teknium	9d9b424390	fix: Nous Portal rate limit guard to prevent retry amplification (#10568) When Nous returns a 429, the retry amplification chain burns up to 9 API requests per conversation turn (3 SDK retries × 3 Hermes retries), each counting against RPH and deepening the rate limit. With multiple concurrent sessions (cron + gateway + auxiliary), this creates a spiral where retries keep the limit tapped indefinitely. New module: agent/nous_rate_guard.py - Shared file-based rate limit state (~/.hermes/rate_limits/nous.json) - Parses reset time from x-ratelimit-reset-requests-1h, x-ratelimit-reset-requests, retry-after headers, or error context - Falls back to 5-minute default cooldown if no header data - Atomic writes (tempfile + rename) for cross-process safety - Auto-cleanup of expired state files run_agent.py changes: - Top-of-retry-loop guard: when another session already recorded Nous as rate-limited, skip the API call entirely. Try fallback provider first, then return a clear message with the reset time. - On 429 from Nous: record rate limit state and skip further retries (sets retry_count = max_retries to trigger fallback path) - On success from Nous: clear the rate limit state so other sessions know they can resume auxiliary_client.py changes: - _try_nous() checks rate guard before attempting Nous in the auxiliary fallback chain. When rate-limited, returns (None, None) so the chain skips to the next provider instead of piling more requests onto Nous. This eliminates three sources of amplification: 1. Hermes-level retries (saves 6 of 9 calls per turn) 2. Cross-session retries (cron + gateway all skip Nous) 3. Auxiliary fallback to Nous (compression/session_search skip too) Includes 24 tests covering the rate guard module, header parsing, state lifecycle, and auxiliary client integration.	2026-04-15 16:31:48 -07:00
Teknium	0d05bd34f8	feat: extend channel_prompts to Telegram, Slack, and Mattermost Extract resolve_channel_prompt() shared helper into gateway/platforms/base.py. Refactor Discord to use it. Wire channel_prompts into Telegram (groups + forum topics), Slack (channels), and Mattermost (channels). Config bridging now applies to all platforms (not just Discord). Added channel_prompts defaults to telegram/slack/mattermost config sections. Docs added to all four platform pages with platform-specific examples (topic inheritance for Telegram, channel IDs for Slack, etc.).	2026-04-15 16:31:28 -07:00
Teknium	620c296b1d	fix: discord mock setup and AUTHOR_MAP for channel_prompts tests Move _ensure_discord_mock() from module level to _make_adapter() so it doesn't poison sys.modules for other discord test files. Use types.ModuleType instead of MagicMock for the mock module to avoid auto-generated __file__ attribute confusing hasattr checks. Add BrennerSpear to AUTHOR_MAP.	2026-04-15 16:31:28 -07:00
Brenner Spear	90a6336145	fix: remove redundant key normalization and defensive getattr in channel_prompts - Remove double str() normalization in _resolve_channel_prompt since config bridging already handles numeric YAML key conversion - Remove dead prompts.get(str(key)) fallback that could never match after keys were already normalized to strings - Replace getattr(event, "channel_prompt", None) with direct attribute access since channel_prompt is a declared dataclass field - Update test to verify normalization responsibility lives in config bridging	2026-04-15 16:31:28 -07:00
Brenner Spear	2fbdc2c8fa	feat(discord): add channel_prompts config Add native Discord channel_prompts support with parent forum fallback, ephemeral runtime injection, config migration updates, docs, and tests.	2026-04-15 16:31:28 -07:00
JiaDe WU	0cb8c51fa5	feat: native AWS Bedrock provider via Converse API Salvaged from PR #7920 by JiaDe-Wu — cherry-picked Bedrock-specific additions onto current main, skipping stale-branch reverts (293 commits behind). Dual-path architecture: - Claude models → AnthropicBedrock SDK (prompt caching, thinking budgets) - Non-Claude models → Converse API via boto3 (Nova, DeepSeek, Llama, Mistral) Includes: - Core adapter (agent/bedrock_adapter.py, 1098 lines) - Full provider registration (auth, models, providers, config, runtime, main) - IAM credential chain + Bedrock API Key auth modes - Dynamic model discovery via ListFoundationModels + ListInferenceProfiles - Streaming with delta callbacks, error classification, guardrails - hermes doctor + hermes auth integration - /usage pricing for 7 Bedrock models - 130 automated tests (79 unit + 28 integration + follow-up fixes) - Documentation (website/docs/guides/aws-bedrock.md) - boto3 optional dependency (pip install hermes-agent[bedrock]) Co-authored-by: JiaDe WU <40445668+JiaDe-Wu@users.noreply.github.com>	2026-04-15 16:17:17 -07:00
MestreY0d4-Uninter	f4724803b4	fix(runtime): surface malformed proxy env and base URL before client init When proxy env vars (HTTP_PROXY, HTTPS_PROXY, ALL_PROXY) contain malformed URLs — e.g. 'http://127.0.0.1:6153export' from a broken shell config — the OpenAI/httpx client throws a cryptic 'Invalid port' error that doesn't identify the offending variable. Add _validate_proxy_env_urls() and _validate_base_url() in auxiliary_client.py, called from resolve_provider_client() and _create_openai_client() to fail fast with a clear, actionable error message naming the broken env var or URL. Closes #6360 Co-authored-by: MestreY0d4-Uninter <MestreY0d4-Uninter@users.noreply.github.com>	2026-04-15 16:10:53 -07:00
Teknium	ee9c0a3ed0	fix(security): add JWT token and Discord mention redaction (#10547 ) Found via trace data audit: JWT tokens (eyJ...) and Discord snowflake mentions (<@ID>) were passing through unredacted. JWT pattern: matches 1/2/3-part tokens starting with eyJ (base64 for '{'). Zero false-positive risk — no normal text matches eyJ + 10+ base64url chars. Discord pattern: matches <@digits> and <@!digits> with 17-20 digit snowflake IDs. Syntactically unique to Discord's mention format. Both patterns follow the same structural-uniqueness standard as existing prefix patterns (sk-, ghp_, AKIA, etc.).	2026-04-15 16:08:52 -07:00
Brooklyn Nicholson	72aebfbb24	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-15 17:43:41 -05:00
Teknium	1d4b9c1a74	fix(gateway): don't treat group session user_id as thread_id in shutdown notifications (#10546) _parse_session_key() blindly assigned parts[5] as thread_id for all chat types. For group sessions with per-user isolation, parts[5] is a user_id, not a thread_id. This could cause shutdown notifications to route with incorrect thread metadata. Only return thread_id for chat types where the 6th element is unambiguous: dm and thread. For group/channel sessions, omit thread_id since the suffix may be a user_id. Based on the approach from PR #9938 by @Ruzzgar.	2026-04-15 15:09:23 -07:00
Ruzzgar	de3f8bc6ce	fix terminal workdir validation for Windows paths	2026-04-15 15:06:51 -07:00
Harish Kukreja	f1df83179f	fix(doctor): skip health check for OpenCode Go (no shared /models endpoint) OpenCode Go does not expose a shared /models endpoint, so the doctor probe was always failing and producing a false warning. Set the default URL to None and disable the health check for this provider.	2026-04-15 15:05:32 -07:00
helix4u	96cc556055	fix(copilot): preserve base URL and gpt-5-mini routing	2026-04-15 15:04:14 -07:00
Teknium	3b4ecf8ee7	fix: remove 'q' alias from /quit so /queue's 'q' alias works (#10467) (#10538) Both /queue and /quit registered 'q' as an alias. Since /quit appeared later in COMMAND_REGISTRY, _build_command_lookup() silently overwrote /queue's claim, making the documented /queue shorthand unusable. Fix: remove 'q' from /quit's aliases. /quit already has 'exit' as an alias plus the full '/quit' command. /queue has no other short alias. Closes #10467	2026-04-15 15:04:01 -07:00
Teknium	93b6f45224	fix: always retry on ASCII codec UnicodeEncodeError — don't gate on per-component sanitization The recovery block previously only retried (continue) when one of the per-component sanitization checks (messages, tools, system prompt, headers, credentials) found and stripped non-ASCII content. When the non-ASCII lived only in api_messages' reasoning_content field (which is built from messages['reasoning'] and not checked by the original _sanitize_messages_non_ascii), all checks returned False and the recovery fell through to the normal error path — burning a retry attempt despite _force_ascii_payload being set. Now the recovery always continues (retries) when _is_ascii_codec is detected. The _force_ascii_payload flag guarantees the next iteration runs _sanitize_structure_non_ascii(api_kwargs) on the full API payload, catching any remaining non-ASCII regardless of where it lives. Also adds test for the 'reasoning' field on canonical messages. Fixes #6843	2026-04-15 15:03:28 -07:00
MestreY0d4-Uninter	efd1ddc6e1	fix: sanitize api_messages and extra string fields during ASCII-codec recovery (#6843 ) The ASCII-locale recovery path in run_agent.py sanitized the canonical 'messages' list but left 'api_messages' untouched. api_messages is a separate API-copy built before the retry loop and may carry extra fields (reasoning_content, extra_body entries) that are not present in 'messages'. This caused the retry to still raise UnicodeEncodeError even after the 'System encoding is ASCII — stripped...' log line appeared. Two changes: - _sanitize_messages_non_ascii now walks all extra top-level string fields in each message dict (any key not in {content, name, tool_calls, role}) so reasoning_content and future extras are cleaned in both 'messages' and 'api_messages'. - The ASCII-codec recovery block now also calls sanitize on api_messages and api_kwargs so no non-ASCII survives into the next retry attempt. Adds regression tests covering: - reasoning_content with non-ASCII in api_messages - extra_body with non-ASCII in api_kwargs - canonical messages clean but api_messages dirty Fixes #6843	2026-04-15 15:03:28 -07:00
Junass1	096260ce78	fix(telegram): authorize update prompt callbacks	2026-04-15 14:54:23 -07:00
Brooklyn Nicholson	baa0de7649	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-15 16:35:01 -05:00
Teknium	b3b88a279b	fix: prevent stale os.environ leak after clear_session_vars (#10304 ) (#10527 ) After clear_session_vars() reset contextvars to their default (''), get_session_env() treated the empty string as falsy and fell through to os.environ — resurrecting stale HERMES_SESSION_* values from CLI startup, cron, or previous sessions. This broke session isolation in the gateway where concurrent messages could see each other's stale environment values. Fix: use a sentinel (_UNSET) as the contextvar default instead of ''. get_session_env() now checks 'value is not _UNSET' instead of truthiness. Three states are cleanly distinguished: - _UNSET (never set): fall back to os.environ (CLI/cron compat) - '' (explicitly cleared): return '' — no os.environ fallback - 'telegram' (actively set): return the value clear_session_vars() now uses var.set('') instead of var.reset(token) to mark vars as explicitly cleared rather than reverting to _UNSET. Closes #10304	2026-04-15 14:27:17 -07:00
Teknium	e36c804bc2	fix: prevent already_sent from swallowing empty responses after tool calls (#10531 ) When a model (e.g. mimo-v2-pro) streams intermediate text alongside tool calls ("Let me search for that") but then returns empty after processing tool results, the stream consumer already_sent flag is True from the earlier text delivery. The gateway suppression check (already_sent=True, failed=False → return None) would swallow the final response, leaving the user staring at silence after the search. Two changes: 1. gateway/run.py return path: skip already_sent suppression when the final_response is "(empty)" or empty — the user needs to know the agent finished even if streaming sent partial content earlier. 2. gateway/run.py response handler: convert the internal "(empty)" sentinel to a user-friendly warning instead of delivering the raw sentinel string. Tests added for all empty/None/sentinel cases plus preserved existing suppression behavior for normal non-empty responses.	2026-04-15 14:26:45 -07:00
Teknium	a9197f9bb1	fix(memory): discover user-installed memory providers from $HERMES_HOME/plugins/ (#10529 ) Memory provider discovery (discover_memory_providers, load_memory_provider) only scanned the bundled plugins/memory/ directory. User-installed providers at $HERMES_HOME/plugins/<name>/ were invisible, forcing users to symlink into the repo source tree — which broke on hermes update and created a dual-registration path causing duplicate tool names (400 errors on strict providers like Xiaomi MiMo). Changes: - Add _get_user_plugins_dir(), _is_memory_provider_dir(), _iter_provider_dirs(), and find_provider_dir() helpers to plugins/memory/__init__.py - discover_memory_providers() now scans both bundled and user dirs - load_memory_provider() uses find_provider_dir() (bundled-first) - discover_plugin_cli_commands() uses find_provider_dir() - _install_dependencies() in memory_setup.py uses find_provider_dir() - User plugins use _hermes_user_memory namespace to avoid sys.modules collisions - Non-memory user plugins filtered via source text heuristic - Bundled providers always take precedence on name collisions Fixes #4956, #9099. Supersedes #4987, #9123, #9130, #9132, #9982.	2026-04-15 14:25:40 -07:00
Teknium	22d22cd75c	fix: auto-register all gateway commands as Discord slash commands (#10528 ) Discord's _register_slash_commands() had a hardcoded list of ~27 commands while COMMAND_REGISTRY defines 34+ gateway-available commands. Missing commands (debug, branch, rollback, snapshot, profile, yolo, fast, reload, commands) were invisible in Discord's / autocomplete — users couldn't discover them. Add a dynamic catch-all loop after the explicit registrations that iterates COMMAND_REGISTRY, skips already-registered commands, and auto-registers the rest using discord.app_commands.Command(). Commands with args_hint get an optional string parameter; parameterless commands get a simple callback. This ensures any future commands added to COMMAND_REGISTRY automatically appear on Discord without needing a manual entry in discord.py. Telegram and Slack already derive dynamically from COMMAND_REGISTRY via telegram_bot_commands() and slack_subcommand_map() — no changes needed there.	2026-04-15 14:25:27 -07:00
Teknium	305a702e09	fix: /browser connect CDP override now takes priority over Camofox (#10523) When a user runs /browser connect to attach browser tools to their real Chrome instance via CDP, the BROWSER_CDP_URL env var is set. However, every browser tool function checks _is_camofox_mode() first, which short-circuits to the Camofox backend before _get_session_info() ever checks for the CDP override. Fix: is_camofox_mode() now returns False when BROWSER_CDP_URL is set, so the explicit CDP connection takes priority. This is the correct behavior — /browser connect is an intentional user override. Reported by SkyLinx on Discord.	2026-04-15 14:11:18 -07:00
Teknium	824c33729d	fix(session_search): coerce limit to int to prevent TypeError with non-int values (#10522) Models (especially open-source like qwen3.5-plus) may send non-int values for the limit parameter — None (JSON null), string, or even a type object. This caused TypeError: '<=' not supported between instances of 'int' and 'type' when the value reached min()/comparison operations. Changes: - Add defensive int coercion at session_search() entry with fallback to 3 - Clamp limit to [1, 5] range (was only capped at 5, not floored) - Add tests for None, type object, string, negative, and zero limit values Reported by community user ludoSifu via Discord.	2026-04-15 14:11:05 -07:00
Teknium	91980e3518	fix: deduplicate memory provider tools to prevent 400 on strict providers (#10511 ) Memory provider plugins (e.g. Mnemosyne) can register tools via two paths: 1. Plugin system (ctx.register_tool) → tool registry → get_tool_definitions() 2. Memory manager → get_all_tool_schemas() → direct append in AIAgent.__init__ Path 2 blindly appended without checking if path 1 already added the same tool names. This created duplicate function names in the tools array sent to the API. Most providers silently handle duplicates, but Xiaomi MiMo (via Nous Portal) strictly rejects them with a 400 Bad Request. Fix: build a set of existing tool names before memory manager injection and skip any tool whose name is already present. Confirmed via live testing against Nous Portal: - Unique tool names → 200 OK - Duplicate tool names → 400 'Provider returned error'	2026-04-15 14:09:32 -07:00
Teknium	19142810ed	fix: /debug privacy — auto-delete pastes after 1 hour, add privacy notices (#10510 ) - Pastes uploaded by /debug now auto-delete after 1 hour via a detached background process that sends DELETE to paste.rs - CLI: shows privacy notice listing what data will be uploaded - Gateway: only uploads summary report (system info + log tails), NOT full log files containing conversation content - Added 'hermes debug delete <url>' for immediate manual deletion - 16 new tests covering auto-delete scheduling, paste deletion, privacy notices, and the delete subcommand Addresses user privacy concern where /debug uploaded full conversation logs to a public paste service with no warning or expiry.	2026-04-15 13:40:27 -07:00
Teknium	2edbf15560	fix: enforce TTL in MessageDeduplicator + use yaml for gateway --config (#10306, #10216) (#10509) Two gateway fixes: 1. MessageDeduplicator.is_duplicate() now checks TTL at query time (#10306) Previously, is_duplicate() returned True for any previously seen ID without checking its age — expired entries were only purged when cache size exceeded max_size. On normal workloads that never overflow, message IDs stayed deduplicated forever instead of expiring after the TTL. Fix: check `now - timestamp < ttl` before returning True. Expired entries are removed and treated as new messages. 2. Gateway --config flag now uses yaml.safe_load() (#10216) The --config CLI flag in gateway/run.py main() used json.load() to parse config files. YAML is the only documented config format and every other config loader uses yaml.safe_load(). A YAML config file passed via --config would crash with json.JSONDecodeError. Closes #10306 Closes #10216	2026-04-15 13:35:40 -07:00
Teknium	af4bf505b3	fix: add on_memory_write bridge to sequential tool execution path (#10174) (#10507) The on_memory_write bridge that notifies external memory providers (ClawMem, retaindb, supermemory, etc.) of built-in memory writes was only present in the concurrent tool execution path (_invoke_tool). The sequential path (_execute_tool_calls_sequential) — which handles all single tool calls, the common case — was missing it entirely. This meant external memory providers silently missed every single-call memory write, which is the vast majority of memory operations. Fix: add the identical bridge block to the sequential path, right after the memory_tool call returns. Closes #10174	2026-04-15 13:32:59 -07:00
helix4u	93f6f66872	fix(interrupt): preserve pre-start terminal interrupts	2026-04-15 13:29:57 -07:00
Teknium	a418ddbd8b	fix: add activity heartbeats to prevent false gateway inactivity timeouts (#10501 ) Multiple gaps in activity tracking could cause the gateway's inactivity timeout to fire while the agent is actively working: 1. Streaming wait loop had no periodic heartbeat — the outer thread only touched activity when the stale-stream detector fired (180-300s), and for local providers (Ollama) the stale timeout was infinity, meaning zero heartbeats. Now touches activity every 30s. 2. Concurrent tool execution never set the activity callback on worker threads (threading.local invisible across threads) and never set _current_tool. Workers now set the callback, and the concurrent wait uses a polling loop with 30s heartbeats. 3. Modal backend's execute() override had its own polling loop without any activity callback. Now matches _wait_for_process cadence (10s).	2026-04-15 13:29:05 -07:00
Teknium	6391b46779	fix: bound auxiliary client cache to prevent fd exhaustion in long-running gateways (#10200 ) (#10470 ) The _client_cache used event loop id() as part of the cache key, so every new worker-thread event loop created a new entry for the same provider config. In long-running gateways where threads are recycled frequently, this caused unbounded cache growth — each stale entry held an unclosed AsyncOpenAI client with its httpx connection pool, eventually exhausting file descriptors. Fix: remove loop_id from the cache key and instead validate on each async cache hit that the cached loop is the current, open loop. If the loop changed or was closed, the stale entry is replaced in-place rather than creating an additional entry. This bounds cache growth to at most one entry per unique provider config. Also adds a _CLIENT_CACHE_MAX_SIZE (64) safety belt with FIFO eviction as defense-in-depth against any remaining unbounded growth. Cross-loop safety is preserved: different event loops still get different client instances (validated by existing test suite). Closes #10200	2026-04-15 13:16:28 -07:00
Brooklyn Nicholson	53a024a941	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-15 14:37:54 -05:00
zhiheng.liu	7cb06e3bb3	refactor(memory): drop on_session_reset — commit-only is enough OV transparently handles message history across /new and /compress: old messages stay in the same session and extraction is idempotent, so there's no need to rebind providers to a new session_id. The only thing the session boundary actually needs is to trigger extraction. - MemoryProvider / MemoryManager: remove on_session_reset hook - OpenViking: remove on_session_reset override (nothing to do) - AIAgent: replace rotate_memory_session with commit_memory_session (just calls on_session_end, no rebind) - cli.py / run_agent.py: single commit_memory_session call at the session boundary before session_id rotates - tests: replace on_session_reset coverage with routing tests for MemoryManager.on_session_end Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 11:28:45 -07:00
zhiheng.liu	8275fa597a	refactor(memory): promote on_session_reset to base provider hook Replace hasattr-forked OpenViking-specific paths with a proper base-class hook. Collapse the two agent wrappers into a single rotate_memory_session so callers don't orchestrate commit + rebind themselves. - MemoryProvider: add on_session_reset(new_session_id) as a default no-op - MemoryManager: on_session_reset fans out unconditionally (no hasattr, no builtin skip — base no-op covers it) - OpenViking: rename reset_session -> on_session_reset; drop the explicit POST /api/v1/sessions (OV auto-creates on first message) and the two debug raise_for_status wrappers - AIAgent: collapse commit_memory_session + reinitialize_memory_session into rotate_memory_session(new_sid, messages) - cli.py / run_agent.py: replace hasattr blocks and the split calls with a single unconditional rotate_memory_session call; compression path now passes the real messages list instead of [] - tests: align with on_session_reset, assert reset does NOT POST /sessions Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-15 11:28:45 -07:00
zhiheng.liu	7856d304f2	fix(openviking): commit session on /new and context compression The OpenViking memory provider extracts memories when its session is committed (POST /api/v1/sessions/{id}/commit). Before this fix, the CLI had two code paths that changed the active session_id without ever committing the outgoing OpenViking session: 1. /new (new_session() in cli.py) — called flush_memories() to write MEMORY.md, then immediately discarded the old session_id. The accumulated OpenViking session was never committed, so all context from that session was lost before extraction could run. 2. /compress and auto-compress (_compress_context() in run_agent.py) — split the SQLite session (new session_id) but left the OpenViking provider pointing at the old session_id with no commit, meaning all messages synced to OpenViking were silently orphaned. The gateway already handles session commit on /new and /reset via shutdown_memory_provider() on the cached agent; the CLI path did not. Fix: introduce a lightweight session-transition lifecycle alongside the existing full shutdown path: - OpenVikingMemoryProvider.reset_session(new_session_id): waits for in-flight background threads, resets per-session counters, and creates the new OV session via POST /api/v1/sessions — without tearing down the HTTP client (avoids connection overhead on /new). - MemoryManager.restart_session(new_session_id): calls reset_session() on providers that implement it; falls back to initialize() for providers that do not. Skips the builtin provider (no per-session state). - AIAgent.commit_memory_session(messages): wraps memory_manager.on_session_end() without shutdown — commits OV session for extraction but leaves the provider alive for the next session. - AIAgent.reinitialize_memory_session(new_session_id): wraps memory_manager.restart_session() — transitions all external providers to the new session after session_id has been assigned. 
Call sites: - cli.py new_session(): commit BEFORE session_id changes, reinitialize AFTER — ensuring OV extraction runs on the correct session and the new session is immediately ready for the next turn. - run_agent._compress_context(): same pattern, inside the if self._session_db: block where the session_id split happens. /compress and auto-compress are functionally identical at this layer: both call _compress_context(), so both are fixed by the same change. Tests added to tests/agent/test_memory_provider.py: - TestMemoryManagerRestartSession: reset_session() routing, builtin skip, initialize() fallback, failure tolerance, empty-manager noop. - TestOpenVikingResetSession: session_id update, per-session state clear, POST /api/v1/sessions call, API failure tolerance, no-client noop. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-15 11:28:45 -07:00
Teknium	f61cc464f0	fix: include thread_id in _parse_session_key and fix stale parts reference _parse_session_key() now extracts the optional 6th part (thread_id) from session keys, and _notify_active_sessions_of_shutdown uses _parsed.get() instead of the removed 'parts' variable. Without this, shutdown notifications silently failed (NameError caught by try/except) and forum topic routing was lost.	2026-04-15 11:16:01 -07:00
kshitijk4poor	2276b72141	fix: follow-up improvements for watch notification routing (#9537) - Populate watcher_* routing fields for watch-only processes (not just notify_on_complete), so watch-pattern events carry direct metadata instead of relying solely on session_key parsing fallback - Extract _parse_session_key() helper to dedupe session key parsing at two call sites in gateway/run.py - Add negative test proving cross-thread leakage doesn't happen - Add edge-case tests for _build_process_event_source returning None (empty evt, invalid platform, short session_key) - Add unit tests for _parse_session_key helper	2026-04-15 11:16:01 -07:00
etcircle	dee592a0b1	fix(gateway): route synthetic background events by session	2026-04-15 11:16:01 -07:00
kshitij	da448d4fce	test(cron): add regression test for credential_files ContextVar propagation (#10462) Follow-up to #10459 (salvage of #7527). The copy_context() fix propagates ALL ContextVars into the cron worker thread, including credential_files. This test verifies that skill-declared required_credential_files are visible inside the worker thread, matching the existing env_passthrough regression test.	2026-04-15 11:11:08 -07:00
helix4u	aa398ad655	fix(cron): preserve skill env passthrough in worker thread	2026-04-15 11:03:49 -07:00
Brooklyn Nicholson	cc15b55bb9	chore: uptick	2026-04-15 10:23:15 -05:00
Brooklyn Nicholson	371166fe26	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-15 10:21:00 -05:00
asheriif	33ae403890	fix(gateway): fix matrix lingering typing indicator	2026-04-15 04:16:16 -07:00
Teknium	47e6ea84bb	fix: file handle bug, warning text, and tests for Discord media send - Fix file handle closed before POST: nest session.post() inside the 'with open()' block so aiohttp can read the file during upload - Update warning text to include weixin (also supports media delivery) - Add 8 unit tests covering: text+media, media-only, missing files, upload failures, multiple files, and _send_to_platform routing	2026-04-15 04:16:06 -07:00
Teknium	1c4d3216d3	fix(cron): include job_id in delivery and guide models on removal workflow (#10242 ) * fix(gateway): suppress duplicate replies on interrupt and streaming flood control Three fixes for the duplicate reply bug affecting all gateway platforms: 1. base.py: Suppress stale response when the session was interrupted by a new message that hasn't been consumed yet. Checks both interrupt_event and _pending_messages to avoid false positives. (#8221, #2483) 2. run.py (return path): Remove response_previewed guard from already_sent check. Stream consumer's already_sent alone is authoritative — if content was delivered via streaming, the duplicate send must be suppressed regardless of the agent's response_previewed flag. (#8375) 3. run.py (queued-message path): Same fix — already_sent without response_previewed now correctly marks the first response as already streamed, preventing re-send before processing the queued message. The response_previewed field is still produced by the agent (run_agent.py) but is no longer required as a gate for duplicate suppression. The stream consumer's already_sent flag is the delivery-level truth about what the user actually saw. Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483. * fix(cron): include job_id in delivery and guide models on removal workflow Users reported cron reminders keep firing after asking the agent to stop. Root cause: the conversational agent didn't know the job_id (not in delivery) and models don't reliably do the list→remove two-step without guidance. 1. Include job_id in the cron delivery wrapper so users and agents can reference it when requesting removal. 2. Replace confusing footer ('The agent cannot see this message') with actionable guidance ('To stop or manage this job, send me a new message'). 3. Add explicit list→remove guidance in the cronjob tool schema so models know to list first and never guess job IDs.	2026-04-15 03:46:58 -07:00
Teknium	2546b7acea	fix(gateway): suppress duplicate replies on interrupt and streaming flood control Three fixes for the duplicate reply bug affecting all gateway platforms: 1. base.py: Suppress stale response when the session was interrupted by a new message that hasn't been consumed yet. Checks both interrupt_event and _pending_messages to avoid false positives. (#8221, #2483) 2. run.py (return path): Remove response_previewed guard from already_sent check. Stream consumer's already_sent alone is authoritative — if content was delivered via streaming, the duplicate send must be suppressed regardless of the agent's response_previewed flag. (#8375) 3. run.py (queued-message path): Same fix — already_sent without response_previewed now correctly marks the first response as already streamed, preventing re-send before processing the queued message. The response_previewed field is still produced by the agent (run_agent.py) but is no longer required as a gate for duplicate suppression. The stream consumer's already_sent flag is the delivery-level truth about what the user actually saw. Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483.	2026-04-15 03:42:24 -07:00
Teknium	a4e1842f12	fix: strip reasoning item IDs from Responses API input when store=False (#10217) With store=False (our default for the Responses API), the API does not persist response items. When reasoning items with 'id' fields were replayed on subsequent turns, the API attempted a server-side lookup for those IDs and returned 404: Item with id 'rs_...' not found. Items are not persisted when store is set to false. The encrypted_content blob is self-contained for reasoning chain continuity — the id field is unnecessary and triggers the failed lookup. Fix: strip 'id' from reasoning items in both _chat_messages_to_responses_input (message conversion) and _preflight_codex_input_items (normalization layer). The id is still used for local deduplication but never sent to the API. Reported by @zuogl448 on GPT-5.4.	2026-04-15 03:19:43 -07:00
Teknium	e69526be79	fix(send_message): URL-encode Matrix room IDs and add Matrix to schema examples (#10151) Matrix room IDs contain ! and : which must be percent-encoded in URI path segments per the Matrix C-S spec. Without encoding, some homeservers reject the PUT request. Also adds 'matrix:!roomid:server.org' and 'matrix:@user:server.org' to the tool schema examples so models know the correct target format.	2026-04-15 00:10:59 -07:00
Teknium	180b14442f	test: add _parse_target_ref Matrix coverage for salvaged PR #6144	2026-04-15 00:08:14 -07:00
Ubuntu	da8bab77fb	fix(cli): restore messaging toolset for gateway platforms	2026-04-14 23:13:35 -07:00
Teknium	9932366f3c	feat(doctor): add Command Installation check for hermes bin symlink hermes doctor now checks whether the ~/.local/bin/hermes symlink exists and points to the correct venv entry point. With --fix, it creates or repairs the symlink automatically. Covers: - Missing symlink at ~/.local/bin/hermes (or $PREFIX/bin on Termux) - Symlink pointing to wrong target - Missing venv entry point (venv/bin/hermes or .venv/bin/hermes) - PATH warning when ~/.local/bin is not on PATH - Skipped on Windows (different mechanism) Addresses user report: 'python -m hermes_cli.main doesn't have an option to fix the local bin/install' 10 new tests covering all scenarios.	2026-04-14 23:13:11 -07:00
Teknium	029938fbed	fix(cli): defensive subparser routing for argparse bpo-9338 (#10113) On some Python versions, argparse fails to route subcommand tokens when the parent parser has nargs='?' optional arguments (--continue). The symptom: 'hermes model' produces 'unrecognized arguments: model' even though 'model' is a registered subcommand. Fix: when argv contains a token matching a known subcommand, set subparsers.required=True to force deterministic routing. If that fails (e.g. 'hermes -c model' where 'model' is consumed as the session name for --continue), fall back to the default optional-subparsers behaviour. Adds 13 tests covering all key argument combinations. Reported via user screenshot showing the exact error on an installed version with the model subcommand listed in usage but rejected at parse time.	2026-04-14 23:13:02 -07:00
Teknium	5d5d21556e	fix: sync client.api_key during UnicodeEncodeError ASCII recovery (#10090 ) The existing recovery block sanitized self.api_key and self._client_kwargs['api_key'] but did not update self.client.api_key. The OpenAI SDK stores its own copy of api_key and reads it dynamically via the auth_headers property on every request. Without this fix, the retry after sanitization would still send the corrupted key in the Authorization header, causing the same UnicodeEncodeError. The bug manifests when an API key contains Unicode lookalike characters (e.g. ʋ U+028B instead of v) from copy-pasting out of PDFs, rich-text editors, or web pages with decorative fonts. httpx hard-encodes all HTTP headers as ASCII, so the non-ASCII char in the Authorization header triggers the error. Adds TestApiKeyClientSync with two tests verifying: - All three key locations are synced after sanitization - Recovery handles client=None (pre-init) without crashing	2026-04-14 22:37:45 -07:00
Teknium	93fe4ead83	fix: warn on invalid context_length format in config.yaml (#10067) Previously, non-integer context_length values (e.g. '256K') in config.yaml were silently ignored, causing the agent to fall back to 128K auto-detection with no user feedback. This was confusing for users with custom LiteLLM endpoints expecting larger context. Now prints a clear stderr warning and logs at WARNING level when model.context_length or custom_providers[].models.<model>.context_length cannot be parsed as an integer, telling users to use plain integers (e.g. 256000 instead of '256K'). Reported by community user ChFarhan via Discord.	2026-04-14 22:14:27 -07:00
Teknium	a8b7db35b2	fix: interrupt agent immediately when user messages during active run (#10068 ) When a user sends a message while the agent is executing a task on the gateway, the agent is now interrupted immediately — not silently queued. Previously, messages were stored in _pending_messages with zero feedback to the user, potentially leaving them waiting 1+ hours. Root cause: Level 1 guard (base.py) intercepted all messages for active sessions and returned with no response. Level 2 (gateway/run.py) which calls agent.interrupt() was never reached. Fix: Expand _handle_active_session_busy_message to handle the normal (non-draining) case: 1. Call running_agent.interrupt(text) to abort in-flight tool calls and exit the agent loop at the next check point 2. Store the message as pending so it becomes the next turn once the interrupted run returns 3. Send a brief ack: 'Interrupting current task (10 min elapsed, iteration 21/60, running: terminal). I'll respond shortly.' 4. Debounce acks to once per 30s to avoid spam on rapid messages Reported by @Lonely__MH.	2026-04-14 22:07:28 -07:00
Brooklyn Nicholson	561cea0d4a	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-15 00:02:31 -05:00
Teknium	8548893d14	feat: entry-level Podman support — find_docker() + rootless entrypoint (#10066) - find_docker() now checks HERMES_DOCKER_BINARY env var first, then docker on PATH, then podman on PATH, then macOS known locations - Entrypoint respects HERMES_HOME env var (was hardcoded to /opt/data) - Entrypoint uses groupmod -o to tolerate non-unique GIDs (fixes macOS GID 20 conflict with Debian's dialout group) - Entrypoint makes chown best-effort so rootless Podman continues instead of failing with 'Operation not permitted' - 5 new tests covering env var override, podman fallback, precedence Based on work by alanjds (PR #3996) and malaiwah (PR #8115). Closes #4084.	2026-04-14 21:20:37 -07:00
Teknium	c5688e7c8b	fix(gateway): break compression-exhaustion infinite loop and auto-reset session (#9893 ) When compression fails after max attempts, the agent returns {completed: False, partial: True} but was missing the 'failed' flag. The gateway's agent_failed_early guard checked for 'failed' AND 'not final_response', but _run_agent_blocking always converts errors to final_response — making the guard dead code. This caused the oversized session to persist, creating an infinite fail loop where every subsequent message hits the same compression failure. Changes: - run_agent.py: add 'failed: True' and 'compression_exhausted: True' to all 5 compression-exhaustion return paths - gateway/run.py (_run_agent_blocking): forward 'failed' and 'compression_exhausted' flags through to the caller - gateway/run.py (_handle_message_with_agent): fix agent_failed_early to check bool(failed) without the broken 'not final_response' clause; auto-reset the session when compression is exhausted so the next message starts fresh - Update tests to match new guard logic and add TestCompressionExhaustedFlag test class Closes #9893	2026-04-14 21:18:17 -07:00
Greer Guthrie	4b2a1a4337	fix(tools): auto-discover built-in tool modules	2026-04-14 21:12:29 -07:00
Teknium	5cbb45d93e	fix: preserve session_id across previous_response_id chains in /v1/responses (#10059 ) The /v1/responses endpoint generated a new UUID session_id for every request, even when previous_response_id was provided. This caused each turn of a multi-turn conversation to appear as a separate session on the web dashboard, despite the conversation history being correctly chained. Fix: store session_id alongside the response in the ResponseStore, and reuse it when a subsequent request chains via previous_response_id. Applies to both the non-streaming /v1/responses path and the streaming SSE path. The /v1/runs endpoint also gains session continuity from stored responses (explicit body.session_id still takes priority). Adds test verifying session_id is preserved across chained requests.	2026-04-14 21:06:32 -07:00
Teknium	cf1d718823	fix: keep batch-path function_call_output.output as string per OpenAI spec The streaming path emits output as content-part arrays for Open WebUI compatibility, but the batch (non-streaming) Responses API path must return output as a plain string per the OpenAI Responses API spec. Reverts the _extract_output_items change from the cherry-picked commits while preserving the streaming path's array format.	2026-04-14 20:51:52 -07:00
simon-marcus	302554b158	fix(api-server): format responses tool outputs for open webui	2026-04-14 20:51:52 -07:00
simon-marcus	d6c09ab94a	feat(api-server): stream /v1/responses SSE tool events	2026-04-14 20:51:52 -07:00
Brooklyn Nicholson	496bfb3c59	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-14 22:30:22 -05:00
Teknium	da528a8207	fix: detect and strip non-ASCII characters from API keys (#6843 ) API keys containing Unicode lookalike characters (e.g. ʋ U+028B instead of v) cause UnicodeEncodeError when httpx encodes the Authorization header as ASCII. This commonly happens when users copy-paste keys from PDFs, rich-text editors, or web pages with decorative fonts. Three layers of defense: 1. Save-time validation (hermes_cli/config.py): _check_non_ascii_credential() strips non-ASCII from credential values when saving to .env, with a clear warning explaining the issue. 2. Load-time sanitization (hermes_cli/env_loader.py): _sanitize_loaded_credentials() strips non-ASCII from credential env vars (those ending in _API_KEY, _TOKEN, _SECRET, _KEY) after dotenv loads them, so the rest of the codebase never sees non-ASCII keys. 3. Runtime recovery (run_agent.py): The UnicodeEncodeError recovery block now also sanitizes self.api_key and self._client_kwargs['api_key'], fixing the gap where message/tool sanitization succeeded but the API key still caused httpx to fail on the Authorization header. Also: hermes_logging.py RotatingFileHandler now explicitly sets encoding='utf-8' instead of relying on locale default (defensive hardening for ASCII-locale systems).	2026-04-14 20:20:31 -07:00
Brooklyn Nicholson	77cd5bf565	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-14 19:33:03 -05:00
Greer Guthrie	c10fea8d26	fix(mcp): make server aliases explicit	2026-04-14 17:19:20 -07:00
Greer Guthrie	cda64a5961	fix(mcp): resolve toolsets from live registry	2026-04-14 17:19:20 -07:00
Teknium	2a98098035	fix: hermes gateway restart waits for service to come back up (#8260 ) Previously, systemd_restart() sent SIGUSR1 to the gateway, printed 'restart requested', and returned immediately. The gateway still needed to drain active agents, exit with code 75, wait for systemd's RestartSec=30, and start the new process. The user saw 'success' but the gateway was actually down for 30-60 seconds. Now the SIGUSR1 path blocks with progress feedback: Phase 1 — wait for old process to die: ⏳ User service draining active work... Polls os.kill(pid, 0) until ProcessLookupError (up to 90s) Phase 2 — wait for new process to become active: ⏳ Waiting for hermes-gateway to restart... Polls systemctl is-active + verifies new PID (up to 60s) Success: ✓ User service restarted (PID 12345) Timeout: ⚠ User service did not become active within 60s. Check status: hermes gateway status Check logs: journalctl --user -u hermes-gateway --since '2 min ago' The reload-or-restart fallback path (line 1189) already blocks because systemctl reload-or-restart is synchronous. Test plan: - Updated test to verify wait-for-restart behavior - All 118 gateway CLI tests pass	2026-04-14 17:12:58 -07:00
Teknium	6c89306437	fix: break stuck session resume loops after repeated restarts (#7536 ) When a session gets stuck (hung terminal, runaway tool loop) and the user restarts the gateway, the same session history loads and puts the agent right back in the stuck state. The user is trapped in a loop: restart → stuck → restart → stuck. Fix: track restart-failure counts per session using a simple JSON file (.restart_failure_counts). On each shutdown with active agents, the counter increments for those sessions. On startup, if any session has been active across 3+ consecutive restarts, it's auto-suspended — giving the user a clean slate on their next message. The counter resets to 0 when a session completes a turn successfully (response delivered), so normal sessions that happen to be active during planned restarts (/restart, hermes update) won't accumulate false counts. Implementation: - _increment_restart_failure_counts(): called during stop() when agents are active. Writes {session_key: count} to JSON file. Sessions NOT active are dropped (loop broken). - _suspend_stuck_loop_sessions(): called on startup. Reads the file, suspends sessions at threshold (3), clears the file. - _clear_restart_failure_count(): called after successful response delivery. Removes the session from the counter file. No SessionEntry schema changes. No database migration. Pure file-based tracking that naturally cleans up. Test plan: - 9 new stuck-loop tests (increment, accumulate, threshold, clear, suspend, file cleanup, edge cases) - All 28 gateway lifecycle tests pass (restart drain + auto-continue + stuck loop)	2026-04-14 17:08:35 -07:00
Teknium	e7475b1582	feat: auto-continue interrupted agent work after gateway restart (#4493 ) When the gateway restarts mid-agent-work, the session transcript ends on a tool result the agent never processed. Previously, the user had to type 'continue' or use /retry (which replays from scratch, losing all prior work). Now, when the next user message arrives and the loaded history ends with role='tool', a system note is prepended: [System note: Your previous turn was interrupted before you could process the last tool result(s). Please finish processing those results and summarize what was accomplished, then address the user's new message below.] This is injected in _run_agent()'s run_sync closure, right before calling agent.run_conversation(). The agent sees the full history (including the pending tool results) and the system note, so it can summarize what was accomplished and then handle the user's new input. Design decisions: - No new session flags or schema changes — purely detects trailing tool messages in the loaded history - Works for any restart scenario (clean, crash, SIGTERM, drain timeout) as long as the session wasn't suspended (suspended = fresh start) - The user's actual message is preserved after the note - If the session WAS suspended (unclean shutdown), the old history is abandoned and the user starts fresh — no false auto-continue Also updates the shutdown notification message from 'Use /retry after restart to continue' to 'Send any message after restart to resume where it left off' — which is now accurate. Test plan: - 6 new auto-continue tests (trailing tool detection, no false positives for assistant/user/empty history, multi-tool, message preservation) - All 13 restart drain tests pass (updated /retry assertion)	2026-04-14 16:56:49 -07:00
adybag14-cyber	56c34ac4f7	fix(browser): add termux PATH fallbacks Refactor browser tool PATH construction to include Termux directories (/data/data/com.termux/files/usr/bin, /data/data/com.termux/files/usr/sbin) so agent-browser and npx are discoverable on Android/Termux. Extracts _browser_candidate_path_dirs() and _merge_browser_path() helpers to centralize PATH construction shared between _find_agent_browser() and _run_browser_command(), replacing duplicated inline logic. Also fixes os.pathsep usage (was hardcoded ':') for cross-platform correctness. Cherry-picked from PR #9846.	2026-04-14 16:55:55 -07:00
areu01or00	cfa24532d3	fix(discord): register native /restart slash command	2026-04-14 16:55:48 -07:00
Teknium	10494b42a1	feat(discord): register skills under /skill command group with category subcommands (#9909) Instead of consuming one top-level slash command slot per skill (hitting the 100-command limit with ~26 built-ins + 74 skills), skills are now organized under a single /skill group command with category-based subcommand groups: /skill creative ascii-art [args] /skill media gif-search [args] /skill mlops axolotl [args] Discord supports 25 subcommand groups × 25 subcommands = 625 max skills, well beyond the previous 74-slot ceiling. Categories are derived from the skill directory structure: - skills/creative/ascii-art/ → category 'creative' - skills/mlops/training/axolotl/ → category 'mlops' (top-level parent) - skills/dogfood/ → uncategorized (direct subcommand) Changes: - hermes_cli/commands.py: add discord_skill_commands_by_category() with category grouping, hub/disabled filtering, Discord limit enforcement - gateway/platforms/discord.py: replace top-level skill registration with _register_skill_group() using app_commands.Group hierarchy - tests: 7 new tests covering group creation, category grouping, uncategorized skills, hub exclusion, deep nesting, empty skills, and handler dispatch Inspired by Discord community suggestion from bottium.	2026-04-14 16:27:02 -07:00
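The category rule in the commit above (top-level parent directory, or uncategorized when the skill sits directly under skills/) can be illustrated with a small sketch. The helper name skill_category is hypothetical and is not the function added in hermes_cli/commands.py.

```python
from pathlib import PurePosixPath
from typing import Optional


def skill_category(skill_dir: str) -> Optional[str]:
    """Return the top-level category of a skill directory, or None if uncategorized."""
    parts = PurePosixPath(skill_dir).parts
    # Drop the leading 'skills' root when present.
    rel = parts[1:] if parts and parts[0] == "skills" else parts
    # One component means the skill lives directly under skills/.
    return rel[0] if len(rel) >= 2 else None
```

With the three paths from the commit message, this gives 'creative', 'mlops', and None respectively.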
Brooklyn Nicholson	bf54f1fb2f	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-14 18:26:05 -05:00
Teknium	1525624904	fix: block agent from self-destructing gateway via terminal (#6666) Add dangerous command patterns that require approval when the agent tries to run gateway lifecycle commands via the terminal tool: - hermes gateway stop/restart — kills all running agents mid-work - hermes update — pulls code and restarts the gateway - systemctl restart/stop (with optional flags like --user) These patterns fire the approval prompt so the user must explicitly approve before the agent can kill its own gateway process. In YOLO mode, the commands run without approval (by design — YOLO means the user accepts all risks). Also fixes the existing systemctl pattern to handle flags between the command and action (e.g. 'systemctl --user restart' was previously undetected because the regex expected the action immediately after 'systemctl'). Root cause: issue #6666 reported agents running 'hermes gateway restart' via terminal, killing the gateway process mid-agent-loop. The user sees the agent suddenly stop responding with no explanation. Combined with the SIGTERM auto-recovery from PR #9875, the gateway now both prevents accidental self-destruction AND recovers if it happens anyway. Test plan: - Updated test_systemctl_restart_not_flagged → test_systemctl_restart_flagged - All 119 approval tests pass - E2E verified: hermes gateway restart, hermes update, systemctl --user restart all detected; hermes gateway status, systemctl status remain safe	2026-04-14 15:43:31 -07:00
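A sketch of how such patterns could be written, including the optional-flag handling for systemctl. The regexes and the needs_approval name are illustrative assumptions; the project's actual patterns are not shown in the commit message.

```python
import re

# Hypothetical patterns; the real approval list may differ.
DANGEROUS_PATTERNS = [
    re.compile(r"\bhermes\s+gateway\s+(?:stop|restart)\b"),
    re.compile(r"\bhermes\s+update\b"),
    # Allow any number of flags (e.g. --user) between 'systemctl' and the action.
    re.compile(r"\bsystemctl\s+(?:-\S+\s+)*(?:restart|stop)\b"),
]


def needs_approval(command: str) -> bool:
    """True when the command matches a gateway-lifecycle pattern."""
    return any(p.search(command) for p in DANGEROUS_PATTERNS)
```

The `(?:-\S+\s+)*` group is the part that addresses the bug described above: without it, 'systemctl --user restart' slips past a pattern that expects the action immediately after 'systemctl'.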
Teknium	353b5bacbd	test: add tests for /health/detailed endpoint and gateway health probe - TestHealthDetailedEndpoint: 3 tests for the new API server endpoint (returns runtime data, handles missing status, no auth required) - TestProbeGatewayHealth: 5 tests for _probe_gateway_health() (URL normalization, successful/failed probes, fallback chain) - TestStatusRemoteGateway: 4 tests for /api/status remote fallback (remote probe triggers, skipped when local PID found, null PID handling)	2026-04-14 15:41:30 -07:00
Teknium	eed891f1bb	security: supply chain hardening — CI pinning, dep pinning, and code fixes (#9801) CI/CD Hardening: - Pin all 12 GitHub Actions to full commit SHAs (was mutable @vN tags) - Add explicit permissions: {contents: read} to 4 workflows - Pin CI pip installs to exact versions (pyyaml==6.0.2, httpx==0.28.1) - Extend supply-chain-audit.yml to scan workflow, Dockerfile, dependency manifest, and Actions version changes Dependency Pinning: - Pin git-based Python deps to commit SHAs (atroposlib, tinker, yc-bench) - Pin WhatsApp Baileys from mutable branch to commit SHA Tool Registry: - Reject tool name shadowing from different tool families (plugins/MCP cannot overwrite built-in tools). MCP-to-MCP overwrites still allowed. MCP Security: - Add tool description content scanning for prompt injection patterns - Log detailed change diff on dynamic tool refresh at WARNING level Skill Manager: - Fix dangerous verdict bug: agent-created skills with dangerous findings were silently allowed (ask->None->allow). Now blocked.	2026-04-14 14:23:37 -07:00
Roy-oss1	1aa76620d4	fix(feishu): keep approval clicks synchronized with callback card state Feishu approval clicks need the resolved card to come back from the synchronous callback path itself. Leaving approval resolution to the generic asynchronous card-action flow made button feedback depend on later loop work instead of the callback response the client is waiting for. Change-Id: I574997cbbcaa097fdba759b47367e28d1b56b040 Constraint: Feishu card-action callbacks must acknowledge quickly and reflect final approval state from the callback response path Rejected: Keep approval handling on the generic async card-action route | leaves card state synchronization vulnerable to callback timing and follow-up update ordering Confidence: high Scope-risk: narrow Reversibility: clean Directive: Keep approval callback response construction separate from async queue unblocking unless Feishu callback semantics change Tested: pytest tests/gateway/test_feishu.py tests/gateway/test_feishu_approval_buttons.py tests/gateway/test_approve_deny_commands.py tests/gateway/test_slack_approval_buttons.py tests/gateway/test_telegram_approval_buttons.py -q Not-tested: Live Feishu workspace end-to-end callback rendering	2026-04-14 14:22:11 -07:00
Teknium	fa8c448f7d	fix: notify active sessions on gateway shutdown + update health check Three fixes for gateway lifecycle stability: 1. Notify active sessions before shutdown (#new) When the gateway receives SIGTERM or /restart, it now sends a notification to every chat with an active agent BEFORE starting the drain. Users see: - Shutdown: 'Gateway shutting down — your task will be interrupted.' - Restart: 'Gateway restarting — use /retry after restart to continue.' Deduplicates per-chat so group sessions with multiple users get one notification. Best-effort: send failures are logged and swallowed. 2. Skip .clean_shutdown marker when drain timed out Previously, a graceful SIGTERM always wrote .clean_shutdown, even if agents were force-interrupted when the drain timed out. This meant the next startup skipped session suspension, leaving interrupted sessions in a broken state (trailing tool response, no final message). Now the marker is only written if the drain completed without timeout, so interrupted sessions get properly suspended on next startup. 3. Post-restart health check for hermes update (#6631) cmd_update() now verifies the gateway actually survived after systemctl restart (sleep 3s + is-active check). If the service crashed immediately, it retries once. If still dead, prints actionable diagnostics (journalctl command, manual restart hint). Also closes #8104 — already fixed on main (the /restart handler correctly detects systemd via INVOCATION_ID and uses via_service=True). Test plan: - 6 new tests for shutdown notifications (dedup, restart vs shutdown messaging, sentinel filtering, send failure resilience) - Existing restart drain + update tests pass (47 total)	2026-04-14 14:21:57 -07:00
Teknium	a37a095980	fix: detect qwen-oauth provider via CLI tokens in /model picker Seed qwen-oauth credentials from resolve_qwen_runtime_credentials() in _seed_from_singletons(). Users who authenticate via 'qwen auth qwen-oauth' store tokens in ~/.qwen/oauth_creds.json which the runtime resolver reads but the credential pool couldn't detect — same gap pattern as copilot. Uses refresh_if_expiring=False to avoid network calls during discovery.	2026-04-14 11:16:26 -07:00
Marvae	0bd3f521ae	fix: detect copilot provider via gh auth token in /model picker Seed copilot credentials from resolve_copilot_token() in the credential pool's _seed_from_singletons(), alongside the existing anthropic and openai-codex seeding logic. This makes copilot appear in the /model provider picker when the user authenticates solely through gh auth token. Cherry-picked from PR #9767 by Marvae.	2026-04-14 11:16:26 -07:00
Teknium	3e0bccc54c	fix: update existing webhook tests to use _webhook_register_url Follow-up for cherry-picked PR #9746 — three pre-existing tests used adapter._webhook_url (bare URL) in mock data, but _register_webhook and _unregister_webhook now compare against _webhook_register_url (password-bearing URL). Updated to match.	2026-04-14 11:02:48 -07:00
cypres0099	326cbbe40e	fix(gateway/bluebubbles): embed password in registered webhook URL for inbound auth When BlueBubbles posts webhook events to the adapter, it uses the exact URL registered via /api/v1/webhook — and BB's registration API does not support custom headers. The adapter currently registers the bare URL (no credentials), but then requires password auth on inbound POSTs, rejecting every webhook with HTTP 401. This is masked on fresh BB installs by a race condition: the webhook might register once with a prior (possibly patched) URL and keep working until the first restart. On v0.9.0, _unregister_webhook runs on clean shutdown, so the next startup re-registers with the bare URL and the 401s begin. Users see the bot go silent with no obvious cause. Root cause: there's no way to pass auth credentials from BB to the webhook handler except via the URL itself. BB accepts query params and preserves them on outbound POSTs. ## Fix Introduce `_webhook_register_url` — the URL handed to BB's registration API, with the configured password appended as a `?password=<value>` query param. The existing webhook auth handler already accepts this form (it reads `request.query.get("password")`), so no change to the receive side is needed. The bare `_webhook_url` is still used for logging and for binding the local listener, so credentials don't leak into log output. Only the registration/find/unregister paths use the password-bearing form. ## Notes - Password is URL-encoded via urllib.parse.quote, handling special characters (&, *, @, etc.) that would otherwise break parsing. - Storing the password in BB's webhook table is not a new disclosure: anyone with access to that table already has the BB admin password (same credential used for every other API call). - If `self.password` is empty (no auth configured), the register URL is the bare URL — preserves current behavior for unauthenticated local-only setups. 
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 11:02:48 -07:00
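The registration URL described in the commit above could be built along these lines. The function name and the use of safe="" are assumptions for illustration; the commit only states that urllib.parse.quote is used and that an empty password leaves the bare URL unchanged. The sketch also assumes the bare URL carries no query string of its own.

```python
from urllib.parse import quote


def build_register_url(webhook_url: str, password: str) -> str:
    """Append the password as a query parameter; keep the bare URL when unset."""
    if not password:
        return webhook_url
    # safe="" percent-encodes every reserved character, including & and @.
    return f"{webhook_url}?password={quote(password, safe='')}"
```

Keeping the bare URL for logging and listener binding, as the commit describes, means only this derived string ever contains the credential.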
cypres0099	8b52356849	fix(gateway/bluebubbles): fall back to data.chats[0].guid when chatGuid missing BlueBubbles v1.9+ webhook payloads for new-message events do not always include a top-level chatGuid field on the message data object. Instead, the chat GUID is nested under data.chats[0].guid. The adapter currently checks five top-level fallback locations (record and payload, snake_case and camelCase, plus payload.guid) but never looks inside the chats array. When none of those top-level fields contain the GUID, the adapter falls through to using the sender's phone/email as the session chat ID. This causes two observable bugs when a user is a participant in both a DM and a group chat with the bot: 1. DM and group sessions merge. Every message from that user ends up with the same session_chat_id (their own address), so the bot cannot distinguish which thread the message came from. 2. Outbound routing becomes ambiguous. _resolve_chat_guid() iterates all chats and returns the first one where the address appears as a participant; group chats typically sort ahead of DMs by activity, so replies and cron messages intended for the DM can land in a group. This was observed in production: a user's morning brief cron delivered to a group chat with his spouse instead of his DM thread. The fix adds a single fallback that extracts chat_guid from record["chats"][0]["guid"] when the top-level fields are empty. The chats array is included in every new-message webhook payload in BB v1.9.9 (verified against a live server). It is backwards compatible: if a future BB version starts including chatGuid at the top level, that still wins. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 11:02:48 -07:00
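The fallback order in the commit above can be sketched as below. The function name, the exact key spellings, and the order of the five top-level lookups are assumptions; the commit confirms only that top-level fields win and that record["chats"][0]["guid"] is consulted last.

```python
from typing import Optional


def extract_chat_guid(record: dict, payload: dict) -> Optional[str]:
    """Prefer top-level GUID fields, then fall back to the first entry in chats."""
    top_level = (
        record.get("chatGuid"),
        record.get("chat_guid"),
        payload.get("chatGuid"),
        payload.get("chat_guid"),
        payload.get("guid"),
    )
    for value in top_level:
        if value:
            return value
    chats = record.get("chats") or []
    if chats and isinstance(chats[0], dict):
        return chats[0].get("guid") or None
    return None
```

Returning None rather than the sender address leaves the caller to decide whether to fall back to the sender, which is the step that caused DM and group sessions to merge.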
Teknium	99bcc2de5b	fix(security): harden dashboard API against unauthenticated access (#9800) Addresses responsible disclosure from FuzzMind Security Lab (CVE pending). The web dashboard API server had 36 endpoints, of which only 5 checked the session token. The token itself was served from an unauthenticated GET /api/auth/session-token endpoint, rendering the protection circular. When bound to 0.0.0.0 (--host flag), all API keys, config, and cron management were accessible to any machine on the network. Changes: - Add auth middleware requiring session token on ALL /api/ routes except a small public whitelist (status, config/defaults, config/schema, model/info) - Remove GET /api/auth/session-token endpoint entirely; inject the token into index.html via a <script> tag at serve time instead - Replace all inline token comparisons (!=) with hmac.compare_digest() to prevent timing side-channel attacks - Block non-localhost binding by default; require --insecure flag to override (with warning log) - Update frontend fetchJSON() to send Authorization header on all requests using the injected window.__HERMES_SESSION_TOKEN__ Credit: Callum (@0xca1x) and @migraine-sudo at FuzzMind Security Lab	2026-04-14 10:57:56 -07:00
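A framework-free sketch of the middleware check described above. The "Bearer " header scheme, the exact whitelist paths, and the function name are assumptions; the commit confirms only the Authorization header, the four whitelisted endpoints by short name, and the use of hmac.compare_digest().

```python
import hmac
from typing import Optional

# Assumed full paths for the whitelist named in the commit message.
PUBLIC_PATHS = {
    "/api/status",
    "/api/config/defaults",
    "/api/config/schema",
    "/api/model/info",
}


def is_authorized(path: str, auth_header: Optional[str], session_token: str) -> bool:
    """Require the session token on every /api/ route outside the whitelist."""
    if not path.startswith("/api/") or path in PUBLIC_PATHS:
        return True
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return False
    supplied = auth_header[len(prefix):]
    # Constant-time comparison avoids leaking how many leading bytes matched.
    return hmac.compare_digest(supplied.encode(), session_token.encode())
```

Putting the check in one place, rather than in each handler, is what closes the gap where only a handful of endpoints verified the token.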
asheriif	b583210c97	fix(gateway): fix regression causing display.streaming to override root streaming key	2026-04-14 10:52:23 -07:00
Teknium	90c98345c9	feat: gateway proxy mode — forward messages to remote API server When GATEWAY_PROXY_URL (or gateway.proxy_url in config.yaml) is set, the gateway becomes a thin relay: it handles platform I/O (encryption, threading, media) and delegates all agent work to a remote Hermes API server via POST /v1/chat/completions with SSE streaming. This enables the primary use case of running a Matrix E2EE gateway in Docker on Linux while the actual agent runs on the host (e.g. macOS) with full access to local files, memory, skills, and a unified session store. Works for any platform adapter, not just Matrix. Configuration: - GATEWAY_PROXY_URL env var (Docker-friendly) - gateway.proxy_url in config.yaml - GATEWAY_PROXY_KEY env var for API auth (matches API_SERVER_KEY) - X-Hermes-Session-Id header for session continuity Architecture: - _get_proxy_url() checks env var first, then config.yaml - _run_agent_via_proxy() handles HTTP forwarding with SSE streaming - _run_agent() delegates to proxy path when URL is configured - Platform streaming (GatewayStreamConsumer) works through proxy - Returns compatible result dict for session store recording Files changed: - gateway/run.py: proxy mode implementation (~250 lines) - hermes_cli/config.py: GATEWAY_PROXY_URL + GATEWAY_PROXY_KEY env vars - tests/gateway/test_proxy_mode.py: 17 tests covering config resolution, dispatch, HTTP forwarding, error handling, message filtering, and result shape validation Closes discussion from Cars29 re: Matrix gateway mixed-mode issue.	2026-04-14 10:49:48 -07:00
Disaster-Terminator	9bdfcd1b93	feat: sort tool search results by score and add corresponding unit test	2026-04-14 10:49:35 -07:00
Teknium	b867171291	fix: preserve profile name completion in dynamic shell completion The dynamic parser walker from the contributor's commit lost the profile name tab-completion that existed in the old static generators. This adds it back for all three shells: - Bash: _hermes_profiles() helper, -p/--profile completion, profile action→name completion (use/delete/show/alias/rename/export) - Zsh: _hermes_profiles() function, -p/--profile argument spec, profile action case with name completion - Fish: __hermes_profiles function, -s p -l profile flag, profile action completions Also removes the dead fallback path in cmd_completion() that imported the old static generators from profiles.py (parser is always available via the lambda wiring) and adds 11 regression-prevention tests for profile completion.	2026-04-14 10:45:42 -07:00
leozeli	a686dbdd26	feat(cli): add dynamic shell completion for bash, zsh, and fish Replaces the hardcoded completion stubs in profiles.py with a dynamic generator that walks the live argparse parser tree at runtime. - New hermes_cli/completion.py: _walk() recursively extracts all subcommands and flags; generate_bash/zsh/fish() produce complete scripts with nested subcommand support - cmd_completion now accepts the parser via closure so completions always reflect the actual registered commands (including plugin-registered ones like honcho) - completion subcommand now accepts bash | zsh | fish (fish requested in issue comments) - Fix _SUBCOMMANDS set: add honcho, claw, plugins, acp, webhook, memory, dump, debug, backup, import, completion, logs so that multi-word session names after -c/-r are not broken by these commands - Add tests/hermes_cli/test_completion.py: 17 tests covering parser extraction, alias deduplication, bash/zsh/fish output content, bash syntax validation, fish syntax validation, and subcommand drift prevention Tested on Linux (Arch). bash and fish completion verified live. zsh script passes syntax check (zsh not installed on test machine). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 10:45:42 -07:00
N0nb0at	b21b3bfd68	feat(plugins): namespaced skill registration for plugin skill bundles Add ctx.register_skill() API so plugins can ship SKILL.md files under a 'plugin:skill' namespace, preventing name collisions with built-in Hermes skills. skill_view() detects the ':' separator and routes to the plugin registry while bare names continue through the existing flat-tree scan unchanged. Key additions: - agent/skill_utils: parse_qualified_name(), is_valid_namespace() - hermes_cli/plugins: PluginContext.register_skill(), PluginManager skill registry (find/list/remove) - tools/skills_tool: qualified name dispatch in skill_view(), _serve_plugin_skill() with full guards (disabled, platform, injection scan), bundle context banner with sibling listing, stale registry self-heal - Hoisted _INJECTION_PATTERNS to module level (dedup) - Updated skill_view schema description Based on PR #9334 by N0nb0at. Lean P1 salvage — omits autogen shim (P2) for a simpler first merge. Closes #8422	2026-04-14 10:42:58 -07:00
Dusk1e	4b47856f90	fix: load credentials from HERMES_HOME .env in trajectory_compressor	2026-04-14 10:24:19 -07:00
Teknium	8ea9ceb44c	fix: guard reply_to_text against DeletedReferencedMessage Use getattr() for resolved.content since discord.py's DeletedReferencedMessage lacks a content attribute. Adds test for the deleted-message edge case.	2026-04-14 10:22:11 -07:00
ChimingLiu	7636baf49c	feat(discord): extract reply text from message references	2026-04-14 10:22:11 -07:00
Dusk1e	420d27098f	fix(tools): keep memory tool available when fcntl is unavailable	2026-04-14 10:18:05 -07:00
Zhuofeng Wang	449c17e9a9	fix(gateway): support Telegram MarkdownV2 expandable blockquotes	2026-04-14 10:16:49 -07:00
shijianzhi	70611879de	fix(cli): fix doctor checks for Kimi China credentials	2026-04-14 10:16:30 -07:00
Brooklyn Nicholson	9a3a2925ed	feat: scroll aware sticky prompt	2026-04-14 11:49:32 -05:00
Teknium	7ad47ace51	fix: resolve remaining 4 CI test failures (#9543) - test_auth_commands: suppress _seed_from_singletons auto-seeding that adds extra credentials from CI env (same pattern as nearby tests) - test_interrupt: clear stale _interrupted_threads set to prevent thread ident reuse from prior tests in same xdist worker - test_code_execution: add watch_patterns to _BLOCKED_TERMINAL_PARAMS to match production _TERMINAL_BLOCKED_PARAMS	2026-04-14 02:18:38 -07:00
Teknium	b4fcec6412	fix: prevent streaming cursor from appearing as standalone messages (#9538) During rapid tool-calling, the model often emits 1-2 tokens before switching to tool calls. The stream consumer would create a new message with 'X ▉' (short text + cursor), and if the follow-up edit to strip the cursor was rate-limited by the platform, the cursor remained as a permanent standalone message — reported on Telegram as 'white box' artifacts. Add a minimum-content guard in _send_or_edit: when creating a new standalone message (no existing message_id), require at least 4 visible characters alongside the cursor before sending. Shorter text accumulates into the next streaming segment instead. This prevents cursor-only 'tofu' messages across all platforms without affecting normal streaming (edits to existing messages, final sends without cursor, and messages with substantial text are all unaffected). Reported by @michalkomar on X.	2026-04-14 01:52:42 -07:00
Teknium	2558d28a9b	fix: resolve CI test failures — add missing functions, fix stale tests (#9483) Production fixes: - Add clear_session_context() to hermes_logging.py (fixes 48 teardown errors) - Add clear_session() to tools/approval.py (fixes 9 setup errors) - Add SyncError M_UNKNOWN_TOKEN check to Matrix _sync_loop (bug fix) - Fall back to inline api_key in named custom providers when key_env is absent (runtime_provider.py) Test fixes: - test_memory_user_id: use builtin+external provider pair, fix honcho peer_name override test to match production behavior - test_display_config: remove TestHelpers for non-existent functions - test_auxiliary_client: fix OAuth tokens to match _is_oauth_token patterns, replace get_vision_auxiliary_client with resolve_vision_provider_client - test_cli_interrupt_subagent: add missing _execution_thread_id attr - test_compress_focus: add model/provider/api_key/base_url/api_mode to mock compressor - test_auth_provider_gate: add autouse fixture to clean Anthropic env vars that leak from CI secrets - test_opencode_go_in_model_list: accept both 'built-in' and 'hermes' source (models.dev API unavailable in CI) - test_email: verify email Platform enum membership instead of source inspection (build_channel_directory now uses dynamic enum loop) - test_feishu: add bot_added/bot_deleted handler mocks to _Builder - test_ws_auth_retry: add AsyncMock for sync_store.get_next_batch, add _pending_megolm and _joined_rooms to Matrix adapter mocks - test_restart_drain: monkeypatch-delete INVOCATION_ID (systemd sets this in CI, changing the restart call signature) - test_session_hygiene: add user_id to SessionSource - test_session_env: use relative baseline for contextvar clear check (pytest-xdist workers share context)	2026-04-14 01:43:45 -07:00
Jiawen-lee	2cfd2dafc6	feat(gateway): add ignored_threads config for Telegram	2026-04-14 01:40:32 -07:00
Teknium	4654f75627	fix: QQBot missing integration points, timestamp parsing, test fix - Add Platform.QQBOT to _UPDATE_ALLOWED_PLATFORMS (enables /update command) - Add 'qqbot' to webhook cross-platform delivery routing - Add 'qqbot' to hermes dump platform detection - Fix test_name_property casing: 'QQBot' not 'QQBOT' - Add _parse_qq_timestamp() for ISO 8601 + integer ms compatibility (QQ API changed timestamp format — from PR #2411 finding) - Wire timestamp parsing into all 4 message handlers	2026-04-14 00:11:49 -07:00
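A timestamp parser that accepts both formats mentioned in the commit above might look like this. It is a sketch under assumptions: the name parse_timestamp is hypothetical (not the project's _parse_qq_timestamp), integers are taken to be milliseconds since the Unix epoch, and naive ISO strings are treated as UTC.

```python
from datetime import datetime, timezone


def parse_timestamp(value) -> datetime:
    """Accept integer milliseconds or an ISO 8601 string; return an aware datetime."""
    text = str(value).strip()
    if text.isdigit():
        # Integer form: milliseconds since the Unix epoch (assumed).
        return datetime.fromtimestamp(int(text) / 1000, tz=timezone.utc)
    # ISO 8601 form; map a trailing 'Z' to an explicit UTC offset.
    parsed = datetime.fromisoformat(text.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
```

Normalizing both inputs to timezone-aware values lets every message handler compare timestamps without caring which format the API sent.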
walli	884cd920d4	feat(gateway): unify QQBot branding, add PLATFORM_HINTS, fix streaming, restore missing setup functions - Rename platform from 'qq' to 'qqbot' across all integration points (Platform enum, toolset, config keys, import paths, file rename qq.py → qqbot.py) - Add PLATFORM_HINTS for QQBot in prompt_builder (QQ supports markdown) - Set SUPPORTS_MESSAGE_EDITING = False to skip streaming on QQ (prevents duplicate messages from non-editable partial + final sends) - Add _send_qqbot() standalone send function for cron/send_message tool - Add interactive _setup_qq() wizard in hermes_cli/setup.py - Restore missing _setup_signal/email/sms/dingtalk/feishu/wecom/wecom_callback functions that were lost during the original merge	2026-04-14 00:11:49 -07:00
Junjun Zhang	87bfc28e70	feat: add QQ Bot platform adapter (Official API v2) Add full QQ Bot integration via the Official QQ Bot API (v2): - WebSocket gateway for inbound events (C2C, group, guild, DM) - REST API for outbound text/markdown/media messages - Voice transcription (Tencent ASR + configurable STT provider) - Attachment processing (images, voice, files) - User authorization (allowlist + allow-all + DM pairing) Integration points: - gateway: Platform.QQ enum, adapter factory, allowlist maps - CLI: setup wizard, gateway config, status display, tools config - tools: send_message cross-platform routing, toolsets - cron: delivery platform support - docs: QQ Bot setup guide	2026-04-14 00:11:49 -07:00
Greer Guthrie	c7e2fe655a	fix: make tool registry reads thread-safe	2026-04-13 23:52:32 -07:00
Teknium	6dc8f8e9c0	feat(skin): add warm-lightmode skin from PR #4811 Add a second light-mode skin option with warm brown/parchment tones, adapted from ygd58's contribution in PR #4811. Includes completion menu and status bar color keys for full light-terminal support. Co-authored-by: buray <78954051+ygd58@users.noreply.github.com>	2026-04-13 23:51:21 -07:00
Liu Chongwei	bc93641c4f	feat(skins): add built-in daylight skin	2026-04-13 23:51:21 -07:00
Teknium	19199cd38d	fix: clamp 'minimal' reasoning effort to 'low' on Responses API (#9429) GPT-5.4 supports none/low/medium/high/xhigh but not 'minimal'. Users may configure 'minimal' via OpenRouter conventions, which would cause a 400 on native OpenAI. Clamp to 'low' in the codex_responses path before sending.	2026-04-13 23:11:13 -07:00
Teknium	38ad158b6b	fix: auto-correct close model name matches in /model validation (#9424) * feat(skills): add fitness-nutrition skill to optional-skills Cherry-picked from PR #9177 by @haileymarshall. Adds a fitness and nutrition skill for gym-goers and health-conscious users: - Exercise search via wger API (690+ exercises, free, no auth) - Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback) - Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %) - Pure stdlib Python, no pip dependencies Changes from original PR: - Moved from skills/ to optional-skills/health/ (correct location) - Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5) - Fixed author attribution to match PR submitter - Marked USDA_API_KEY as optional (DEMO_KEY works without signup) Also adds optional env var support to the skill readiness checker: - New 'optional: true' field in required_environment_variables entries - Optional vars are preserved in metadata but don't block skill readiness - Optional vars skip the CLI capture prompt flow - Skills with only optional missing vars show as 'available' not 'setup_needed' * fix: auto-correct close model name matches in /model validation When a user types a model name with a minor typo (e.g. gpt5.3-codex instead of gpt-5.3-codex), the validation now auto-corrects to the closest match instead of accepting the wrong name with a warning. Uses difflib get_close_matches with cutoff=0.9 to avoid false corrections (e.g. gpt-5.3 should not silently become gpt-5.4). Applied consistently across all three validation paths: codex provider, custom endpoints, and generic API-probed providers. The validate_requested_model() return dict gains an optional corrected_model key that switch_model() applies before building the result. Reported by Discord user — /model gpt5.3-codex was accepted with a warning but would fail at the API level. 
--------- Co-authored-by: haileymarshall <haileymarshall@users.noreply.github.com>	2026-04-13 23:09:39 -07:00
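The effect of cutoff=0.9 in the commit above can be checked with a short sketch. The wrapper name closest_model is hypothetical; difflib.get_close_matches and the cutoff value are taken from the commit message.

```python
from difflib import get_close_matches
from typing import List, Optional


def closest_model(requested: str, known: List[str], cutoff: float = 0.9) -> Optional[str]:
    """Return the requested name if known, else a near-identical known name, else None."""
    if requested in known:
        return requested
    matches = get_close_matches(requested, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None
```

For the examples in the commit message, 'gpt5.3-codex' against 'gpt-5.3-codex' scores 24/25 = 0.96 and is corrected, while 'gpt-5.3' against 'gpt-5.4' scores 12/14 (about 0.86) and is left alone, so a one-character version difference in a short name does not trigger a silent swap.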
Teknium	d631431872	feat: prompt for display name when adding custom providers (#9420) During custom endpoint setup, users are now asked for a display name with the auto-generated name as the default. Typing 'Ollama' or 'LM Studio' replaces the generic 'Local (localhost:11434)' in the provider menu. Extracts _auto_provider_name() for reuse and adds a name= parameter to _save_custom_provider() so the caller can pass through the user-chosen label.	2026-04-13 22:41:00 -07:00
Kenny Xie	cdd44817f2	fix(anthropic): send fast mode speed via extra_body	2026-04-13 22:32:39 -07:00
Teknium	3de2b98503	fix(streaming): filter <think> blocks from gateway stream consumer Models like MiniMax emit inline <think>...</think> reasoning blocks in their content field. The CLI already suppresses these via a state machine in _stream_delta, but the gateway's GatewayStreamConsumer had no equivalent filtering — raw think blocks were streamed directly to Discord/Telegram/Slack. The fix adds a _filter_and_accumulate() method that mirrors the CLI's approach: a state machine tracks whether we're inside a reasoning block and silently discards the content. Includes the same block-boundary check (tag must appear at line start or after whitespace-only prefix) to avoid false positives when models mention <think> in prose. Handles all tag variants: <think>, <thinking>, <THINKING>, <thought>, <reasoning>, <REASONING_SCRATCHPAD>. Also handles edge cases: - Tags split across streaming deltas (partial tag buffering) - Unclosed blocks (content suppressed until stream ends) - Multiple consecutive blocks - _flush_think_buffer on stream end for held-back partial tags Adds 22 unit tests + 1 integration test covering all scenarios.	2026-04-13 22:16:20 -07:00
helix4u	e08590888a	fix: honor interrupts during MCP tool waits	2026-04-13 22:14:55 -07:00
Teknium	62fb6b2cd8	fix: guard zero context length display + add 19 tests for model info - ModelInfoCard: hide card when effective_context_length <= 0 instead of showing 'Context Window: 0 auto-detected' - Add tests for _normalize_config_for_web model_context_length extraction - Add tests for _denormalize_config_from_web round-trip (write back, remove on zero, upgrade bare string to dict, coerce string input) - Add tests for CONFIG_SCHEMA ordering (model_context_length after model) - Add tests for GET /api/model/info endpoint (dict config, bare string, empty model, capabilities, graceful error handling)	2026-04-13 22:04:35 -07:00
Gianfranco Piana	eabc0a2f66	feat(plugins): let pre_tool_call hooks block tool execution Plugins can now return {"action": "block", "message": "reason"} from their pre_tool_call hook to prevent a tool from executing. The error message is returned to the model as a tool result so it can adjust. Covers both execution paths: handle_function_call (model_tools.py) and agent-level tools (run_agent.py _invoke_tool + sequential/concurrent). Blocked tools skip all side effects (counter resets, checkpoints, callbacks, read-loop tracker). Adds skip_pre_tool_call_hook flag to avoid double-firing the hook when run_agent.py already checked and then calls handle_function_call. Salvaged from PR #5385 (gianfrancopiana) and PR #4610 (oredsecurity).	2026-04-13 22:01:49 -07:00
Teknium	5621fc449a	chore: rename AI Gateway → Vercel AI Gateway, move Xiaomi to #5 (#9326) - Rename 'AI Gateway' to 'Vercel AI Gateway' across auth, models, doctor, setup, and tests. - Move Xiaomi MiMo to position #5 in the provider picker.	2026-04-13 19:51:54 -07:00
Brooklyn Nicholson	6bbac046a7	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-13 21:46:11 -05:00
Teknium	0cc7f79016	fix(streaming): prevent duplicate Telegram replies when stream task is cancelled (#9319 ) When the 5-second stream_task timeout in gateway/run.py expires (due to slow Telegram API calls from rate limiting after several messages), the stream consumer is cancelled via asyncio.CancelledError. The CancelledError handler did a best-effort final edit but never set final_response_sent, so the gateway fell through to the normal send path and delivered the full response again as a reply — causing a duplicate. The fix: in the CancelledError handler, set final_response_sent = True when already_sent is True (i.e., the stream consumer had already delivered content to the user). This tells the gateway's already_sent check that the response was delivered, preventing the duplicate send. Adds two tests verifying the cancellation behavior: - Cancelled with already_sent=True → final_response_sent=True (no dup) - Cancelled with already_sent=False → final_response_sent=False (normal send path proceeds) Reported by community user hume on Discord.	2026-04-13 19:22:43 -07:00
Brooklyn Nicholson	1b573b7b21	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-13 21:17:41 -05:00
Teknium	f324222b79	fix: add vLLM/local server error patterns + MCP initial connection retry (#9281 ) Port two improvements inspired by Kilo-Org/kilocode analysis: 1. Error classifier: add context overflow patterns for vLLM, Ollama, and llama.cpp/llama-server. These local inference servers return different error formats than cloud providers (e.g., 'exceeds the max_model_len', 'context length exceeded', 'slot context'). Without these patterns, context overflow errors from local servers are misclassified as format errors, causing infinite retries instead of triggering compression. 2. MCP initial connection retry: previously, if the very first connection attempt to an MCP server failed (e.g., transient DNS blip at startup), the server was permanently marked as failed with no retry. Post-connect reconnection had 5 retries with exponential backoff, but initial connection had zero. Now initial connections retry up to 3 times with backoff before giving up, matching the resilience of post-connect reconnection. (Inspired by Kilo Code's MCP server disappearing fix in v1.3.3) Tests: 6 new error classifier tests, 4 new MCP retry tests, 1 updated existing test. All 276 affected tests pass.	2026-04-13 18:46:14 -07:00
arthurbr11	0a4cf5b3e1	feat(providers): add Arcee AI as direct API provider Adds Arcee AI as a standard direct provider (ARCEEAI_API_KEY) with Trinity models: trinity-large-thinking, trinity-large-preview, trinity-mini. Standard OpenAI-compatible provider checklist: auth.py, config.py, models.py, main.py, providers.py, doctor.py, model_normalize.py, model_metadata.py, setup.py, trajectory_compressor.py. Based on PR #9274 by arthurbr11, simplified to a standard direct provider without dual-endpoint OpenRouter routing.	2026-04-13 18:40:06 -07:00
Teknium	ac80bd61ad	test: add regression tests for custom_providers multi-model dedup and grouping Tests for salvaged PRs #9233 and #8011.	2026-04-13 16:41:30 -07:00
Brooklyn Nicholson	7e4dd6ea02	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-13 18:32:13 -05:00
Teknium	32cea0c08d	fix: dashboard shows Nous Portal as 'not connected' despite active auth (#9261 ) The dashboard device-code flow (_nous_poller in web_server.py) saved credentials to the credential pool only, while get_nous_auth_status() only checked the auth store (auth.json). This caused the Keys tab to show 'not connected' even when the backend was fully authenticated. Two fixes: 1. get_nous_auth_status() now checks the credential pool first (like get_codex_auth_status() already does), then falls back to the auth store. 2. _nous_poller now also persists to the auth store after saving to the credential pool, matching the CLI flow (_login_nous). Adds 3 tests covering pool-only, auth-store-fallback, and empty-state scenarios.	2026-04-13 16:32:11 -07:00
Teknium	8d023e43ed	refactor: remove dead code — 1,784 lines across 77 files (#9180 ) Deep scan with vulture, pyflakes, and manual cross-referencing identified: - 41 dead functions/methods (zero callers in production) - 7 production-dead functions (only test callers, tests deleted) - 5 dead constants/variables - ~35 unused imports across agent/, hermes_cli/, tools/, gateway/ Categories of dead code removed: - Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection, rebuild_lookups, clear_session_context, get_logs_dir, clear_session - Unused API surface: search_models_dev, get_pricing, skills_categories, get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list - Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob - Stale debug helpers: get_debug_session_info copies in 4 tool files (centralized version in debug_helpers.py already exists) - Dead gateway methods: send_emote, send_notice (matrix), send_reaction (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history (matrix), _start_typing_indicator (signal), parse_feishu_post_content - Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION, FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR - Unused UI code: _interactive_provider_selection, _interactive_model_selection (superseded by prompt_toolkit picker) Test suite verified: 609 tests covering affected files all pass. Tests for removed functions deleted. Tests using removed utilities (clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.	2026-04-13 16:32:04 -07:00
helix4u	f94f53cc22	fix(matrix): disable streaming cursor decoration on Matrix	2026-04-13 16:31:02 -07:00
helix4u	0ffb6f2dae	fix(matrix): skip cursor-only stream placeholder messages	2026-04-13 16:31:02 -07:00
Brooklyn Nicholson	aeb53131f3	fix(ui-tui): harden TUI error handling, model validation, command UX parity, and gateway lifecycle	2026-04-13 18:29:24 -05:00
helix4u	8680f61f8b	fix(copilot-acp): keep acp runtime off responses path	2026-04-13 16:17:43 -07:00
Teknium	063244bb16	test: add coverage for plugin context engine init (#9071 ) Verify that plugin context engines receive update_model() with correct context_length during AIAgent init — regression test for the ctx -- bug.	2026-04-13 15:00:57 -07:00
Teknium	952a885fbf	fix(gateway): /stop no longer resets the session (#9224 ) /stop was calling suspend_session() which marked the session for auto-reset on the next message. This meant users lost their conversation history every time they stopped a running agent — especially painful for untitled sessions that can't be resumed by name. Now /stop just interrupts the agent and cleans the session lock. The session stays intact so users can continue the conversation. The suspend behavior was introduced in #7536 to break stuck session resume loops on gateway restart. That case is already handled by suspend_recently_active() which runs at gateway startup, so removing it from /stop doesn't regress the original fix.	2026-04-13 14:59:05 -07:00
Brooklyn Nicholson	ebe3270430	fix: fake models	2026-04-13 14:57:42 -05:00
yongtenglei	2773b18b56	fix(run_agent): refresh activity during streaming responses Previously, long-running streamed responses could be incorrectly treated as idle by the gateway/cron inactivity timeout even while tokens were actively arriving. The _touch_activity() call (which feeds get_activity_summary() polled by the external timeout) was either called only on the first chunk (chat completions) or not at all (Anthropic, Codex, Codex fallback). Add _touch_activity() on every chunk/event in all four streaming paths so the inactivity monitor knows data is still flowing. Fixes #8760	2026-04-13 10:55:51 -07:00
墨綠BG	c449cd1af5	fix(config): restore custom providers after v11→v12 migration The v11→v12 migration converts custom_providers (list) into providers (dict), then deletes the list. But all runtime resolvers read from custom_providers — after migration, named custom endpoints silently stop resolving and fallback chains fail with AuthError. Add get_compatible_custom_providers() that reads from both config schemas (legacy custom_providers list + v12+ providers dict), normalizes entries, deduplicates, and returns a unified list. Update ALL consumers: - hermes_cli/runtime_provider.py: _get_named_custom_provider() + key_env - hermes_cli/auth_commands.py: credential pool provider names - hermes_cli/main.py: model picker + _model_flow_named_custom() - agent/auxiliary_client.py: key_env + custom_entry model fallback - agent/credential_pool.py: _iter_custom_providers() - cli.py + gateway/run.py: /model switch custom_providers passthrough - run_agent.py + gateway/run.py: per-model context_length lookup Also: use config.pop() instead of del for safer migration, fix stale _config_version assertions in tests, add pool mock to codex test. Co-authored-by: 墨綠BG <s5460703@gmail.com> Closes #8776, salvaged from PR #8814	2026-04-13 10:50:52 -07:00
Teknium	0dd26c9495	fix(tests): fix 78 CI test failures and remove dead test (#9036 ) Production fixes: - voice_mode.py: add is_recording property to AudioRecorder (parity with TermuxAudioRecorder) - cronjob_tools.py: add sms example to deliver description Test fixes: - test_real_interrupt_subagent: add missing _execution_thread_id (fixes 19 cascading failures from leaked _build_system_prompt patch) - test_anthropic_error_handling: add _FakeMessages, override _interruptible_streaming_api_call (6 fixes) - test_ctx_halving_fix: add missing request_overrides attribute (4 fixes) - test_context_token_tracking: set _disable_streaming=True for non-streaming test path (4 fixes) - test_dict_tool_call_args: set _disable_streaming=True (1 fix) - test_provider_parity: add model='gpt-4o' for AIGateway tests to meet 64K minimum context (4 fixes) - test_session_race_guard: add user_id to SessionSource (5 fixes) - test_restart_drain/helpers: add user_id to SessionSource (2 fixes) - test_telegram_photo_interrupts: add user_id to SessionSource - test_interrupt: target thread_id for per-thread interrupt system (2 fixes) - test_zombie_process_cleanup: rewrite with object.__new__ for refactored GatewayRunner.stop() (1 fix) - test_browser_camofox_state: update config version 15->17 (1 fix) - test_trajectory_compressor_async: widen lookback window 10->20 for line-shifted AsyncOpenAI (1 fix) - test_voice_mode: fixed by production is_recording addition (5 fixes) - test_voice_cli_integration: add _attached_images to CLI stub (2 fixes) - test_hermes_logging: explicit propagation/level reset for cross-test pollution defense (1 fix) - test_run_agent: add base_url for OpenRouter detection tests (2 fixes) Deleted: - test_inline_think_blocks_reasoning_only_accepted: tested unimplemented inline <think> handling	2026-04-13 10:50:24 -07:00
kimsr96	b909a9efef	fix: extend ASCII-locale UnicodeEncodeError recovery to full request payload The existing ASCII codec handler only sanitized conversation messages, leaving tool schemas, system prompts, ephemeral prompts, prefill messages, and HTTP headers as unhandled sources of non-ASCII content. On systems with LANG=C or non-UTF-8 locale, Unicode symbols in tool descriptions (e.g. arrows, em-dashes from prompt_builder) and system prompt content would cause UnicodeEncodeError that fell through to the error path. Changes: - Add _sanitize_structure_non_ascii() generic recursive walker for nested dict/list payloads - Add _sanitize_tools_non_ascii() thin wrapper for tool schemas - Add _force_ascii_payload flag: once ASCII locale is detected, all subsequent API calls get proactively sanitized (prevents recurring failures from new tool results bringing fresh Unicode each turn) - Extend the ASCII codec error handler to sanitize: prefill_messages, tool schemas (self.tools), system prompt, ephemeral system prompt, and default HTTP headers - Update stale comment that acknowledged the gap Cherry-picked from PR #8834 (credential pool changes dropped as separate concern).	2026-04-13 05:16:35 -07:00
Geoff	76eecf3819	fix(model): Support providers: dict for custom endpoints in /model Two fixes for user-defined providers in config.yaml: 1. list_authenticated_providers() - now includes full models list from providers.*.models array, not just default_model. This fixes /model showing only one model when multiple are configured. 2. _get_named_custom_provider() - now checks providers: dict (new-style) in addition to custom_providers: list (legacy). This fixes credential resolution errors when switching models via /model command. Both changes are backwards compatible with existing custom_providers list format. Fixes: Only one model appears for custom providers in /model selection	2026-04-13 05:16:21 -07:00
konsisumer	311dac1971	fix(file_tools): block /private/etc writes on macOS symlink bypass On macOS, /etc is a symlink to /private/etc, so os.path.realpath() resolves /etc/hosts to /private/etc/hosts. The sensitive path check only matched /etc/ prefixes against the resolved path, allowing writes to system files on macOS. - Add /private/etc/ and /private/var/ to _SENSITIVE_PATH_PREFIXES - Check both realpath-resolved and normpath-normalized paths - Add regression tests for macOS symlink bypass Closes #8734 Co-authored-by: ElhamDevelopmentStudio (PR #8829)	2026-04-13 05:15:05 -07:00
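The check described in 311dac1971 can be sketched roughly as follows. The prefix tuple and the function name are illustrative assumptions; only the `/private/etc/` and `/private/var/` entries and the idea of testing both the normalized and the resolved path come from the commit message.

```python
import os

# Illustrative prefix list, not the project's _SENSITIVE_PATH_PREFIXES.
SENSITIVE_PREFIXES = ("/etc/", "/private/etc/", "/private/var/")


def is_sensitive_path(path: str) -> bool:
    """Return True if the path is sensitive either as written or after symlink resolution."""
    candidates = {
        os.path.normpath(os.path.abspath(path)),  # as written, symlinks not followed
        os.path.realpath(path),                   # after following symlinks
    }
    return any(c.startswith(SENSITIVE_PREFIXES) for c in candidates)
```

Checking both forms means `/etc/hosts` is caught whether or not `/etc` resolves to `/private/etc` on the host.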
Teknium	587eeb56b9	chore: remove duplicate dead _try_gh_cli_token / _gh_cli_candidates from auth.py These functions were duplicated between auth.py and copilot_auth.py. The auth.py copies had zero production callers — only copilot_auth.py's versions are used. Redirect the test import to the live copy and update monkeypatch targets accordingly.	2026-04-13 05:12:36 -07:00
luyao618	8ec1608642	fix(agent): propagate api_mode to vision provider resolution resolve_vision_provider_client() computed resolved_api_mode from config but never passed it to downstream resolve_provider_client() or _get_cached_client() calls, causing custom providers with api_mode: anthropic_messages to crash when used for vision tasks. Also remove the for_vision special case in _normalize_aux_provider() that incorrectly discarded named custom provider identifiers. Fixes #8857 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-13 05:02:54 -07:00
Teknium	e3ffe5b75f	fix: remove legacy compression.summary_* config and env var fallbacks (#8992 ) Remove the backward-compat code paths that read compression provider/model settings from legacy config keys and env vars, which caused silent failures when auto-detection resolved to incompatible backends. What changed: - Remove compression.summary_model, summary_provider, summary_base_url from DEFAULT_CONFIG and cli.py defaults - Remove backward-compat block in _resolve_task_provider_model() that read from the legacy compression section - Remove _get_auxiliary_provider() and _get_auxiliary_env_override() helper functions (AUXILIARY_/CONTEXT_ env var readers) - Remove env var fallback chain for per-task overrides - Update hermes config show to read from auxiliary.compression - Add config migration (v16→17) that moves non-empty legacy values to auxiliary.compression and strips the old keys - Update example config and openclaw migration script - Remove/update tests for deleted code paths Compression model/provider is now configured exclusively via: auxiliary.compression.provider / auxiliary.compression.model Closes #8923	2026-04-13 04:59:26 -07:00
WorldInnovationsDepartment	c1809e85e7	fix(gateway): handle stale lock files in acquire_scoped_lock Updated the acquire_scoped_lock function to treat empty or corrupt lock files as stale. This change ensures that if a lock file exists but is invalid, it will be removed to prevent issues with stale locks. Added tests to verify recovery from both empty and corrupt lock files.	2026-04-13 04:59:25 -07:00
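A rough sketch of the stale-lock recovery idea in c1809e85e7. It assumes, purely for illustration, that a valid lock file is a JSON object with a `pid` field; the actual lock format and the helper name `clear_if_stale` are not taken from the source.

```python
import json
import os


def clear_if_stale(lock_path: str) -> bool:
    """Remove lock_path if it is empty or not a well-formed lock; return True if removed."""
    try:
        with open(lock_path, "r", encoding="utf-8") as fh:
            raw = fh.read()
    except FileNotFoundError:
        return False  # nothing to clean up

    try:
        payload = json.loads(raw) if raw.strip() else None
    except json.JSONDecodeError:
        payload = None  # corrupt contents are treated like an empty file

    # Assumed format: a JSON object carrying the owner's pid.
    if isinstance(payload, dict) and "pid" in payload:
        return False  # well-formed lock; leave it for the normal liveness check

    os.remove(lock_path)
    return True
```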
Teknium	a5bd56eae3	fix: eliminate provider hang dead zones in retry/timeout architecture (#8985 ) Three targeted changes to close the gaps between retry layers that caused users to experience 'No response from provider for 580s' and 'No activity for 15 minutes' despite having 5 layers of retry: 1. Remove non-streaming fallback from streaming path Previously, when all 3 stream retries exhausted, the code fell back to _interruptible_api_call() which had no stale detection and no activity tracking — a black hole that could hang for up to 1800s. Now errors propagate to the main retry loop which has richer recovery (credential rotation, provider fallback, backoff). For 'stream not supported' errors, sets _disable_streaming flag so the main retry loop automatically switches to non-streaming on the next attempt. 2. Add _touch_activity to recovery dead zones The gateway inactivity monitor relies on _touch_activity() to know the agent is alive, but activity was never touched during: - Stale stream detection/kill cycles (180-300s gaps) - Stream retry connection rebuilds - Main retry backoff sleeps (up to 120s) - Error recovery classification Now all these paths touch activity every ~30s, keeping the gateway informed during recovery cycles. 3. Add stale-call detector to non-streaming path _interruptible_api_call() now has the same stale detection pattern as the streaming path: kills hung connections after 300s (default, configurable via HERMES_API_CALL_STALE_TIMEOUT), scaled for large contexts (450s for 50K+ tokens, 600s for 100K+ tokens), disabled for local providers. Also touches activity every ~30s during the wait so the gateway monitor stays informed. 
Env vars: - HERMES_API_CALL_STALE_TIMEOUT: non-streaming stale timeout (default 300s) - HERMES_STREAM_STALE_TIMEOUT: unchanged (default 180s) Before: worst case ~2+ hours of sequential retries with no feedback After: worst case bounded by gateway inactivity timeout (default 1800s) with continuous activity reporting	2026-04-13 04:55:20 -07:00
Teknium	acdff020b7	test: add multi-word query tests for truncation match strategy Tests phrase matching, proximity co-occurrence, and sliding window coverage maximisation — the three new tiers from the truncation fix.	2026-04-13 04:54:42 -07:00
landy	dbed40f39b	fix: reopen resumed gateway sessions in sqlite	2026-04-13 04:54:07 -07:00
twilwa	3a64348772	fix(discord): voice session continuity and signal handler thread safety - Store source metadata on /voice channel join so voice input shares the same session as the linked text channel conversation - Treat voice-linked text channels as free-response (skip @mention and auto-thread) while voice is active - Scope the voice-linked exemption to the exact bound channel, not sibling threads - Guard signal handler registration in start_gateway() for non-main threads (prevents RuntimeError when gateway runs in a daemon thread) - Clean up _voice_sources on leave_voice_channel Salvaged from PR #3475 by twilwa (Modal runtime portions excluded).	2026-04-13 04:49:21 -07:00
Teknium	381810ad50	feat: fix SQLite safety in hermes backup + add --quick snapshots + /snapshot command (#8971 ) Three changes consolidated into the existing backup system: 1. Fix: hermes backup now uses sqlite3.Connection.backup() for .db files instead of raw file copy. Raw copy of a WAL-mode database can produce a corrupted backup — the backup() API handles this correctly. 2. hermes backup --quick: fast snapshot of just critical state files (config.yaml, state.db, .env, auth.json, cron/jobs.json, etc.) stored in ~/.hermes/state-snapshots/. Auto-prunes to 20 snapshots. 3. /snapshot slash command (alias /snap): in-session interface for quick state snapshots. create/list/restore/prune subcommands. Restore by ID or number. Powered by the same backup module. No new modules — everything lives in hermes_cli/backup.py alongside the existing full backup/import code. No hooks in run_agent.py — purely on-demand, zero runtime overhead. Closes the use case from PRs #8406 and #7813 with ~200 lines of new logic instead of a 1090-line content-addressed storage engine.	2026-04-13 04:46:13 -07:00
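For reference, the standard-library call that 381810ad50 switches to looks roughly like this. The wrapper name `backup_sqlite` is an assumption for illustration; `sqlite3.Connection.backup()` itself is the real API.

```python
import sqlite3


def backup_sqlite(src_path: str, dest_path: str) -> None:
    """Copy a SQLite database with the online backup API instead of a raw file copy."""
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Produces a consistent snapshot, including data still held in the WAL.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
```

A raw copy of the `.db` file alone can miss changes that live in the `-wal` sidecar file, which is why the backup API is the safer choice for WAL-mode databases.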
Teknium	8dfee98d06	fix: clean up description escaping, add string-data tests Follow-up for cherry-picked PR #8918.	2026-04-13 04:45:07 -07:00
Teknium	cea34dc7ef	fix: follow-up for salvaged PR #8939 - Move test file to tests/hermes_cli/ (consistent with test layout) - Remove unused imports (os, pytest) from test file - Update _sanitize_env_lines docstring: now used on read + write paths	2026-04-13 04:35:37 -07:00
Mil Wang (from Dev Box)	e469f3f3db	fix: sanitize .env before loading to prevent token duplication (#8908 ) When .env files become corrupted (e.g. concatenated KEY=VALUE pairs on a single line due to concurrent writes or encoding issues), both python-dotenv and load_env() would parse the entire concatenated string as a single value. This caused bot tokens to appear duplicated up to 8×, triggering InvalidToken errors from the Telegram API. Root cause: _sanitize_env_lines() — which correctly splits concatenated lines — was only called during save_env_value() writes, not during reads. Fix: - load_env() now calls _sanitize_env_lines() before parsing - env_loader.load_hermes_dotenv() sanitizes the .env file on disk before python-dotenv reads it, so os.getenv() also returns clean values - Added tests reproducing the exact corruption pattern from #8908 Closes #8908	2026-04-13 04:35:37 -07:00
ismell0992-afk	e77f135ed8	fix(cli): narrow Nous Hermes non-agentic warning to actual hermes-3/-4 models The startup warning that Nous Research Hermes 3 & 4 models are not agentic fired on any model whose name contained "hermes" anywhere, via a plain substring check. That false-positived on unrelated local Modelfiles such as `hermes-brain:qwen3-14b-ctx16k` — a tool-capable Qwen3 wrapper that happens to live under a custom "hermes" tag namespace — making the warning noise for legitimate setups. Replace the substring check with a narrow regex anchored on `^`, `/`, or `:` boundaries that only matches the real Hermes-3 / Hermes-4 chat family (e.g. `NousResearch/Hermes-3-Llama-3.1-70B`, `hermes-4-405b`, `openrouter/hermes3:70b`). Consolidate into a single helper `is_nous_hermes_non_agentic()` in `hermes_cli.model_switch` so the CLI and the canonical check don't drift, and route the duplicate inline site in `cli.HermesCLI._print_warnings()` through the helper. Add a parametrized test covering positive matches (real Hermes-3/-4 names) and a broad set of negatives (custom Modelfiles, Qwen/Claude/GPT, older Nous-Hermes-2 families, bare "hermes", empty string, and the "brain-hermes-3-impostor" boundary case).	2026-04-13 04:33:52 -07:00
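One way to express the boundary-anchored match from e77f135ed8 is sketched below. The pattern and the function name are illustrative assumptions, not the project's actual regex or its `is_nous_hermes_non_agentic()` helper; the example names are the ones quoted in the commit message.

```python
import re

# Illustrative pattern: "hermes" must start the name or follow "/" or ":",
# and must be followed by a 3 or 4 that ends a name segment.
_HERMES_3_4 = re.compile(r"(?:^|[/:])hermes[-_ ]?[34](?:[-_.:]|$)", re.IGNORECASE)


def looks_like_nous_hermes_3_or_4(model_name: str) -> bool:
    """Return True only for names that look like the Hermes-3 / Hermes-4 family."""
    return bool(_HERMES_3_4.search(model_name or ""))
```

Anchoring on `^`, `/`, or `:` is what keeps a name like `brain-hermes-3-impostor` from matching, since its `hermes` follows a hyphen.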
ismell0992-afk	3e99964789	fix(agent): prefer Ollama Modelfile num_ctx over GGUF training max _query_local_context_length was checking model_info.context_length (the GGUF training max) before num_ctx (the Modelfile runtime override), inverse to query_ollama_num_ctx. The two helpers therefore disagreed on the same model: hermes-brain:qwen3-14b-ctx32k # Modelfile: num_ctx 32768 underlying qwen3:14b GGUF # qwen3.context_length: 40960 query_ollama_num_ctx correctly returned 32768 (the value Ollama will actually allocate KV cache for). _query_local_context_length returned 40960, which let ContextCompressor grow conversations past 32768 before triggering compression — at which point Ollama silently truncated the prefix, corrupting context. Swap the order so num_ctx is checked first, matching query_ollama_num_ctx. Adds a parametrized test that seeds both values and asserts num_ctx wins. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-13 04:24:07 -07:00
Teknium	397eae5d93	fix: recover partial streamed content on connection failure When streaming fails after partial content delivery (e.g. OpenRouter timeout kills connection mid-response), the stub response now carries the accumulated streamed text instead of content=None. Two fixes: 1. The partial-stream stub response includes recovered content from _current_streamed_assistant_text — the text that was already delivered to the user via stream callbacks before the connection died. 2. The empty response recovery chain now checks for partial stream content BEFORE falling back to _last_content_with_tools (prior turn content) or wasting API calls on retries. This prevents: - Showing wrong content from a prior turn - Burning 3+ unnecessary retry API calls - Falling through to '(empty)' when the user already saw content The root cause: OpenRouter has a ~125s inactivity timeout. When Anthropic's SSE stream goes silent during extended reasoning, the proxy kills the connection. The model's text was already partially streamed but the stub discarded it, triggering the empty recovery chain which would show stale prior-turn content or waste retries.	2026-04-13 02:12:01 -07:00
Teknium	276d20e62c	fix(gateway): /restart uses service restart under systemd instead of detached subprocess The detached bash subprocess spawned by /restart gets killed by systemd's KillMode=mixed cgroup cleanup, leaving the gateway dead. Under systemd (detected via INVOCATION_ID env var), /restart now uses via_service=True which exits with code 75 — RestartForceExitStatus=75 in the unit file makes systemd auto-restart the service. The detached subprocess approach is preserved as fallback for non-systemd environments (Docker, tmux, foreground mode).	2026-04-12 22:32:19 -07:00
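The decision described in 276d20e62c can be sketched as below. The `INVOCATION_ID` check and exit code 75 come from the commit message; the function names and the returned strategy labels are hypothetical.

```python
import os
from typing import Mapping, Optional

RESTART_EXIT_CODE = 75  # paired with RestartForceExitStatus=75 in the unit file


def running_under_systemd(env: Optional[Mapping[str, str]] = None) -> bool:
    """systemd sets INVOCATION_ID for the services it starts."""
    env = os.environ if env is None else env
    return bool(env.get("INVOCATION_ID"))


def restart_strategy(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick how /restart should relaunch the gateway (labels are illustrative)."""
    if running_under_systemd(env):
        return "exit-with-75"  # let systemd restart the service
    return "detached-subprocess"  # fallback for Docker, tmux, foreground mode
```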
Teknium	e2a9b5369f	feat: web UI dashboard for managing Hermes Agent (#8756 ) * feat: web UI dashboard for managing Hermes Agent (salvage of #8204/#7621) Adds an embedded web UI dashboard accessible via `hermes web`: - Status page: agent version, active sessions, gateway status, connected platforms - Config editor: schema-driven form with tabbed categories, import/export, reset - API Keys page: set, clear, and view redacted values with category grouping - Sessions, Skills, Cron, Logs, and Analytics pages Backend: - hermes_cli/web_server.py: FastAPI server with REST endpoints - hermes_cli/config.py: reload_env() utility for hot-reloading .env - hermes_cli/main.py: `hermes web` subcommand (--port, --host, --no-open) - cli.py / commands.py: /reload slash command for .env hot-reload - pyproject.toml: [web] optional dependency extra (fastapi + uvicorn) - Both update paths (git + zip) auto-build web frontend when npm available Frontend: - Vite + React + TypeScript + Tailwind v4 SPA in web/ - shadcn/ui-style components, Nous design language - Auto-refresh status page, toast notifications, masked password inputs Security: - Path traversal guard (resolve().is_relative_to()) on SPA file serving - CORS localhost-only via allow_origin_regex - Generic error messages (no internal leak), SessionDB handles closed properly Tests: 47 tests covering reload_env, redact_key, API endpoints, schema generation, path traversal, category merging, internal key stripping, and full config round-trip. Original work by @austinpickett (PR #1813), salvaged by @kshitijk4poor (PR #7621 → #8204), re-salvaged onto current main with stale-branch regressions removed. 
* fix(web): clean up status page cards, always rebuild on `hermes web` - Remove config version migration alert banner from status page - Remove config version card (internal noise, not surfaced in TUI) - Reorder status cards: Agent → Gateway → Active Sessions (3-col grid) - `hermes web` now always rebuilds from source before serving, preventing stale web_dist when editing frontend files * feat(web): full-text search across session messages - Add GET /api/sessions/search endpoint backed by FTS5 - Auto-append prefix wildcards so partial words match (e.g. 'nimb' → 'nimby') - Debounced search (300ms) with spinner in the search icon slot - Search results show FTS5 snippets with highlighted match delimiters - Expanding a search hit auto-scrolls to the first matching message - Matching messages get a warning ring + 'match' badge - Inline term highlighting within Markdown (text, bold, italic, headings, lists) - Clear button (x) on search input for quick reset --------- Co-authored-by: emozilla <emozilla@nousresearch.com>	2026-04-12 22:26:28 -07:00
Dusk1e	c052cf0eea	fix(security): validate domain/service params in ha_call_service to prevent path traversal	2026-04-12 22:26:15 -07:00
Teknium	8a64f3e368	feat(gateway): notify /restart requester when gateway comes back online When a user sends /restart, the gateway now persists their routing info (platform, chat_id, thread_id) to .restart_notify.json. After the new gateway process starts and adapters connect, it reads the file, sends a 'Gateway restarted successfully' message to that specific chat, and cleans up the file. This follows the same pattern as _send_update_notification (used by /update). Thread IDs are preserved so the notification lands in the correct Telegram topic or Discord thread. Previously, after /restart the user had no feedback that the gateway was back — they had to send a message to find out. Now they get a proactive notification and know their session continues.	2026-04-12 22:23:48 -07:00
Teknium	83ca0844f7	fix: preserve dots in model names for OpenCode Zen and ZAI providers (#8794 ) OpenCode Zen was in _DOT_TO_HYPHEN_PROVIDERS, causing all dotted model names (minimax-m2.5-free, gpt-5.4, glm-5.1) to be mangled. The fix: Layer 1 (model_normalize.py): Remove opencode-zen from the blanket dot-to-hyphen set. Add an explicit block that preserves dots for non-Claude models while keeping Claude hyphenated (Zen's Claude endpoint uses anthropic_messages mode which expects hyphens). Layer 2 (run_agent.py _anthropic_preserve_dots): Add opencode-zen and zai to the provider allowlist. Broaden URL check from opencode.ai/zen/go to opencode.ai/zen/ to cover both Go and Zen endpoints. Add bigmodel.cn for ZAI URL detection. Also adds glm-5.1 to ZAI model lists in models.py and setup.py. Closes #7710 Salvaged from contributions by: - konsisumer (PR #7739, #7719) - DomGrieco (PR #8708) - Esashiero (PR #7296) - sharziki (PR #7497) - XiaoYingGee (PR #8750) - APTX4869-maker (PR #8752) - kagura-agent (PR #7157)	2026-04-12 21:22:59 -07:00
Teknium	a0cd2c5338	fix(gateway): verbose tool progress no longer truncates args when tool_preview_length is 0 (#8735) When tool_preview_length is 0 (default for platforms without a tier default, like Session), verbose mode was truncating args JSON to 200 characters. Since the user explicitly opted into verbose mode, they expect full tool call detail; the 200-char cap defeated the purpose. Now: tool_preview_length=0 means no truncation in verbose mode. Positive values still cap as before. Platform message-length limits handle overflow naturally.	2026-04-12 20:05:12 -07:00
Teknium	15b1a3aa69	fix: improve WhatsApp UX: chunking, formatting, streaming (#8723) Three changes that address the poor WhatsApp experience reported by users: 1. Reclassify WhatsApp from TIER_LOW to TIER_MEDIUM in display_config.py; this enables streaming and tool progress via the existing Baileys /edit bridge endpoint. Users now see progressive responses instead of minutes of silence followed by a wall of text. 2. Lower MAX_MESSAGE_LENGTH from 65536 to 4096 and add proper chunking: send() now calls format_message() and truncate_message() before sending, then loops through chunks with a small delay between them. The base class truncate_message() already handles code block boundary detection (closes/reopens fences at chunk boundaries). reply_to is only set on the first chunk. 3. Override format_message() with WhatsApp-specific markdown conversion: converts `**bold**` to `*bold*`, `~~strike~~` to `~strike~`, headers to bold text, and `[links](url)` to text (url). Code blocks and inline code are protected from conversion via placeholder substitution. Together these fix the two user complaints: - 'sends the whole code all the time' → now chunked at 4K with proper formatting - 'terminal gets interrupted and gets cooked' → streaming + tool progress give visual feedback so users don't accidentally interrupt with follow-up messages	2026-04-12 19:20:13 -07:00
Teknium	5fae356a85	fix: show full last assistant response when resuming a session (#8724 ) When resuming a session with --resume or -c, the last assistant response was truncated to 200 chars / 3 lines just like older messages in the recap. This forced users to waste tokens re-asking for the response. Now the last assistant message in the recap is shown in full with non-dim styling, so users can see exactly where they left off. Earlier messages remain truncated for compact display. Changes: - Track un-truncated text for the last assistant entry during collection - Replace last entry with full text after history trimming - Render last assistant entry with bold (non-dim) styling - Update existing truncation tests to use multi-message histories - Add new tests for full last response display (char + multiline)	2026-04-12 19:07:14 -07:00
Teknium	9e992df8ae	fix(telegram): use UTF-16 code units for message length splitting (#8725 ) Port from nearai/ironclaw#2304: Telegram's 4096 character limit is measured in UTF-16 code units, not Unicode codepoints. Characters outside the Basic Multilingual Plane (emoji like 😀, CJK Extension B, musical symbols) are surrogate pairs: 1 Python char but 2 UTF-16 units. Previously, truncate_message() used Python's len() which counts codepoints. This could produce chunks exceeding Telegram's actual limit when messages contain many astral-plane characters. Changes: - Add utf16_len() helper and _prefix_within_utf16_limit() for UTF-16-aware string measurement and truncation - Add _custom_unit_to_cp() binary-search helper that maps a custom-unit budget to the largest safe codepoint slice position - Update truncate_message() to accept optional len_fn parameter - Telegram adapter now passes len_fn=utf16_len when splitting messages - Fix fallback truncation in Telegram error handler to use _prefix_within_utf16_limit instead of codepoint slicing - Update send_message_tool.py to use utf16_len for Telegram platform - Add comprehensive tests: utf16_len, _prefix_within_utf16_limit, truncate_message with len_fn (emoji splitting, content preservation, code block handling) - Update mock lambdas in reply_mode tests to accept **kw for len_fn	2026-04-12 19:06:20 -07:00
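The UTF-16 measurement described in the commit above can be sketched as follows. The names `utf16_len` and `prefix_within_utf16_limit` echo the commit message, but the bodies are an illustrative assumption and not the repository's code (the message says the real helper uses a binary search; this sketch uses a plain linear scan).

```python
def utf16_len(text: str) -> int:
    """Count UTF-16 code units: characters above U+FFFF take two units."""
    return sum(2 if ord(ch) > 0xFFFF else 1 for ch in text)


def prefix_within_utf16_limit(text: str, limit: int) -> str:
    """Return the longest prefix of `text` that fits in `limit` UTF-16 units.

    Hypothetical helper: a linear scan that never splits a surrogate pair,
    because it only ever cuts between whole Python characters.
    """
    used = 0
    for index, ch in enumerate(text):
        units = 2 if ord(ch) > 0xFFFF else 1
        if used + units > limit:
            return text[:index]
        used += units
    return text


if __name__ == "__main__":
    sample = "a\U0001F600"  # one ASCII letter plus one emoji
    print(len(sample), utf16_len(sample))
```

Python's `len()` reports 2 for the sample string while the UTF-16 count is 3, which is the gap that could push a chunk past Telegram's limit.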
Teknium	f724079d3b	fix(gateway): reject known-weak placeholder credentials at startup Port from openclaw/openclaw#64586: users who copy .env.example without changing placeholder values now get a clear error at startup instead of a confusing auth failure from the platform API. Also rejects placeholder API_SERVER_KEY when binding to a network-accessible address. Cherry-picked from PR #8677.	2026-04-12 18:05:41 -07:00
Teknium	c7d8d109ff	fix(matrix): trust m.mentions.user_ids as authoritative mention signal Port from openclaw/openclaw#64796: Per MSC3952 / Matrix v1.7, the m.mentions.user_ids field is the authoritative mention signal. Clients that populate m.mentions but don't duplicate @bot in the body text were being silently dropped when MATRIX_REQUIRE_MENTION=true. Cherry-picked from PR #8673.	2026-04-12 18:05:41 -07:00
Teknium	88a12af58c	feat: add `hermes debug share` — upload debug report to pastebin (#8681 ) * feat: add `hermes debug share` — upload debug report to pastebin Adds a new `hermes debug share` command that collects system info (via hermes dump), recent logs (agent.log, errors.log, gateway.log), and uploads the combined report to a paste service (paste.rs primary, dpaste.com fallback). Returns a shareable URL for support. Options: --lines N Number of log lines per file (default: 200) --expire N Paste expiry in days (default: 7, dpaste.com only) --local Print report locally without uploading Files: hermes_cli/debug.py - New module: paste upload + report collection hermes_cli/main.py - Wire cmd_debug + argparse subparser tests/hermes_cli/test_debug.py - 19 tests covering upload, collection, CLI * feat: upload full agent.log and gateway.log as separate pastes hermes debug share now uploads up to 3 pastes: 1. Summary report (system info + log tails) — always 2. Full agent.log (last ~500KB) — if file exists 3. Full gateway.log (last ~500KB) — if file exists Each paste uploads independently; log upload failures are noted but don't block the main report. Output shows all links aligned: Report https://paste.rs/abc agent.log https://paste.rs/def gateway.log https://paste.rs/ghi Also adds _read_full_log() with size-capped tail reading to stay within paste service limits (~512KB per file). * feat: prepend hermes dump to each log paste for self-contained context Each paste (agent.log, gateway.log) now starts with the hermes dump output so clicking any single link gives full system context without needing to cross-reference the summary report. Refactored dump capture into _capture_dump() — called once and reused across the summary report and each log paste. * fix: fall back to .1 rotated log when primary log is missing or empty When gateway.log (or agent.log) doesn't exist or is empty, the debug share now checks for the .1 rotation file. 
This is common — the gateway rotates logs and the primary file may not exist yet. Extracted _resolve_log_path() to centralize the fallback logic for both _read_log_tail() and _read_full_log(). * chore: remove unused display_hermes_home import	2026-04-12 18:05:14 -07:00
Teknium	bcad679799	fix(api_server): normalize array-based content parts in chat completions Some OpenAI-compatible clients (Open WebUI, LobeChat, etc.) send message content as an array of typed parts instead of a plain string: [{"type": "text", "text": "hello"}] The agent pipeline expects strings, so these array payloads caused silent failures or empty messages. Add _normalize_chat_content() with defensive limits (recursion depth, list size, output length) and apply it to both the Chat Completions and Responses API endpoints. The Responses path had inline normalization that only handled input_text/output_text — the shared function also handles the standard 'text' type. Salvaged from PR #7980 (ikelvingo) — only the content normalization; the SSE and Weixin changes in that PR were regressions and are not included. Co-authored-by: ikelvingo <ikelvingo@users.noreply.github.com>	2026-04-12 18:03:16 -07:00
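A rough sketch of the content normalization described in the commit above, assuming only what the message states: array payloads of typed parts are flattened to a string, and the part types `text`, `input_text`, and `output_text` are accepted. The function name, the limit values, and the newline join are assumptions for illustration; the real `_normalize_chat_content()` also guards recursion depth, which this flat version does not need.

```python
from typing import Any

# Hypothetical limits; the commit mentions defensive limits but not their values.
MAX_PARTS = 256
MAX_CHARS = 100_000
TEXT_PART_TYPES = {"text", "input_text", "output_text"}


def normalize_chat_content(content: Any) -> str:
    """Flatten string-or-array message content into a plain string."""
    if isinstance(content, str):
        return content[:MAX_CHARS]
    if not isinstance(content, list):
        return ""
    pieces = []
    for part in content[:MAX_PARTS]:
        if isinstance(part, str):
            pieces.append(part)
        elif isinstance(part, dict) and part.get("type") in TEXT_PART_TYPES:
            text = part.get("text")
            if isinstance(text, str):
                pieces.append(text)
        # Non-text parts (for example images) are skipped in this sketch.
    return "\n".join(pieces)[:MAX_CHARS]
```

Returning an empty string for unrecognized input is a design choice of the sketch; a caller might prefer to reject such payloads explicitly.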
Teknium	bc4e2744c3	test: add tests for compression config_context_length passthrough - Test that auxiliary.compression.context_length from config is forwarded to get_model_context_length (positive case) - Test that invalid/non-integer config values are silently ignored - Fix _make_agent() to set config=None (cherry-picked code reads self.config)	2026-04-12 17:52:34 -07:00
Teknium	0d0d27d45e	test(tts): add speed config tests for Edge, OpenAI, and MiniMax 12 tests covering: - Provider-specific speed overrides global speed - Global speed used as fallback - Default (no speed) preserves existing behavior - Edge SSML rate string conversion (positive/negative) - OpenAI speed clamping to 0.25-4.0 range	2026-04-12 16:46:18 -07:00
Teknium	a266238e1e	fix(weixin): streaming cursor, media uploads, markdown links, blank messages (#8665 ) Four fixes for the Weixin/WeChat adapter, synthesized from the best aspects of community PRs #8407, #8521, #8360, #7695, #8308, #8525, #7531, #8144, #8251. 1. Streaming cursor (▉) stuck permanently — WeChat doesn't support message editing, so the cursor appended during streaming can never be removed. Add SUPPORTS_MESSAGE_EDITING = False to WeixinAdapter and check it in gateway/run.py to use an empty cursor for non-edit platforms. (Fixes #8307, #8326) 2. Media upload failures — two bugs in _send_file(): a) upload_full_url path used PUT (404 on WeChat CDN); now uses POST. b) aes_key was base64(raw_bytes) but the iLink API expects base64(hex_string); images showed as grey boxes. (Fixes #8352, #7529) Also: unified both upload paths into _upload_ciphertext(), preferring upload_full_url. Added send_video/send_voice methods and voice_item media builder for audio/.silk files. Added video_md5 field. 3. Markdown links stripped — WeChat can't render [text](url), so format_message() now converts them to 'text (url)' plaintext. Code blocks are preserved. (Fixes #7617) 4. Blank message prevention — three guards: a) _split_text_for_weixin_delivery('') returns [] not [''] b) send() filters empty/whitespace chunks before _send_text_chunk c) _send_message() raises ValueError for empty text as safety net Community credit: joei4cm (#8407), lyonDan (#8521), SKFDJKLDG (#8360), tomqiaozc (#7695), joshleeeeee (#8308), luoxiao6645(#8525), longsizhuo (#7531), Astral-Yang (#8144), QingWei-Li (#8251).	2026-04-12 16:43:25 -07:00
Teknium	c83674dd77	fix: unify OpenClaw detection, add isatty guard, fix print_warning import Combines detection from both PRs into _detect_openclaw_processes(): - Cross-platform process scan (pgrep/tasklist/PowerShell) from PR #8102 - systemd service check from PR #8555 - Returns list[str] with details about what's found Fixes in cleanup warning (from PR #8555): - print_warning -> print_error/print_info (print_warning not in import chain) - Added isatty() guard for non-interactive sessions - Removed duplicate _check_openclaw_running() in favor of shared function Updated all tests to match new API.	2026-04-12 16:40:37 -07:00
dirtyfancy	9fb36738a7	fix(claw): address Copilot review on Windows detection and non-interactive prompt - Use PowerShell to inspect node.exe command lines on Windows, since tasklist output does not include them. - Also check for dedicated openclaw.exe/clawd.exe processes. - Skip the interactive prompt in non-interactive sessions so the preview-only behavior is preserved. - Update tests accordingly. Relates to #7907	2026-04-12 16:40:37 -07:00
dirtyfancy	5af9614f6d	fix(claw): warn if OpenClaw is running before migration Add _is_openclaw_running() and _warn_if_openclaw_running() to detect OpenClaw processes (via pgrep/tasklist) before hermes claw migrate. Warns the user that messaging platforms only allow one active session per bot token, and lets them cancel or continue. Fixes #7907	2026-04-12 16:40:37 -07:00
alt-glitch	5e1197a42e	fix(gateway): harden Docker/container gateway pathway Centralize container detection in hermes_constants.is_container() with process-lifetime caching, matching existing is_wsl()/is_termux() patterns. Dedup _is_inside_container() in config.py to delegate to the new function. Add _run_systemctl() wrapper that converts FileNotFoundError to RuntimeError for defense-in-depth — all 10 bare subprocess.run(_systemctl_cmd(...)) call sites now route through it. Make supports_systemd_services() return False in containers and when systemctl binary is absent (shutil.which check). Add Docker-specific guidance in gateway_command() for install/uninstall/start subcommands — exit 0 with helpful instructions instead of crashing. Make 'hermes status' show 'Manager: docker (foreground)' and 'hermes dump' show 'running (docker, pid N)' inside containers. Fix setup_gateway() to use supports_systemd instead of _is_linux for all systemd-related branches, and show Docker restart policy instructions in containers. Replace inline /.dockerenv check in voice_mode.py with is_container(). Fixes #7420 Co-authored-by: teknium1 <teknium1@users.noreply.github.com>	2026-04-12 16:36:11 -07:00
sprmn24	18ab5c99d1	fix(backup): correct marker filenames in _validate_backup_zip The backup validation checked for 'hermes_state.db' and 'memory_store.db' as telltale markers of a valid Hermes backup zip. Neither name exists in a real Hermes installation — the actual database file is 'state.db' (hermes_state.py: DEFAULT_DB_PATH = get_hermes_home() / 'state.db'). A fresh Hermes installation produces: ~/.hermes/state.db (actual name) ~/.hermes/config.yaml ~/.hermes/.env Because the marker set never matched 'state.db', a backup zip containing only 'state.db' plus 'config.yaml' would fail validation with: 'zip does not appear to be a Hermes backup' and the import would exit with sys.exit(1), silently rejecting a valid backup. Fix: replace the wrong marker names with the correct filename. Adds TestValidateBackupZip with three cases: - state.db is accepted as a valid marker - old wrong names (hermes_state.db, memory_store.db) alone are rejected - config.yaml continues to pass (existing behaviour preserved)	2026-04-12 16:35:56 -07:00
Teknium	d6785dc4d4	fix: empty response recovery for reasoning models (mimo, qwen, GLM) (#8609 ) Three fixes for the (empty) response bug affecting open reasoning models: 1. Allow retries after prefill exhaustion — models like mimo-v2-pro always populate reasoning fields via OpenRouter, so the old 'not _has_structured' guard on the retry path blocked retries for EVERY reasoning model after the 2 prefill attempts. Now: 2 prefills + 3 retries = 6 total attempts before (empty). 2. Reset prefill/retry counters on tool-call recovery — the counters accumulated across the entire conversation, never resetting during tool-calling turns. A model cycling empty→prefill→tools→empty burned both prefill attempts and the third empty got zero recovery. Now counters reset when prefill succeeds with tool calls. 3. Strip think blocks before _truly_empty check — inline <think> content made the string non-empty, skipping both retry paths. Reported by users on Telegram with xiaomi/mimo-v2-pro and qwen3.5 models. Reproduced: qwen3.5-9b emits tool calls as XML in reasoning field instead of proper function calls, causing content=None + tool_calls=None + reasoning with embedded <tool_call> XML. Prefill recovery works but counter accumulation caused permanent (empty) in long sessions.	2026-04-12 15:38:11 -07:00
Teknium	1179918746	fix: salvage follow-ups for Feishu QR onboarding (#7706 ) - Remove duplicate _setup_feishu() definition (old 3-line version left behind by cherry-pick — Python picked the new one but dead code remained) - Remove misleading 'Disable direct messages' DM option — the Feishu adapter has no DM policy mechanism, so 'disable' produced identical env vars to 'pairing'. Users who chose 'disable' would still see pairing prompts. Reduced to 3 options: pairing, allow-all, allowlist. - Fix test_probe_returns_bot_info_on_success and test_probe_returns_none_on_failure: patch FEISHU_AVAILABLE=True so probe_bot() takes the SDK path when lark_oapi is not installed	2026-04-12 13:05:56 -07:00
Shuo	d7785f4d5b	feat(feishu): add scan-to-create onboarding for Feishu / Lark Add a QR-based onboarding flow to `hermes gateway setup` for Feishu / Lark. Users scan a QR code with their phone and the platform creates a fully configured bot application automatically — matching the existing WeChat QR login experience. Setup flow: - Choose between QR scan-to-create (new app) or manual credential input (existing app) - Connection mode selection (WebSocket / Webhook) - DM security policy (pairing / open / allowlist / disabled) - Group chat policy (open with @mention / disabled) Implementation: - Onboard functions (init/begin/poll/QR/probe) in gateway/platforms/feishu.py - _setup_feishu() in hermes_cli/gateway.py with manual fallback - probe_bot uses lark_oapi SDK when available, raw HTTP fallback otherwise - qr_register() catches expected errors (network/protocol), propagates bugs - Poll handles HTTP 4xx JSON responses and feishu/lark domain auto-detection Tests: - 25 tests for onboard module (registration, QR, probe, contract, negative paths) - 16 tests for setup flow (credentials, connection mode, DM policy, group policy, adapter integration verifying env vars produce valid FeishuAdapterSettings) Change-Id: I720591ee84755f32dda95fbac4b26dc82cbcf823	2026-04-12 13:05:56 -07:00
Teknium	400fe9b2a1	fix: add <thought> stripping to auxiliary_client + tests auxiliary_client.py had its own regex mirroring _strip_think_blocks but was missing the <thought> variant. Also adds test coverage for <thought> paired and orphaned tags.	2026-04-12 12:44:49 -07:00
Teknium	06a17c57ae	fix: improve profile creation UX — seed SOUL.md + credential warning (#8553 ) Fresh profiles (created without --clone) now: - Auto-seed a default SOUL.md immediately, so users have a file to customize right away instead of discovering it only after first use - Print a clear warning that the profile has no API keys and will inherit from the shell environment unless configured separately - Show the SOUL.md path for personality customization Previously, fresh profiles started with no SOUL.md (only seeded on first use via ensure_hermes_home), no mention of credential isolation, and no guidance about customizing personality. Users reported confusion about profiles using the wrong model/plan tokens and SOUL.md not being read — both traced to operational gaps in the creation UX. Closes #8093 (investigated: code correctly loads SOUL.md from profile HERMES_HOME; issue was operational, not a code bug).	2026-04-12 12:22:34 -07:00
Brooklyn Nicholson	2aea75e91e	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-12 13:18:55 -05:00
Teknium	4eecaf06e4	fix: prevent duplicate update prompt spam in gateway watcher (#8343 ) The _watch_update_progress() poll loop never deleted .update_prompt.json after forwarding the prompt to the user, causing the same prompt to be re-sent every poll cycle (2s). Two fixes: 1. Delete .update_prompt.json after forwarding — the update process only polls for .update_response, it doesn't need the prompt file to persist. 2. Guard re-sends with _update_prompt_pending check — belt-and-suspenders to prevent duplicates even under race conditions. Add regression test asserting the prompt is sent exactly once.	2026-04-12 04:52:59 -07:00
Teknium	45e60904c6	fix: fall back to provider's default model when model config is empty (#8303 ) When a user configures a provider (e.g. `hermes auth add openai-codex`) but never selects a model via `hermes model`, the gateway and CLI would pass an empty model string to the API, causing: 'Codex Responses request model must be a non-empty string' Now both gateway (_resolve_session_agent_runtime) and CLI (_ensure_runtime_credentials) detect an empty model and fill it from the provider's first catalog entry in _PROVIDER_MODELS. This covers all providers that have a static model list (openai-codex, anthropic, gemini, copilot, etc.). The fix is conservative: it only triggers when model is truly empty and a known provider was resolved. Explicit model choices are never overridden.	2026-04-12 03:53:30 -07:00
Teknium	b6b6b02f0f	fix: prevent unwanted session auto-reset after graceful gateway restarts (#8299 ) When the gateway shuts down gracefully (hermes update, gateway restart, /restart), it now writes a .clean_shutdown marker file. On the next startup, if this marker exists, suspend_recently_active() is skipped and the marker is cleaned up. Previously, suspend_recently_active() fired on EVERY startup — including planned restarts from hermes update or hermes gateway restart. This caused users to lose their conversation history unexpectedly: the session would be marked as suspended, and the next message would trigger an auto-reset with a notification the user never asked for. The original purpose of suspend_recently_active() is crash recovery — preventing stuck sessions that were mid-processing when the gateway died unexpectedly. Graceful shutdowns already drain active agents via _drain_active_agents(), so there is no stuck-session risk. After a crash (no marker written), suspension still fires as before. Fixes the scenario where a user asks the agent to run hermes update, the gateway restarts, and the user's next message gets an unwanted 'Session automatically reset' notification with their history cleared.	2026-04-12 03:03:07 -07:00
Teknium	56e3ee2440	fix: write update exit code before gateway restart (cgroup kill race) (#8288 ) When /update runs via Telegram, hermes update --gateway is spawned inside the gateway's systemd cgroup. The update process itself calls systemctl restart hermes-gateway, which tears down the cgroup with KillMode=mixed — SIGKILL to all remaining processes. The wrapping bash shell is killed before it can execute the exit-code epilogue, so .update_exit_code is never created. The new gateway's update watcher then polls for 30 minutes and sends a spurious timeout message. Fix: write .update_exit_code from Python inside cmd_update() immediately after the git pull + pip install succeed ("Update complete!"), before attempting the gateway restart. The shell epilogue still writes it too (idempotent overwrite), but now the marker exists even when the process is killed mid-restart.	2026-04-12 02:33:21 -07:00
Teknium	b321330362	feat: add WSL environment hint to system prompt (#8285 ) When running inside WSL (Windows Subsystem for Linux), inject a hint into the system prompt explaining that the Windows host filesystem is mounted at /mnt/c/, /mnt/d/, etc. This lets the agent naturally translate Windows paths (Desktop, Documents) to their /mnt/ equivalents without the user needing to configure anything. Uses the existing is_wsl() detection from hermes_constants (cached, checks /proc/version for 'microsoft'). Adds build_environment_hints() in prompt_builder.py — extensible for Termux, Docker, etc. later. Closes the UX gap where WSL users had to manually explain path translation to the agent every session.	2026-04-12 02:26:28 -07:00
Teknium	95fa78eb6c	fix: write refreshed Codex tokens back to ~/.codex/auth.json (#8277 ) OpenAI OAuth refresh tokens are single-use and rotate on every refresh. When Hermes refreshes a Codex token, it consumed the old refresh_token but never wrote the new pair back to ~/.codex/auth.json. This caused Codex CLI and VS Code to fail with 'refresh_token_reused' on their next refresh attempt. This mirrors the existing Anthropic write-back pattern where refreshed tokens are written to ~/.claude/.credentials.json via _write_claude_code_credentials(). Changes: - Add _write_codex_cli_tokens() in hermes_cli/auth.py (parallel to _write_claude_code_credentials in anthropic_adapter.py) - Call it from _refresh_codex_auth_tokens() (non-pool refresh path) - Call it from credential_pool._refresh_entry() (pool happy path + retry) - Add tests for the new write-back behavior - Update existing test docstring to clarify _save_codex_tokens vs _write_codex_cli_tokens separation Fixes refresh token conflict reported by @ec12edfae2cb221	2026-04-12 02:05:20 -07:00
Teknium	ae6820a45a	fix(setup): validate base URL input in hermes model flow (#8264 ) Reject non-URL values (e.g. shell commands typed by mistake) in the base URL prompt during provider setup. Previously any string was saved as-is to .env, breaking connectivity when the garbage value was used as the API endpoint. Adds http:// / https:// prefix check with a clear error message. The custom-endpoint flow already had this validation (line 1620); this brings the generic API-key provider flow to parity. Triggered by a user support case where 'nano ~/.hermes/.env' was accidentally entered as GLM_BASE_URL during Z.AI setup.	2026-04-12 01:51:57 -07:00
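The prefix check described in the commit above could look roughly like this. The function name is hypothetical, and the only rule taken from the commit message is that the value must start with `http://` or `https://`.

```python
def looks_like_base_url(value: str) -> bool:
    """Accept only values that start with an http:// or https:// scheme."""
    candidate = value.strip().lower()
    return candidate.startswith(("http://", "https://"))


if __name__ == "__main__":
    for raw in ("https://api.example.com/v1", "nano ~/.hermes/.env"):
        print(raw, "->", looks_like_base_url(raw))
```

This is deliberately a shallow check: it rejects mistyped shell commands like the one in the support case, but it does not verify that the host resolves or that the endpoint responds.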
Teknium	078dba015d	fix: three provider-related bugs (#8161 , #8181 , #8147 ) (#8243 ) - Add openai/openai-codex -> openai mapping to PROVIDER_TO_MODELS_DEV so context-length lookups use models.dev data instead of 128k fallback. Fixes #8161. - Set api_mode from custom_providers entry when switching via hermes model, and clear stale api_mode when the entry has none. Also extract api_mode in _named_custom_provider_map(). Fixes #8181. - Convert OpenAI image_url content blocks to Anthropic image blocks when the endpoint is Anthropic-compatible (MiniMax, MiniMax-CN, or any URL containing /anthropic). Fixes #8147.	2026-04-12 01:44:18 -07:00
Harish Kukreja	b1f13a8c5f	fix(agent): route compression aux through live session runtime	2026-04-12 01:34:52 -07:00
bravohenry	81ac62c0e9	fix(weixin): split chatty short replies into separate bubbles, keep structured content together Add content-aware splitting to compact mode: short chat-like exchanges (2-6 short lines without headings/lists/quotes) get separate message bubbles for a natural chat feel, while structured content (tables, headings with body, numbered lists) stays in a single message. Cherry-picked from PR #7587 by bravohenry, adapted to the compact/legacy split_per_line architecture from #7903.	2026-04-12 00:38:07 -07:00
Teknium	f53a5a7fe1	fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228 ) When the agent calls process(action='wait') or process(action='poll') and gets the exited status, the completion_queue notification is redundant — the agent already has the output from the tool return. Previously, the drain loops in CLI and gateway would still inject the [SYSTEM: Background process completed] message, causing the agent to receive the same information twice. Fix: track session IDs in _completion_consumed set when wait/poll/log returns an exited process. Drain loops in cli.py and gateway watcher skip completion events for consumed sessions. Watch pattern events are never suppressed (they have independent semantics). Adds 4 tests covering wait/poll/log marking and running-process negative case.	2026-04-12 00:36:22 -07:00
Teknium	fdf55e0fe9	feat(cli): show random tip on new session start (#8225 ) Add a 'tip of the day' feature that displays a random one-liner about Hermes Agent features on every new session — CLI startup, /clear, /new, and gateway /new across all messaging platforms. - New hermes_cli/tips.py module with 210 curated tips covering slash commands, keybindings, CLI flags, config options, tools, gateway platforms, profiles, sessions, memory, skills, cron, voice, security, and more - CLI: tips display in skin-aware dim gold color after the welcome line - Gateway: tips append to the /new and /reset response on all platforms - Fully wrapped in try/except — tips are non-critical and never break startup or reset Display format (CLI): ✦ Tip: /btw <question> asks a quick side question without tools or history. Display format (gateway): ✨ Session reset! Starting fresh. ✦ Tip: hermes -c resumes your most recent CLI session.	2026-04-12 00:34:01 -07:00
opriz	36f57dbc51	fix(migration): don't auto-archive OpenClaw source directory Remove auto-archival from hermes claw migrate — not its responsibility (hermes claw cleanup is still there for that). Skip MESSAGING_CWD when it points inside the OpenClaw source directory, which was the actual root cause of agent confusion after migration. Use Path.is_relative_to() for robust path containment check. Salvaged from PR #8192 by opriz. Co-authored-by: opriz <opriz@users.noreply.github.com>	2026-04-12 00:33:54 -07:00
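A short sketch of the path containment check named in the commit above, using `Path.is_relative_to()` (available from Python 3.9). The helper name and the example paths are assumptions for illustration only.

```python
from pathlib import Path


def is_inside(candidate: str, root: str) -> bool:
    """Return True if `candidate` is `root` itself or lies beneath it."""
    # resolve() normalizes ".." segments and symlinks before comparing.
    return Path(candidate).resolve().is_relative_to(Path(root).resolve())


if __name__ == "__main__":
    print(is_inside("/srv/openclaw/workspace", "/srv/openclaw"))
    print(is_inside("/srv/openclaw-old", "/srv/openclaw"))
```

Comparing path components instead of string prefixes is what keeps a sibling directory such as `/srv/openclaw-old` from being treated as contained in `/srv/openclaw`.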
Teknium	1871227198	feat: rebrand OpenClaw references to Hermes during migration - Add rebrand_text() that replaces OpenClaw, Open Claw, Open-Claw, ClawdBot, and MoltBot with Hermes (case-insensitive, word-boundary) - Apply rebranding to memory entries (MEMORY.md, USER.md, daily memory) - Apply rebranding to SOUL.md and workspace instructions via new transform parameter on copy_file() - Fix moldbot -> moltbot typo across codebase (claw.py, migration script, docs, tests) - Add unit tests for rebrand_text and integration tests for memory and soul migration rebranding	2026-04-12 00:33:54 -07:00
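The rebranding rule in the commit above (case-insensitive, word-boundary replacement of the listed names) might be expressed with a single regular expression. This is a sketch under that reading of the message; the function name and the exact pattern are assumptions, not the repository's `rebrand_text()`.

```python
import re

# Matches OpenClaw, Open Claw, Open-Claw, ClawdBot and MoltBot as whole words.
_LEGACY_NAMES = re.compile(
    r"\b(?:open[ -]?claw|clawdbot|moltbot)\b",
    re.IGNORECASE,
)


def rebrand_text_sketch(text: str) -> str:
    """Replace legacy product names with 'Hermes'."""
    return _LEGACY_NAMES.sub("Hermes", text)


if __name__ == "__main__":
    print(rebrand_text_sketch("OpenClaw, Open Claw, open-claw, ClawdBot and MOLTBOT"))
```

The trailing `\b` keeps longer identifiers that merely begin with one of the names from being rewritten.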
Teknium	eb2a49f95a	fix: openai-codex and anthropic not appearing in /model picker for external credentials (#8224 ) Users whose credentials exist only in external files — OpenAI Codex OAuth tokens in ~/.codex/auth.json or Anthropic Claude Code credentials in ~/.claude/.credentials.json — would not see those providers in the /model picker, even though hermes auth and hermes model detected them. Root cause: list_authenticated_providers() only checked the raw Hermes auth store and env vars. External credential file fallbacks (Codex CLI import, Claude Code file discovery) were never triggered. Fix (three parts): 1. _seed_from_singletons() in credential_pool.py: openai-codex now imports from ~/.codex/auth.json when the Hermes auth store is empty, mirroring resolve_codex_runtime_credentials(). 2. list_authenticated_providers() in model_switch.py: auth store + pool checks now run for ALL providers (not just OAuth auth_type), catching providers like anthropic that support both API key and OAuth. 3. list_authenticated_providers(): direct check for anthropic external credential files (Claude Code, Hermes PKCE). The credential pool intentionally gates anthropic behind is_provider_explicitly_configured() to prevent auxiliary tasks from silently consuming tokens. The /model picker bypasses this gate since it is discovery-oriented.	2026-04-12 00:33:42 -07:00
Teknium	4cadfef8e3	fix(cli): restore stacked tool progress scrollback in TUI (#8201 ) The TUI transition (`4970705`, `f83e86d`) replaced stacked per-tool history lines with a single live-updating spinner widget. While the spinner provides a nice live timer, it removed the scrollback history that users relied on to see what the agent did during a session. This restores stacked tool progress lines in 'all' and 'new' modes by printing persistent scrollback lines via _cprint() when tools complete, in addition to the existing live spinner display. Behavior per mode: - off: no scrollback lines, no spinner (unchanged) - new: scrollback line on completion, skipping consecutive same-tool repeats - all: scrollback line on every tool completion - verbose: no scrollback (run_agent.py handles verbose output directly) Implementation: - Store function_args from tool.started events in _pending_tool_info - On tool.completed, pop stored args and format via get_cute_tool_message() - FIFO queue per function_name handles concurrent tool execution - 'new' mode tracks _last_scrollback_tool for dedup - State cleared at end of agent run Reported by community user Mr.D — the stacked history provides transparency into what the agent is doing, which builds trust. Addresses user report from Discord about lost tool call visibility.	2026-04-11 23:22:34 -07:00
Teknium	1ca9b19750	feat: add network.force_ipv4 config to fix IPv6 timeout issues (#8196 ) On servers with broken or unreachable IPv6, Python's socket.getaddrinfo returns AAAA records first. urllib/httpx/requests all try IPv6 connections first and hang for the full TCP timeout before falling back to IPv4. This affects web_extract, web_search, the OpenAI SDK, and all HTTP tools. Adds network.force_ipv4 config option (default: false) that monkey-patches socket.getaddrinfo to resolve as AF_INET when the caller didn't specify a family. Falls back to full resolution if no A record exists, so pure-IPv6 hosts still work. Applied early at all three entry points (CLI, gateway, cron scheduler) before any HTTP clients are created. Reported by user @29n — Chinese Ubuntu server with unreachable IPv6 causing timeouts on lobste.rs and other IPv6-enabled sites while Google/GitHub worked fine (IPv4-only resolution).	2026-04-11 23:12:11 -07:00
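The resolution strategy described in the commit above (prefer `AF_INET` when the caller gave no family, fall back to full resolution if no A record exists) can be sketched as a wrapper around a resolver. The wrapper name and the injectable-resolver shape are assumptions made so the behaviour can be tried without touching the real `socket.getaddrinfo`; how the project installs the patch is not shown in the log.

```python
import socket


def make_ipv4_preferring(resolver):
    """Wrap a getaddrinfo-style resolver so unspecified lookups try IPv4 first."""

    def getaddrinfo(host, port, family=0, type=0, proto=0, flags=0):
        if family == socket.AF_UNSPEC:
            try:
                results = resolver(host, port, socket.AF_INET, type, proto, flags)
                if results:
                    return results
            except socket.gaierror:
                pass  # No A record: fall through to the unrestricted lookup.
        return resolver(host, port, family, type, proto, flags)

    return getaddrinfo


if __name__ == "__main__":
    patched = make_ipv4_preferring(socket.getaddrinfo)
    # A numeric address needs no DNS, so this runs offline.
    print(patched("127.0.0.1", 80)[0][0])
```

Installing it process-wide would amount to assigning the wrapped function to `socket.getaddrinfo` before any HTTP client is created, which matches the ordering the commit message describes.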
Tom Qiao	8a48c58bd3	fix(gateway): add missing RedactingFormatter import The gateway startup path references RedactingFormatter without importing it, causing a NameError crash when launched with a verbosity flag (e.g. via launchd --replace). Fixes #8044 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-11 19:38:05 -07:00
Teknium	a0a02c1bc0	feat: /compress <focus> — guided compression with focus topic (#8017 ) Adds an optional focus topic to /compress: `/compress database schema` guides the summariser to preserve information related to the focus topic (60-70% of summary budget) while compressing everything else more aggressively. Inspired by Claude Code's /compact <focus>. Changes: - context_compressor.py: focus_topic parameter on _generate_summary() and compress(); appends FOCUS TOPIC guidance block to the LLM prompt - run_agent.py: focus_topic parameter on _compress_context(), passed through to the compressor - cli.py: _manual_compress() extracts focus topic from command string, preserves existing manual_compression_feedback integration (no regression) - gateway/run.py: _handle_compress_command() extracts focus from event args and passes through — full gateway parity - commands.py: args_hint="[focus topic]" on /compress CommandDef Salvaged from PR #7459 (CLI /compress focus only — /context command deferred). 15 new tests across CLI, compressor, and gateway.	2026-04-11 19:23:29 -07:00
helix4u	cfbfc4c3f1	fix(discord): decouple readiness from slash sync	2026-04-11 19:22:14 -07:00
Teknium	fa7cd44b92	feat: add hermes backup and hermes import commands (#7997) * feat: add `hermes backup` and `hermes import` commands hermes backup — creates a zip of ~/.hermes/ (config, skills, sessions, profiles, memories, skins, cron jobs, etc.) excluding the hermes-agent codebase, __pycache__, and runtime PID files. Defaults to ~/hermes-backup-<timestamp>.zip, customizable with -o. hermes import <zipfile> — restores from a backup zip, validating it looks like a hermes backup before extracting. Handles .hermes/ prefix stripping, path traversal protection, and confirmation prompts (skip with --force). 29 tests covering exclusion rules, backup creation, import validation, prefix detection, path traversal blocking, confirmation flow, and a full round-trip test. * test: improve backup/import coverage to 97% Add 17 additional tests covering: - _format_size helper (bytes through terabytes) - Nonexistent hermes home error exit - Output path is a directory (auto-names inside it) - Output without .zip suffix (auto-appends) - Empty hermes home (all files excluded) - Permission errors during backup and import - Output zip inside hermes root (skips itself) - Not-a-zip file rejection - EOFError and KeyboardInterrupt during confirmation - 500+ file progress display - Directory-only zip prefix detection Remove dead code branch in _detect_prefix (unreachable guard). * feat: auto-restore profile wrapper scripts on import After extracting backup files, hermes import now scans profiles/ for subdirectories with config.yaml or .env and recreates the ~/.local/bin wrapper scripts so profile aliases (e.g. 'coder chat') work immediately. Also prints guidance for re-installing gateway services per profile.
Handles edge cases: - Skips profile dirs without config (not real profiles) - Skips aliases that collide with existing commands - Gracefully degrades if hermes_cli.profiles isn't available (fresh install) - Shows PATH hint if ~/.local/bin isn't in PATH 3 new profile restoration tests (49 total).	2026-04-11 19:15:50 -07:00
Siddharth Balyan	50d86b3c71	fix(matrix): replace pickle crypto store with SQLite, fix E2EE decryption (#7981) Fixes #7952 — Matrix E2EE completely broken after mautrix migration. - Replace MemoryCryptoStore + pickle/HMAC persistence with mautrix's PgCryptoStore backed by SQLite via aiosqlite. Crypto state now persists reliably across restarts without fragile serialization. - Add handle_sync() call on initial sync response so to-device events (queued Megolm key shares) are dispatched to OlmMachine instead of being silently dropped. - Add _verify_device_keys_on_server() after loading crypto state. Detects missing keys (re-uploads), stale keys from migration (attempts re-upload), and corrupted state (refuses E2EE). - Add _CryptoStateStore adapter wrapping MemoryStateStore to satisfy mautrix crypto's StateStore interface (is_encrypted, get_encryption_info, find_shared_rooms). - Remove redundant share_keys() call from sync loop — OlmMachine already handles this via DEVICE_OTK_COUNT event handler. - Fix datetime vs float TypeError in session.py suspend_recently_active() that crashed gateway startup. - Add aiosqlite and asyncpg to [matrix] extra in pyproject.toml. - Update test mocks for PgCryptoStore/Database and add query_keys mock for key verification. 174 tests pass. - Add E2EE upgrade/migration docs to Matrix user guide.	2026-04-12 07:24:46 +05:30
Siddharth Balyan	27eeea0555	perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive (#8014) * perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive SSH: symlink-staging + tar -ch piped over SSH in a single TCP stream. Eliminates per-file scp round-trips. Handles timeout (kills both processes), SSH Popen failure (kills tar), and tar create failure. Modal: in-memory gzipped tar archive, base64-encoded, decoded+extracted in one exec call. Checks exit code and raises on failure. Both backends use shared helpers extracted into file_sync.py: - quoted_mkdir_command() — mirrors existing quoted_rm_command() - unique_parent_dirs() — deduplicates parent dirs from file pairs Migrates _ensure_remote_dirs to use the new helpers. 28 new tests (21 SSH + 7 Modal), all passing. Closes #7465 Closes #7467 * fix(modal): pipe stdin to avoid ARG_MAX, clean up review findings - Modal bulk upload: stream base64 payload through proc.stdin in 1MB chunks instead of embedding in command string (Modal SDK enforces 64KB ARG_MAX_BYTES — typical payloads are ~4.3MB) - Modal single-file upload: same stdin fix, add exit code checking - Remove what-narrating comments in ssh.py and modal.py (keep WHY comments: symlink staging rationale, SIGPIPE, deadlock avoidance) - Remove unnecessary `sandbox = self._sandbox` alias in modal bulk - Daytona: use shared helpers (unique_parent_dirs, quoted_mkdir_command) instead of inlined duplicates --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-04-12 06:18:05 +05:30
Teknium	fd73937ec8	feat: component-separated logging with session context and filtering (#7991) * feat: component-separated logging with session context and filtering Phase 1 — Gateway log isolation: - gateway.log now only receives records from gateway.* loggers (platform adapters, session management, slash commands, delivery) - agent.log remains the catch-all (all components) - errors.log remains WARNING+ catch-all - Moved gateway.log handler creation from gateway/run.py into hermes_logging.setup_logging(mode='gateway') with _ComponentFilter Phase 2 — Session ID injection: - Added set_session_context(session_id) / clear_session_context() API using threading.local() for per-thread session tracking - _SessionFilter enriches every log record with session_tag attribute - Log format: '2026-04-11 10:23:45 INFO [session_id] logger.name: msg' - Session context set at start of run_conversation() in run_agent.py - Thread-isolated: gateway conversations on different threads don't leak Phase 3 — Component filtering in hermes logs: - Added --component flag: hermes logs --component gateway|agent|tools|cli|cron - COMPONENT_PREFIXES maps component names to logger name prefixes - Works with all existing filters (--level, --session, --since, -f) - Logger name extraction handles both old and new log formats Files changed: - hermes_logging.py: _SessionFilter, _ComponentFilter, COMPONENT_PREFIXES, set/clear_session_context(), gateway.log creation in setup_logging() - gateway/run.py: removed redundant gateway.log handler (now in hermes_logging) - run_agent.py: set_session_context() at start of run_conversation() - hermes_cli/logs.py: --component filter, logger name extraction - hermes_cli/main.py: --component argument on logs subparser Addresses community request for component-separated, filterable logging. Zero changes to existing logger names — __name__ already provides hierarchy.
* fix: use LogRecord factory instead of per-handler _SessionFilter The _SessionFilter approach required attaching a filter to every handler we create. Any handler created outside our _add_rotating_handler (like the gateway stderr handler, or third-party handlers) would crash with KeyError: 'session_tag' if it used our format string. Replace with logging.setLogRecordFactory() which injects session_tag into every LogRecord at creation time — process-global, zero per-handler wiring needed. The factory is installed at import time (before setup_logging) so session_tag is available from the moment hermes_logging is imported. - Idempotent: marker attribute prevents double-wrapping on module reload - Chains with existing factory: won't break third-party record factories - Removes _SessionFilter from _add_rotating_handler and setup_verbose_logging - Adds tests: record factory injection, idempotency, arbitrary handler compat	2026-04-11 17:23:36 -07:00
Teknium	723b5bec85	feat: per-platform display verbosity configuration (#8006) Add display.platforms section to config.yaml for per-platform overrides of display settings (tool_progress, show_reasoning, streaming, tool_preview_length). Each platform gets sensible built-in defaults based on capability tier: - High (telegram, discord): tool_progress=all, streaming follows global - Medium (slack, mattermost, matrix, feishu): tool_progress=new - Low (signal, whatsapp, bluebubbles, wecom, etc.): tool_progress=off, streaming=false - Minimal (email, sms, webhook, homeassistant): tool_progress=off, streaming=false Example config: display: platforms: telegram: tool_progress: all show_reasoning: true slack: tool_progress: off Resolution order: platform override > global setting > built-in platform default. Changes: - New gateway/display_config.py: resolver module with tier-based platform defaults - gateway/run.py: tool_progress, tool_preview_length, streaming, show_reasoning all resolve per-platform via the new resolver - /verbose command: now cycles tool_progress per-platform (saves to display.platforms.<platform>.tool_progress instead of global) - /reasoning show|hide: now saves show_reasoning per-platform - Config version 15 -> 16: migrates tool_progress_overrides into display.platforms - Backward compat: legacy tool_progress_overrides still read as fallback - 27 new tests for resolver, normalization, migration, backward compat - Updated verbose command tests for per-platform behavior Addresses community request for per-channel verbosity control (Guillaume Meyer, Nathan Danielsen) — high verbosity on backchannel Telegram, low on customer-facing Slack, none on email.	2026-04-11 17:20:34 -07:00
Teknium	14ccd32cee	refactor(terminal): remove check_interval parameter (#8001) The check_interval parameter on terminal_tool sent periodic output updates to the gateway chat, but these were display-only — the agent couldn't see or act on them. This added schema bloat and introduced a bug where notify_on_complete=True was silently dropped when check_interval was also set (the not-check_interval guard skipped fast-watcher registration, and the check_interval watcher dict was missing the notify_on_complete key). Removing check_interval entirely: - Eliminates the notify_on_complete interaction bug - Reduces tool schema size (one fewer parameter for the model) - Simplifies the watcher registration path - notify_on_complete (agent wake-on-completion) still works - watch_patterns (output alerting) still works - process(action='poll') covers manual status checking Closes #7947 (root cause eliminated rather than patched).	2026-04-11 17:16:11 -07:00
Mateus Scheuer Macedo	06f862fa1b	feat(cli): add native /model picker modal for provider → model selection When /model is called with no arguments in the interactive CLI, open a two-step prompt_toolkit modal instead of the previous text-only listing: 1. Provider selection — curses_single_select with all authenticated providers 2. Model selection — live API fetch with curated fallback Also fixes: - OpenAI Codex model normalization (openai/gpt-5.4 → gpt-5.4) - Dedicated Codex validation path using provider_model_ids() Preserves curses_radiolist (used by setup, tools, plugins) alongside the new curses_single_select. Retains tool elapsed timer in spinner. Cherry-picked from PR #7438 by MestreY0d4-Uninter.	2026-04-11 17:16:06 -07:00
Teknium	39cd57083a	refactor: remove budget warning injection system (dead code) The _get_budget_warning() method already returned None unconditionally — the entire budget warning system was disabled. Remove all dead code: - _BUDGET_WARNING_RE regex - _strip_budget_warnings_from_history() function and its call site - Both injection blocks (concurrent + sequential tool execution) - _get_budget_warning() method - 7 tests for the removed functions The budget exhaustion grace call system (_budget_exhausted_injected, _budget_grace_call) is a separate recovery mechanism and is preserved.	2026-04-11 16:56:33 -07:00
Siddharth Balyan	cab814af15	feat(nix): container-aware CLI — auto-route into managed container (#7543) * feat(nix): container-aware CLI — auto-route all subcommands into managed container When container.enable = true, the host `hermes` CLI transparently execs every subcommand into the managed Docker/Podman container. A symlink bridge (~/.hermes -> /var/lib/hermes/.hermes) unifies state between host and container so sessions, config, and memories are shared. CLI changes: - Global routing before subcommand dispatch (all commands forwarded) - docker exec with -u exec_user, env passthrough (TERM, COLORTERM, LANG, LC_ALL), TTY-aware flags - Retry with spinner on failure (TTY: 5s, non-TTY: 10s silent) - Hard fail instead of silent fallback - HERMES_DEV=1 env var bypasses routing for development - No routing messages (invisible to user) NixOS module changes: - container.hostUsers option: lists users who get ~/.hermes symlink and automatic hermes group membership - Activation script creates symlink bridge (with backup of existing ~/.hermes dirs), writes exec_user to .container-mode - Cleanup on disable: removes symlinks + .container-mode + stops service - Warning when hostUsers set without addToSystemPackages * fix: address review — reuse sudo var, add chown -h on symlink update - hermes_cli/main.py: reuse the existing `sudo` variable instead of redundant `shutil.which("sudo")` call that could return None - nix/nixosModules.nix: add missing `chown -h` when updating an existing symlink target so ownership stays consistent with the fresh-create and backup-replace branches * fix: address remaining review items from cursor bugbot - hermes_cli/main.py: move container routing BEFORE parse_args() so --help, unrecognised flags, and all subcommands are forwarded transparently into the container instead of being intercepted by argparse on the host (high severity) - nix/nixosModules.nix: resolve home dirs via config.users.users.${user}.home instead of hardcoding /home/${user},
supporting users with custom home directories (medium severity) - nix/nixosModules.nix: gate hostUsers group membership on container.enable so setting hostUsers without container mode doesn't silently add users to the hermes group (low severity) * fix: simplify container routing — execvp, no retries, let it crash - Replace subprocess.run retry loop with os.execvp (no idle parent process) - Extract _probe_container helper for sudo detection with 15s timeout - Narrow exception handling: FileNotFoundError only in get_container_exec_info, catch TimeoutExpired specifically, remove silent except Exception: pass - Collapse needs_sudo + sudo into single sudo_path variable - Simplify NixOS symlink creation from 4 branches to 2 - Gate NixOS sudoers hint with "On NixOS:" prefix - Full test rewrite: 18 tests covering execvp, sudo probe, timeout, permissions --------- Co-authored-by: Hermes Agent <hermes@nousresearch.com>	2026-04-12 05:17:46 +05:30
Teknium	5c2ecdec49	fix: use ceiling division for token estimation, deduplicate inline formula Switch estimate_tokens_rough(), estimate_messages_tokens_rough(), and estimate_request_tokens_rough() from floor division (len // 4) to ceiling division ((len + 3) // 4). Short texts (1-3 chars) previously estimated as 0 tokens, causing the compressor and pre-flight checks to systematically undercount when many short tool results are present. Also replaced the inline duplicate formula in run_conversation() (total_chars // 4) with a call to the shared estimate_messages_tokens_rough() function. Updated 4 tests that hardcoded floor-division expected values. Related: issue #6217, PR #6629	2026-04-11 16:33:40 -07:00
Brooklyn Nicholson	a1d2a0c0fd	feat: self update npm deps on hermes update	2026-04-11 18:29:18 -05:00
WAXLYY	6d272ba477	fix(tools): enforce ID uniqueness in TODO store during replace operations Deduplicate todo items by ID before writing to the store, keeping the last occurrence. Prevents ghost entries when the model sends duplicate IDs in a single write() call, which corrupts subsequent merge operations. Co-authored-by: WAXLYY <WAXLYY@users.noreply.github.com>	2026-04-11 16:22:50 -07:00
asheriif	97b0cd51ee	feat(gateway): surface natural mid-turn assistant messages in chat platforms Add display.interim_assistant_messages config (enabled by default) that forwards completed assistant commentary between tool calls to the user as separate chat messages. Models already emit useful status text like 'I'll inspect the repo first.' — this surfaces it on Telegram, Discord, and other messaging platforms instead of swallowing it. Independent from tool_progress and gateway streaming. Disabled for webhooks. Uses GatewayStreamConsumer when available, falls back to direct adapter send. Tracks response_previewed to prevent double-delivery when interim message matches the final response. Also fixes: cursor not stripped from fallback prefix in stream consumer (affected continuation calculation on no-edit platforms like Signal). Cherry-picked from PR #7885 by asheriif, default changed to enabled. Fixes #5016	2026-04-11 16:21:39 -07:00
Teknium	c8aff74632	fix: prevent agent from stopping mid-task — compression floor, budget overhaul, activity tracking Three root causes of the 'agent stops mid-task' gateway bug: 1. Compression threshold floor (64K tokens minimum) - The 50% threshold on a 100K-context model fired at 50K tokens, causing premature compression that made models lose track of multi-step plans. Now threshold_tokens = max(50% * context, 64K). - Models with <64K context are rejected at startup with a clear error. 2. Budget warning removal — grace call instead - Removed the 70%/90% iteration budget warnings entirely. These injected '[BUDGET WARNING: Provide your final response NOW]' into tool results, causing models to abandon complex tasks prematurely. - Now: no warnings during normal execution. When the budget is actually exhausted (90/90), inject a user message asking the model to summarise, allow one grace API call, and only then fall back to _handle_max_iterations. 3. Activity touches during long terminal execution - _wait_for_process polls every 0.2s but never reported activity. The gateway's inactivity timeout (default 1800s) would fire during long-running commands that appeared 'idle.' - Now: thread-local activity callback fires every 10s during the poll loop, keeping the gateway's activity tracker alive. - Agent wires _touch_activity into the callback before each tool call. Also: docs update noting 64K minimum context requirement. Closes #7915 (root cause was agent-loop termination, not Weixin delivery limits).	2026-04-11 16:18:57 -07:00
Koichi Tsutsumi	fc417ed049	fix(cli): add ChatConsole.status for /skills search	2026-04-11 15:38:43 -07:00
0xbyt4	32519066dc	fix(gateway): add HERMES_SESSION_KEY to session_context contextvars Complete the contextvars migration by adding HERMES_SESSION_KEY to the unified _VAR_MAP in session_context.py. Without this, concurrent gateway handlers race on os.environ["HERMES_SESSION_KEY"]. - Add _SESSION_KEY ContextVar to _VAR_MAP, set_session_vars(), clear_session_vars() - Wire session_key through _set_session_env() from SessionContext - Replace os.getenv fallback in tools/approval.py with get_session_env() (function-level import to avoid cross-layer coupling) - Keep os.environ set as CLI/cron fallback Cherry-picked from PR #7878 by 0xbyt4.	2026-04-11 15:35:04 -07:00
syaor4n	689c515090	feat: add --env and --preset support to hermes mcp add - Add --env KEY=VALUE for passing environment variables to stdio MCP servers - Add --preset for known MCP server templates (empty for now, extensible) - Validate env var names, reject --env for HTTP servers - Explicit --command/--url overrides preset defaults - Remove unused getpass import Based on PR #7936 by @syaor4n (stitch preset removed, generic infra kept).	2026-04-11 15:34:57 -07:00
chqchshj	5f0caf54d6	feat(gateway): add WeCom callback-mode adapter for self-built apps Add a second WeCom integration mode for regular enterprise self-built applications. Unlike the existing bot/websocket adapter (wecom.py), this handles WeCom's standard callback flow: WeCom POSTs encrypted XML to an HTTP endpoint, the adapter decrypts, queues for the agent, and immediately acknowledges. The agent's reply is delivered proactively via the message/send API. Key design choice: always acknowledge immediately and use proactive send — agent sessions take 3-30 minutes, so the 5-second inline reply window is never useful. The original PR's Future/pending-reply machinery was removed in favour of this simpler architecture. Features: - AES-CBC encrypt/decrypt (BizMsgCrypt-compatible) - Multi-app routing scoped by corp_id:user_id - Legacy bare user_id fallback for backward compat - Access-token management with auto-refresh - WECOM_CALLBACK_* env var overrides - Port-in-use pre-check before binding - Health endpoint at /health Salvaged from PR #7774 by @chqchshj. Simplified by removing the inline reply Future system and fixing: secrets.choice for nonce generation, immediate plain-text acknowledgment (not encrypted XML containing 'success'), and initial token refresh error handling.	2026-04-11 15:22:49 -07:00
Brooklyn Nicholson	ec553fdb49	Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor	2026-04-11 17:15:41 -05:00
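
Note on commit fd73937ec8 above: its follow-up fix replaces the per-handler filter with `logging.setLogRecordFactory()`, so that every `LogRecord` carries a `session_tag` attribute regardless of which handler formats it. The sketch below illustrates that pattern only. The helper name `install_session_record_factory`, the marker attribute `_injects_session_tag`, the `_session` thread-local, and the bracketed tag format are assumptions made for illustration and are not taken from `hermes_logging.py`.

```python
import logging
import threading

# Hypothetical per-thread storage for the active session id.
_session = threading.local()


def set_session_context(session_id: str) -> None:
    """Remember the session id for the current thread."""
    _session.session_id = session_id


def clear_session_context() -> None:
    """Forget the session id for the current thread."""
    _session.session_id = None


def install_session_record_factory() -> None:
    """Wrap the current LogRecord factory so every record gets session_tag."""
    current = logging.getLogRecordFactory()
    # Marker attribute (an assumed name) prevents double-wrapping.
    if getattr(current, "_injects_session_tag", False):
        return

    def factory(*args, **kwargs):
        # Chain to whatever factory was installed before us.
        record = current(*args, **kwargs)
        sid = getattr(_session, "session_id", None)
        record.session_tag = f"[{sid}]" if sid else ""
        return record

    factory._injects_session_tag = True
    logging.setLogRecordFactory(factory)


install_session_record_factory()
```

Because the attribute is set when the record is created, a format string such as `"%(levelname)s %(session_tag)s %(name)s: %(message)s"` can be used on any handler, including ones created by third-party code, without attaching a filter to each handler.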


2092 Commits
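
Note on commit 5c2ecdec49 above: the change from `len // 4` to `(len + 3) // 4` is easiest to see on short strings. The following is a minimal sketch of the two formulas as the commit message states them. The function names are hypothetical stand-ins and do not reproduce the signatures of `estimate_tokens_rough()` or its siblings.

```python
def estimate_tokens_floor(text: str) -> int:
    """Old behaviour described in the commit: floor division by 4."""
    return len(text) // 4


def estimate_tokens_ceil(text: str) -> int:
    """New behaviour described in the commit: ceiling division by 4."""
    return (len(text) + 3) // 4


# Many short tool results: each is 1-3 characters long.
short_results = ["ok", "1", "yes", "no"]

floor_total = sum(estimate_tokens_floor(r) for r in short_results)
ceil_total = sum(estimate_tokens_ceil(r) for r in short_results)
```

With floor division every string shorter than four characters counts as zero tokens, so `floor_total` is 0 for the list above, while `ceil_total` is 4 (one token per non-empty string). An empty string still estimates to zero under both formulas.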