Commit Graph

2092 Commits

Author SHA1 Message Date
Teknium
22d22cd75c
fix: auto-register all gateway commands as Discord slash commands (#10528)
Discord's _register_slash_commands() had a hardcoded list of ~27 commands
while COMMAND_REGISTRY defines 34+ gateway-available commands. Missing
commands (debug, branch, rollback, snapshot, profile, yolo, fast, reload,
commands) were invisible in Discord's / autocomplete — users couldn't
discover them.

Add a dynamic catch-all loop after the explicit registrations that
iterates COMMAND_REGISTRY, skips already-registered commands, and
auto-registers the rest using discord.app_commands.Command(). Commands
with args_hint get an optional string parameter; parameterless commands
get a simple callback.

This ensures any future commands added to COMMAND_REGISTRY automatically
appear on Discord without needing a manual entry in discord.py.

Telegram and Slack already derive dynamically from COMMAND_REGISTRY
via telegram_bot_commands() and slack_subcommand_map() — no changes
needed there.
2026-04-15 14:25:27 -07:00
Teknium
305a702e09
fix: /browser connect CDP override now takes priority over Camofox (#10523)
When a user runs /browser connect to attach browser tools to their real
Chrome instance via CDP, the BROWSER_CDP_URL env var is set. However,
every browser tool function checks _is_camofox_mode() first, which
short-circuits to the Camofox backend before _get_session_info() ever
checks for the CDP override.

Fix: is_camofox_mode() now returns False when BROWSER_CDP_URL is set,
so the explicit CDP connection takes priority. This is the correct
behavior — /browser connect is an intentional user override.

Reported by SkyLinx on Discord.
2026-04-15 14:11:18 -07:00
Teknium
824c33729d
fix(session_search): coerce limit to int to prevent TypeError with non-int values (#10522)
Models (especially open-source like qwen3.5-plus) may send non-int values
for the limit parameter — None (JSON null), string, or even a type object.
This caused TypeError: '<=' not supported between instances of 'int' and
'type' when the value reached min()/comparison operations.

Changes:
- Add defensive int coercion at session_search() entry with fallback to 3
- Clamp limit to [1, 5] range (was only capped at 5, not floored)
- Add tests for None, type object, string, negative, and zero limit values

Reported by community user ludoSifu via Discord.
2026-04-15 14:11:05 -07:00
Teknium
91980e3518
fix: deduplicate memory provider tools to prevent 400 on strict providers (#10511)
Memory provider plugins (e.g. Mnemosyne) can register tools via two paths:
1. Plugin system (ctx.register_tool) → tool registry → get_tool_definitions()
2. Memory manager → get_all_tool_schemas() → direct append in AIAgent.__init__

Path 2 blindly appended without checking if path 1 already added the same
tool names. This created duplicate function names in the tools array sent
to the API. Most providers silently handle duplicates, but Xiaomi MiMo
(via Nous Portal) strictly rejects them with a 400 Bad Request.

Fix: build a set of existing tool names before memory manager injection
and skip any tool whose name is already present.

Confirmed via live testing against Nous Portal:
- Unique tool names → 200 OK
- Duplicate tool names → 400 'Provider returned error'
2026-04-15 14:09:32 -07:00
Teknium
19142810ed
fix: /debug privacy — auto-delete pastes after 1 hour, add privacy notices (#10510)
- Pastes uploaded by /debug now auto-delete after 1 hour via a detached
  background process that sends DELETE to paste.rs
- CLI: shows privacy notice listing what data will be uploaded
- Gateway: only uploads summary report (system info + log tails), NOT
  full log files containing conversation content
- Added 'hermes debug delete <url>' for immediate manual deletion
- 16 new tests covering auto-delete scheduling, paste deletion, privacy
  notices, and the delete subcommand

Addresses user privacy concern where /debug uploaded full conversation
logs to a public paste service with no warning or expiry.
2026-04-15 13:40:27 -07:00
Teknium
2edbf15560
fix: enforce TTL in MessageDeduplicator + use yaml for gateway --config (#10306, #10216) (#10509)
Two gateway fixes:

1. MessageDeduplicator.is_duplicate() now checks TTL at query time (#10306)

   Previously, is_duplicate() returned True for any previously seen ID
   without checking its age — expired entries were only purged when cache
   size exceeded max_size.  On normal workloads that never overflow, message
   IDs stayed deduplicated forever instead of expiring after the TTL.

   Fix: check `now - timestamp < ttl` before returning True.  Expired
   entries are removed and treated as new messages.

2. Gateway --config flag now uses yaml.safe_load() (#10216)

   The --config CLI flag in gateway/run.py main() used json.load() to
   parse config files.  YAML is the only documented config format and
   every other config loader uses yaml.safe_load().  A YAML config file
   passed via --config would crash with json.JSONDecodeError.

Closes #10306
Closes #10216
2026-04-15 13:35:40 -07:00
Teknium
af4bf505b3
fix: add on_memory_write bridge to sequential tool execution path (#10174) (#10507)
The on_memory_write bridge that notifies external memory providers
(ClawMem, retaindb, supermemory, etc.) of built-in memory writes was
only present in the concurrent tool execution path (_invoke_tool).
The sequential path (_execute_tool_calls_sequential) — which handles
all single tool calls, the common case — was missing it entirely.

This meant external memory providers silently missed every single-call
memory write, which is the vast majority of memory operations.

Fix: add the identical bridge block to the sequential path, right
after the memory_tool call returns.

Closes #10174
2026-04-15 13:32:59 -07:00
helix4u
93f6f66872 fix(interrupt): preserve pre-start terminal interrupts 2026-04-15 13:29:57 -07:00
Teknium
a418ddbd8b
fix: add activity heartbeats to prevent false gateway inactivity timeouts (#10501)
Multiple gaps in activity tracking could cause the gateway's inactivity
timeout to fire while the agent is actively working:

1. Streaming wait loop had no periodic heartbeat — the outer thread only
   touched activity when the stale-stream detector fired (180-300s), and
   for local providers (Ollama) the stale timeout was infinity, meaning
   zero heartbeats. Now touches activity every 30s.

2. Concurrent tool execution never set the activity callback on worker
   threads (threading.local invisible across threads) and never set
   _current_tool. Workers now set the callback, and the concurrent wait
   uses a polling loop with 30s heartbeats.

3. Modal backend's execute() override had its own polling loop without
   any activity callback. Now matches _wait_for_process cadence (10s).
2026-04-15 13:29:05 -07:00
Teknium
6391b46779
fix: bound auxiliary client cache to prevent fd exhaustion in long-running gateways (#10200) (#10470)
The _client_cache used event loop id() as part of the cache key, so
every new worker-thread event loop created a new entry for the same
provider config.  In long-running gateways where threads are recycled
frequently, this caused unbounded cache growth — each stale entry
held an unclosed AsyncOpenAI client with its httpx connection pool,
eventually exhausting file descriptors.

Fix: remove loop_id from the cache key and instead validate on each
async cache hit that the cached loop is the current, open loop.  If
the loop changed or was closed, the stale entry is replaced in-place
rather than creating an additional entry.  This bounds cache growth
to at most one entry per unique provider config.

Also adds a _CLIENT_CACHE_MAX_SIZE (64) safety belt with FIFO
eviction as defense-in-depth against any remaining unbounded growth.

Cross-loop safety is preserved: different event loops still get
different client instances (validated by existing test suite).

Closes #10200
2026-04-15 13:16:28 -07:00
Brooklyn Nicholson
53a024a941 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 14:37:54 -05:00
zhiheng.liu
7cb06e3bb3 refactor(memory): drop on_session_reset — commit-only is enough
OV transparently handles message history across /new and /compress: old
messages stay in the same session and extraction is idempotent, so there's
no need to rebind providers to a new session_id. The only thing the
session boundary actually needs is to trigger extraction.

- MemoryProvider / MemoryManager: remove on_session_reset hook
- OpenViking: remove on_session_reset override (nothing to do)
- AIAgent: replace rotate_memory_session with commit_memory_session
  (just calls on_session_end, no rebind)
- cli.py / run_agent.py: single commit_memory_session call at the
  session boundary before session_id rotates
- tests: replace on_session_reset coverage with routing tests for
  MemoryManager.on_session_end

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 11:28:45 -07:00
zhiheng.liu
8275fa597a refactor(memory): promote on_session_reset to base provider hook
Replace hasattr-forked OpenViking-specific paths with a proper base-class
hook. Collapse the two agent wrappers into a single rotate_memory_session
so callers don't orchestrate commit + rebind themselves.

- MemoryProvider: add on_session_reset(new_session_id) as a default no-op
- MemoryManager: on_session_reset fans out unconditionally (no hasattr,
  no builtin skip — base no-op covers it)
- OpenViking: rename reset_session -> on_session_reset; drop the explicit
  POST /api/v1/sessions (OV auto-creates on first message) and the two
  debug raise_for_status wrappers
- AIAgent: collapse commit_memory_session + reinitialize_memory_session
  into rotate_memory_session(new_sid, messages)
- cli.py / run_agent.py: replace hasattr blocks and the split calls with
  a single unconditional rotate_memory_session call; compression path
  now passes the real messages list instead of []
- tests: align with on_session_reset, assert reset does NOT POST /sessions

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-15 11:28:45 -07:00
zhiheng.liu
7856d304f2 fix(openviking): commit session on /new and context compression
The OpenViking memory provider extracts memories when its session is
committed (POST /api/v1/sessions/{id}/commit).  Before this fix, the
CLI had two code paths that changed the active session_id without ever
committing the outgoing OpenViking session:

1. /new (new_session() in cli.py) — called flush_memories() to write
   MEMORY.md, then immediately discarded the old session_id.  The
   accumulated OpenViking session was never committed, so all context
   from that session was lost before extraction could run.

2. /compress and auto-compress (_compress_context() in run_agent.py) —
   split the SQLite session (new session_id) but left the OpenViking
   provider pointing at the old session_id with no commit, meaning all
   messages synced to OpenViking were silently orphaned.

The gateway already handles session commit on /new and /reset via
shutdown_memory_provider() on the cached agent; the CLI path did not.

Fix: introduce a lightweight session-transition lifecycle alongside
the existing full shutdown path:

- OpenVikingMemoryProvider.reset_session(new_session_id): waits for
  in-flight background threads, resets per-session counters, and
  creates the new OV session via POST /api/v1/sessions — without
  tearing down the HTTP client (avoids connection overhead on /new).

- MemoryManager.restart_session(new_session_id): calls reset_session()
  on providers that implement it; falls back to initialize() for
  providers that do not.  Skips the builtin provider (no per-session
  state).

- AIAgent.commit_memory_session(messages): wraps
  memory_manager.on_session_end() without shutdown — commits OV session
  for extraction but leaves the provider alive for the next session.

- AIAgent.reinitialize_memory_session(new_session_id): wraps
  memory_manager.restart_session() — transitions all external providers
  to the new session after session_id has been assigned.

Call sites:
- cli.py new_session(): commit BEFORE session_id changes, reinitialize
  AFTER — ensuring OV extraction runs on the correct session and the
  new session is immediately ready for the next turn.
- run_agent._compress_context(): same pattern, inside the
  if self._session_db: block where the session_id split happens.

/compress and auto-compress are functionally identical at this layer:
both call _compress_context(), so both are fixed by the same change.

Tests added to tests/agent/test_memory_provider.py:
- TestMemoryManagerRestartSession: reset_session() routing, builtin
  skip, initialize() fallback, failure tolerance, empty-manager noop.
- TestOpenVikingResetSession: session_id update, per-session state
  clear, POST /api/v1/sessions call, API failure tolerance, no-client
  noop.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-15 11:28:45 -07:00
Teknium
f61cc464f0 fix: include thread_id in _parse_session_key and fix stale parts reference
_parse_session_key() now extracts the optional 6th part (thread_id) from
session keys, and _notify_active_sessions_of_shutdown uses _parsed.get()
instead of the removed 'parts' variable. Without this, shutdown notifications
silently failed (NameError caught by try/except) and forum topic routing
was lost.
2026-04-15 11:16:01 -07:00
kshitijk4poor
2276b72141 fix: follow-up improvements for watch notification routing (#9537)
- Populate watcher_* routing fields for watch-only processes (not just
  notify_on_complete), so watch-pattern events carry direct metadata
  instead of relying solely on session_key parsing fallback
- Extract _parse_session_key() helper to dedupe session key parsing
  at two call sites in gateway/run.py
- Add negative test proving cross-thread leakage doesn't happen
- Add edge-case tests for _build_process_event_source returning None
  (empty evt, invalid platform, short session_key)
- Add unit tests for _parse_session_key helper
2026-04-15 11:16:01 -07:00
etcircle
dee592a0b1 fix(gateway): route synthetic background events by session 2026-04-15 11:16:01 -07:00
kshitij
da448d4fce
test(cron): add regression test for credential_files ContextVar propagation (#10462)
Follow-up to #10459 (salvage of #7527). The copy_context() fix propagates
ALL ContextVars into the cron worker thread, including credential_files.
This test verifies that skill-declared required_credential_files are
visible inside the worker thread, matching the existing env_passthrough
regression test.
2026-04-15 11:11:08 -07:00
helix4u
aa398ad655 fix(cron): preserve skill env passthrough in worker thread 2026-04-15 11:03:49 -07:00
Brooklyn Nicholson
cc15b55bb9 chore: uptick 2026-04-15 10:23:15 -05:00
Brooklyn Nicholson
371166fe26 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 10:21:00 -05:00
asheriif
33ae403890 fix(gateway): fix matrix lingering typing indicator 2026-04-15 04:16:16 -07:00
Teknium
47e6ea84bb fix: file handle bug, warning text, and tests for Discord media send
- Fix file handle closed before POST: nest session.post() inside
  the 'with open()' block so aiohttp can read the file during upload
- Update warning text to include weixin (also supports media delivery)
- Add 8 unit tests covering: text+media, media-only, missing files,
  upload failures, multiple files, and _send_to_platform routing
2026-04-15 04:16:06 -07:00
Teknium
1c4d3216d3
fix(cron): include job_id in delivery and guide models on removal workflow (#10242)
* fix(gateway): suppress duplicate replies on interrupt and streaming flood control

Three fixes for the duplicate reply bug affecting all gateway platforms:

1. base.py: Suppress stale response when the session was interrupted by a
   new message that hasn't been consumed yet. Checks both interrupt_event
   and _pending_messages to avoid false positives. (#8221, #2483)

2. run.py (return path): Remove response_previewed guard from already_sent
   check. Stream consumer's already_sent alone is authoritative — if
   content was delivered via streaming, the duplicate send must be
   suppressed regardless of the agent's response_previewed flag. (#8375)

3. run.py (queued-message path): Same fix — already_sent without
   response_previewed now correctly marks the first response as already
   streamed, preventing re-send before processing the queued message.

The response_previewed field is still produced by the agent (run_agent.py)
but is no longer required as a gate for duplicate suppression. The stream
consumer's already_sent flag is the delivery-level truth about what the
user actually saw.

Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483.

* fix(cron): include job_id in delivery and guide models on removal workflow

Users reported cron reminders keep firing after asking the agent to stop.
Root cause: the conversational agent didn't know the job_id (not in delivery)
and models don't reliably do the list→remove two-step without guidance.

1. Include job_id in the cron delivery wrapper so users and agents can
   reference it when requesting removal.

2. Replace confusing footer ('The agent cannot see this message') with
   actionable guidance ('To stop or manage this job, send me a new
   message').

3. Add explicit list→remove guidance in the cronjob tool schema so models
   know to list first and never guess job IDs.
2026-04-15 03:46:58 -07:00
Teknium
2546b7acea fix(gateway): suppress duplicate replies on interrupt and streaming flood control
Three fixes for the duplicate reply bug affecting all gateway platforms:

1. base.py: Suppress stale response when the session was interrupted by a
   new message that hasn't been consumed yet. Checks both interrupt_event
   and _pending_messages to avoid false positives. (#8221, #2483)

2. run.py (return path): Remove response_previewed guard from already_sent
   check. Stream consumer's already_sent alone is authoritative — if
   content was delivered via streaming, the duplicate send must be
   suppressed regardless of the agent's response_previewed flag. (#8375)

3. run.py (queued-message path): Same fix — already_sent without
   response_previewed now correctly marks the first response as already
   streamed, preventing re-send before processing the queued message.

The response_previewed field is still produced by the agent (run_agent.py)
but is no longer required as a gate for duplicate suppression. The stream
consumer's already_sent flag is the delivery-level truth about what the
user actually saw.

Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483.
2026-04-15 03:42:24 -07:00
Teknium
a4e1842f12
fix: strip reasoning item IDs from Responses API input when store=False (#10217)
With store=False (our default for the Responses API), the API does not
persist response items.  When reasoning items with 'id' fields were
replayed on subsequent turns, the API attempted a server-side lookup
for those IDs and returned 404:

  Item with id 'rs_...' not found. Items are not persisted when store
  is set to false.

The encrypted_content blob is self-contained for reasoning chain
continuity — the id field is unnecessary and triggers the failed lookup.

Fix: strip 'id' from reasoning items in both _chat_messages_to_responses_input
(message conversion) and _preflight_codex_input_items (normalization layer).
The id is still used for local deduplication but never sent to the API.

Reported by @zuogl448 on GPT-5.4.
2026-04-15 03:19:43 -07:00
Teknium
e69526be79
fix(send_message): URL-encode Matrix room IDs and add Matrix to schema examples (#10151)
Matrix room IDs contain ! and : which must be percent-encoded in URI
path segments per the Matrix C-S spec. Without encoding, some
homeservers reject the PUT request.

Also adds 'matrix:!roomid:server.org' and 'matrix:@user:server.org'
to the tool schema examples so models know the correct target format.
2026-04-15 00:10:59 -07:00
Teknium
180b14442f test: add _parse_target_ref Matrix coverage for salvaged PR #6144 2026-04-15 00:08:14 -07:00
Ubuntu
da8bab77fb fix(cli): restore messaging toolset for gateway platforms 2026-04-14 23:13:35 -07:00
Teknium
9932366f3c feat(doctor): add Command Installation check for hermes bin symlink
hermes doctor now checks whether the ~/.local/bin/hermes symlink exists
and points to the correct venv entry point. With --fix, it creates or
repairs the symlink automatically.

Covers:
- Missing symlink at ~/.local/bin/hermes (or $PREFIX/bin on Termux)
- Symlink pointing to wrong target
- Missing venv entry point (venv/bin/hermes or .venv/bin/hermes)
- PATH warning when ~/.local/bin is not on PATH
- Skipped on Windows (different mechanism)

Addresses user report: 'python -m hermes_cli.main doesn't have an option
to fix the local bin/install'

10 new tests covering all scenarios.
2026-04-14 23:13:11 -07:00
Teknium
029938fbed
fix(cli): defensive subparser routing for argparse bpo-9338 (#10113)
On some Python versions, argparse fails to route subcommand tokens when
the parent parser has nargs='?' optional arguments (--continue).  The
symptom: 'hermes model' produces 'unrecognized arguments: model' even
though 'model' is a registered subcommand.

Fix: when argv contains a token matching a known subcommand, set
subparsers.required=True to force deterministic routing.  If that fails
(e.g. 'hermes -c model' where 'model' is consumed as the session name
for --continue), fall back to the default optional-subparsers behaviour.

Adds 13 tests covering all key argument combinations.

Reported via user screenshot showing the exact error on an installed
version with the model subcommand listed in usage but rejected at parse
time.
2026-04-14 23:13:02 -07:00
Teknium
5d5d21556e
fix: sync client.api_key during UnicodeEncodeError ASCII recovery (#10090)
The existing recovery block sanitized self.api_key and
self._client_kwargs['api_key'] but did not update self.client.api_key.
The OpenAI SDK stores its own copy of api_key and reads it dynamically
via the auth_headers property on every request. Without this fix, the
retry after sanitization would still send the corrupted key in the
Authorization header, causing the same UnicodeEncodeError.

The bug manifests when an API key contains Unicode lookalike characters
(e.g. ʋ U+028B instead of v) from copy-pasting out of PDFs, rich-text
editors, or web pages with decorative fonts. httpx hard-encodes all
HTTP headers as ASCII, so the non-ASCII char in the Authorization
header triggers the error.

Adds TestApiKeyClientSync with two tests verifying:
- All three key locations are synced after sanitization
- Recovery handles client=None (pre-init) without crashing
2026-04-14 22:37:45 -07:00
Teknium
93fe4ead83
fix: warn on invalid context_length format in config.yaml (#10067)
Previously, non-integer context_length values (e.g. '256K') in
config.yaml were silently ignored, causing the agent to fall back
to 128K auto-detection with no user feedback. This was confusing
for users with custom LiteLLM endpoints expecting larger context.

Now prints a clear stderr warning and logs at WARNING level when
model.context_length or custom_providers[].models.<model>.context_length
cannot be parsed as an integer, telling users to use plain integers
(e.g. 256000 instead of '256K').

Reported by community user ChFarhan via Discord.
2026-04-14 22:14:27 -07:00
Teknium
a8b7db35b2
fix: interrupt agent immediately when user messages during active run (#10068)
When a user sends a message while the agent is executing a task on the
gateway, the agent is now interrupted immediately — not silently queued.
Previously, messages were stored in _pending_messages with zero feedback
to the user, potentially leaving them waiting 1+ hours.

Root cause: Level 1 guard (base.py) intercepted all messages for active
sessions and returned with no response. Level 2 (gateway/run.py) which
calls agent.interrupt() was never reached.

Fix: Expand _handle_active_session_busy_message to handle the normal
(non-draining) case:
  1. Call running_agent.interrupt(text) to abort in-flight tool calls
     and exit the agent loop at the next check point
  2. Store the message as pending so it becomes the next turn once the
     interrupted run returns
  3. Send a brief ack: 'Interrupting current task (10 min elapsed,
     iteration 21/60, running: terminal). I'll respond shortly.'
  4. Debounce acks to once per 30s to avoid spam on rapid messages

Reported by @Lonely__MH.
2026-04-14 22:07:28 -07:00
Brooklyn Nicholson
561cea0d4a Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 00:02:31 -05:00
Teknium
8548893d14
feat: entry-level Podman support — find_docker() + rootless entrypoint (#10066)
- find_docker() now checks HERMES_DOCKER_BINARY env var first, then
  docker on PATH, then podman on PATH, then macOS known locations
- Entrypoint respects HERMES_HOME env var (was hardcoded to /opt/data)
- Entrypoint uses groupmod -o to tolerate non-unique GIDs (fixes macOS
  GID 20 conflict with Debian's dialout group)
- Entrypoint makes chown best-effort so rootless Podman continues
  instead of failing with 'Operation not permitted'
- 5 new tests covering env var override, podman fallback, precedence

Based on work by alanjds (PR #3996) and malaiwah (PR #8115).
Closes #4084.
2026-04-14 21:20:37 -07:00
Teknium
c5688e7c8b fix(gateway): break compression-exhaustion infinite loop and auto-reset session (#9893)
When compression fails after max attempts, the agent returns
{completed: False, partial: True} but was missing the 'failed' flag.
The gateway's agent_failed_early guard checked for 'failed' AND
'not final_response', but _run_agent_blocking always converts errors
to final_response — making the guard dead code.  This caused the
oversized session to persist, creating an infinite fail loop where
every subsequent message hits the same compression failure.

Changes:
- run_agent.py: add 'failed: True' and 'compression_exhausted: True'
  to all 5 compression-exhaustion return paths
- gateway/run.py (_run_agent_blocking): forward 'failed' and
  'compression_exhausted' flags through to the caller
- gateway/run.py (_handle_message_with_agent): fix agent_failed_early
  to check bool(failed) without the broken 'not final_response' clause;
  auto-reset the session when compression is exhausted so the next
  message starts fresh
- Update tests to match new guard logic and add
  TestCompressionExhaustedFlag test class

Closes #9893
2026-04-14 21:18:17 -07:00
Greer Guthrie
4b2a1a4337 fix(tools): auto-discover built-in tool modules 2026-04-14 21:12:29 -07:00
Teknium
5cbb45d93e
fix: preserve session_id across previous_response_id chains in /v1/responses (#10059)
The /v1/responses endpoint generated a new UUID session_id for every
request, even when previous_response_id was provided. This caused each
turn of a multi-turn conversation to appear as a separate session on the
web dashboard, despite the conversation history being correctly chained.

Fix: store session_id alongside the response in the ResponseStore, and
reuse it when a subsequent request chains via previous_response_id.
Applies to both the non-streaming /v1/responses path and the streaming
SSE path. The /v1/runs endpoint also gains session continuity from
stored responses (explicit body.session_id still takes priority).

Adds test verifying session_id is preserved across chained requests.
2026-04-14 21:06:32 -07:00
Teknium
cf1d718823 fix: keep batch-path function_call_output.output as string per OpenAI spec
The streaming path emits output as content-part arrays for Open WebUI
compatibility, but the batch (non-streaming) Responses API path must
return output as a plain string per the OpenAI Responses API spec.
Reverts the _extract_output_items change from the cherry-picked commits
while preserving the streaming path's array format.
2026-04-14 20:51:52 -07:00
simon-marcus
302554b158 fix(api-server): format responses tool outputs for open webui 2026-04-14 20:51:52 -07:00
simon-marcus
d6c09ab94a feat(api-server): stream /v1/responses SSE tool events 2026-04-14 20:51:52 -07:00
Brooklyn Nicholson
496bfb3c59 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 22:30:22 -05:00
Teknium
da528a8207 fix: detect and strip non-ASCII characters from API keys (#6843)
API keys containing Unicode lookalike characters (e.g. ʋ U+028B instead
of v) cause UnicodeEncodeError when httpx encodes the Authorization
header as ASCII.  This commonly happens when users copy-paste keys from
PDFs, rich-text editors, or web pages with decorative fonts.

Three layers of defense:

1. **Save-time validation** (hermes_cli/config.py):
   _check_non_ascii_credential() strips non-ASCII from credential values
   when saving to .env, with a clear warning explaining the issue.

2. **Load-time sanitization** (hermes_cli/env_loader.py):
   _sanitize_loaded_credentials() strips non-ASCII from credential env
   vars (those ending in _API_KEY, _TOKEN, _SECRET, _KEY) after dotenv
   loads them, so the rest of the codebase never sees non-ASCII keys.

3. **Runtime recovery** (run_agent.py):
   The UnicodeEncodeError recovery block now also sanitizes self.api_key
   and self._client_kwargs['api_key'], fixing the gap where message/tool
   sanitization succeeded but the API key still caused httpx to fail on
   the Authorization header.

Also: hermes_logging.py RotatingFileHandler now explicitly sets
encoding='utf-8' instead of relying on locale default (defensive
hardening for ASCII-locale systems).
2026-04-14 20:20:31 -07:00
Brooklyn Nicholson
77cd5bf565 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 19:33:03 -05:00
Greer Guthrie
c10fea8d26 fix(mcp): make server aliases explicit 2026-04-14 17:19:20 -07:00
Greer Guthrie
cda64a5961 fix(mcp): resolve toolsets from live registry 2026-04-14 17:19:20 -07:00
Teknium
2a98098035 fix: hermes gateway restart waits for service to come back up (#8260)
Previously, systemd_restart() sent SIGUSR1 to the gateway, printed
'restart requested', and returned immediately. The gateway still
needed to drain active agents, exit with code 75, wait for systemd's
RestartSec=30, and start the new process. The user saw 'success' but
the gateway was actually down for 30-60 seconds.

Now the SIGUSR1 path blocks with progress feedback:

Phase 1 — wait for old process to die:
   User service draining active work...
  Polls os.kill(pid, 0) until ProcessLookupError (up to 90s)

Phase 2 — wait for new process to become active:
   Waiting for hermes-gateway to restart...
  Polls systemctl is-active + verifies new PID (up to 60s)

Success:
  ✓ User service restarted (PID 12345)

Timeout:
  ⚠ User service did not become active within 60s.
    Check status: hermes gateway status
    Check logs: journalctl --user -u hermes-gateway --since '2 min ago'

The reload-or-restart fallback path (line 1189) already blocks because
systemctl reload-or-restart is synchronous.

Test plan:
- Updated test to verify wait-for-restart behavior
- All 118 gateway CLI tests pass
2026-04-14 17:12:58 -07:00
Teknium
6c89306437 fix: break stuck session resume loops after repeated restarts (#7536)
When a session gets stuck (hung terminal, runaway tool loop) and the
user restarts the gateway, the same session history loads and puts the
agent right back in the stuck state. The user is trapped in a loop:
restart → stuck → restart → stuck.

Fix: track restart-failure counts per session using a simple JSON file
(.restart_failure_counts). On each shutdown with active agents, the
counter increments for those sessions. On startup, if any session has
been active across 3+ consecutive restarts, it's auto-suspended —
giving the user a clean slate on their next message.

The counter resets to 0 when a session completes a turn successfully
(response delivered), so normal sessions that happen to be active
during planned restarts (/restart, hermes update) won't accumulate
false counts.

Implementation:
- _increment_restart_failure_counts(): called during stop() when
  agents are active. Writes {session_key: count} to JSON file.
  Sessions NOT active are dropped (loop broken).
- _suspend_stuck_loop_sessions(): called on startup. Reads the file,
  suspends sessions at threshold (3), clears the file.
- _clear_restart_failure_count(): called after successful response
  delivery. Removes the session from the counter file.

No SessionEntry schema changes. No database migration. Pure file-based
tracking that naturally cleans up.

Test plan:
- 9 new stuck-loop tests (increment, accumulate, threshold, clear,
  suspend, file cleanup, edge cases)
- All 28 gateway lifecycle tests pass (restart drain + auto-continue
  + stuck loop)
2026-04-14 17:08:35 -07:00
Teknium
e7475b1582 feat: auto-continue interrupted agent work after gateway restart (#4493)
When the gateway restarts mid-agent-work, the session transcript ends
on a tool result the agent never processed. Previously, the user had
to type 'continue' or use /retry (which replays from scratch, losing
all prior work).

Now, when the next user message arrives and the loaded history ends
with role='tool', a system note is prepended:

  [System note: Your previous turn was interrupted before you could
  process the last tool result(s). Please finish processing those
  results and summarize what was accomplished, then address the
  user's new message below.]

This is injected in _run_agent()'s run_sync closure, right before
calling agent.run_conversation(). The agent sees the full history
(including the pending tool results) and the system note, so it can
summarize what was accomplished and then handle the user's new input.

Design decisions:
- No new session flags or schema changes — purely detects trailing
  tool messages in the loaded history
- Works for any restart scenario (clean, crash, SIGTERM, drain timeout)
  as long as the session wasn't suspended (suspended = fresh start)
- The user's actual message is preserved after the note
- If the session WAS suspended (unclean shutdown), the old history is
  abandoned and the user starts fresh — no false auto-continue

Also updates the shutdown notification message from 'Use /retry after
restart to continue' to 'Send any message after restart to resume
where it left off' — which is now accurate.

Test plan:
- 6 new auto-continue tests (trailing tool detection, no false
  positives for assistant/user/empty history, multi-tool, message
  preservation)
- All 13 restart drain tests pass (updated /retry assertion)
2026-04-14 16:56:49 -07:00
adybag14-cyber
56c34ac4f7 fix(browser): add termux PATH fallbacks
Refactor browser tool PATH construction to include Termux directories
(/data/data/com.termux/files/usr/bin, /data/data/com.termux/files/usr/sbin)
so agent-browser and npx are discoverable on Android/Termux.

Extracts _browser_candidate_path_dirs() and _merge_browser_path() helpers
to centralize PATH construction shared between _find_agent_browser() and
_run_browser_command(), replacing duplicated inline logic.

Also fixes os.pathsep usage (was hardcoded ':') for cross-platform correctness.

Cherry-picked from PR #9846.
2026-04-14 16:55:55 -07:00
areu01or00
cfa24532d3 fix(discord): register native /restart slash command 2026-04-14 16:55:48 -07:00
Teknium
10494b42a1
feat(discord): register skills under /skill command group with category subcommands (#9909)
Instead of consuming one top-level slash command slot per skill (hitting the
100-command limit with ~26 built-ins + 74 skills), skills are now organized
under a single /skill group command with category-based subcommand groups:

  /skill creative ascii-art [args]
  /skill media gif-search [args]
  /skill mlops axolotl [args]

Discord supports 25 subcommand groups × 25 subcommands = 625 max skills,
well beyond the previous 74-slot ceiling.

Categories are derived from the skill directory structure:
- skills/creative/ascii-art/ → category 'creative'
- skills/mlops/training/axolotl/ → category 'mlops' (top-level parent)
- skills/dogfood/ → uncategorized (direct subcommand)

Changes:
- hermes_cli/commands.py: add discord_skill_commands_by_category() with
  category grouping, hub/disabled filtering, Discord limit enforcement
- gateway/platforms/discord.py: replace top-level skill registration with
  _register_skill_group() using app_commands.Group hierarchy
- tests: 7 new tests covering group creation, category grouping,
  uncategorized skills, hub exclusion, deep nesting, empty skills,
  and handler dispatch

Inspired by Discord community suggestion from bottium.
2026-04-14 16:27:02 -07:00
Brooklyn Nicholson
bf54f1fb2f Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 18:26:05 -05:00
Teknium
1525624904 fix: block agent from self-destructing gateway via terminal (#6666)
Add dangerous command patterns that require approval when the agent
tries to run gateway lifecycle commands via the terminal tool:

- hermes gateway stop/restart — kills all running agents mid-work
- hermes update — pulls code and restarts the gateway
- systemctl restart/stop (with optional flags like --user)

These patterns fire the approval prompt so the user must explicitly
approve before the agent can kill its own gateway process. In YOLO
mode, the commands run without approval (by design — YOLO means the
user accepts all risks).

Also fixes the existing systemctl pattern to handle flags between
the command and action (e.g. 'systemctl --user restart' was previously
undetected because the regex expected the action immediately after
'systemctl').

Root cause: issue #6666 reported agents running 'hermes gateway
restart' via terminal, killing the gateway process mid-agent-loop.
The user sees the agent suddenly stop responding with no explanation.
Combined with the SIGTERM auto-recovery from PR #9875, the gateway
now both prevents accidental self-destruction AND recovers if it
happens anyway.

Test plan:
- Updated test_systemctl_restart_not_flagged → test_systemctl_restart_flagged
- All 119 approval tests pass
- E2E verified: hermes gateway restart, hermes update, systemctl
  --user restart all detected; hermes gateway status, systemctl
  status remain safe
2026-04-14 15:43:31 -07:00
Teknium
353b5bacbd test: add tests for /health/detailed endpoint and gateway health probe
- TestHealthDetailedEndpoint: 3 tests for the new API server endpoint
  (returns runtime data, handles missing status, no auth required)
- TestProbeGatewayHealth: 5 tests for _probe_gateway_health()
  (URL normalization, successful/failed probes, fallback chain)
- TestStatusRemoteGateway: 4 tests for /api/status remote fallback
  (remote probe triggers, skipped when local PID found, null PID handling)
2026-04-14 15:41:30 -07:00
Teknium
eed891f1bb
security: supply chain hardening — CI pinning, dep pinning, and code fixes (#9801)
CI/CD Hardening:
- Pin all 12 GitHub Actions to full commit SHAs (was mutable @vN tags)
- Add explicit permissions: {contents: read} to 4 workflows
- Pin CI pip installs to exact versions (pyyaml==6.0.2, httpx==0.28.1)
- Extend supply-chain-audit.yml to scan workflow, Dockerfile, dependency
  manifest, and Actions version changes

Dependency Pinning:
- Pin git-based Python deps to commit SHAs (atroposlib, tinker, yc-bench)
- Pin WhatsApp Baileys from mutable branch to commit SHA

Tool Registry:
- Reject tool name shadowing from different tool families (plugins/MCP
  cannot overwrite built-in tools). MCP-to-MCP overwrites still allowed.

MCP Security:
- Add tool description content scanning for prompt injection patterns
- Log detailed change diff on dynamic tool refresh at WARNING level

Skill Manager:
- Fix dangerous verdict bug: agent-created skills with dangerous
  findings were silently allowed (ask->None->allow). Now blocked.
2026-04-14 14:23:37 -07:00
Roy-oss1
1aa76620d4 fix(feishu): keep approval clicks synchronized with callback card state
Feishu approval clicks need the resolved card to come back from the
synchronous callback path itself. Leaving approval resolution to the
generic asynchronous card-action flow made button feedback depend on
later loop work instead of the callback response the client is waiting
for.

Change-Id: I574997cbbcaa097fdba759b47367e28d1b56b040
Constraint: Feishu card-action callbacks must acknowledge quickly and reflect final approval state from the callback response path
Rejected: Keep approval handling on the generic async card-action route | leaves card state synchronization vulnerable to callback timing and follow-up update ordering
Confidence: high
Scope-risk: narrow
Reversibility: clean
Directive: Keep approval callback response construction separate from async queue unblocking unless Feishu callback semantics change
Tested: pytest tests/gateway/test_feishu.py tests/gateway/test_feishu_approval_buttons.py tests/gateway/test_approve_deny_commands.py tests/gateway/test_slack_approval_buttons.py tests/gateway/test_telegram_approval_buttons.py -q
Not-tested: Live Feishu workspace end-to-end callback rendering
2026-04-14 14:22:11 -07:00
Teknium
fa8c448f7d fix: notify active sessions on gateway shutdown + update health check
Three fixes for gateway lifecycle stability:

1. Notify active sessions before shutdown (#new)
   When the gateway receives SIGTERM or /restart, it now sends a
   notification to every chat with an active agent BEFORE starting
   the drain. Users see:
   - Shutdown: 'Gateway shutting down — your task will be interrupted.'
   - Restart: 'Gateway restarting — use /retry after restart to continue.'
   Deduplicates per-chat so group sessions with multiple users get
   one notification. Best-effort: send failures are logged and swallowed.

2. Skip .clean_shutdown marker when drain timed out
   Previously, a graceful SIGTERM always wrote .clean_shutdown, even if
   agents were force-interrupted when the drain timed out. This meant
   the next startup skipped session suspension, leaving interrupted
   sessions in a broken state (trailing tool response, no final message).
   Now the marker is only written if the drain completed without timeout,
   so interrupted sessions get properly suspended on next startup.

3. Post-restart health check for hermes update (#6631)
   cmd_update() now verifies the gateway actually survived after
   systemctl restart (sleep 3s + is-active check). If the service
   crashed immediately, it retries once. If still dead, prints
   actionable diagnostics (journalctl command, manual restart hint).

Also closes #8104 — already fixed on main (the /restart handler
correctly detects systemd via INVOCATION_ID and uses via_service=True).

Test plan:
- 6 new tests for shutdown notifications (dedup, restart vs shutdown
  messaging, sentinel filtering, send failure resilience)
- Existing restart drain + update tests pass (47 total)
2026-04-14 14:21:57 -07:00
Teknium
a37a095980 fix: detect qwen-oauth provider via CLI tokens in /model picker
Seed qwen-oauth credentials from resolve_qwen_runtime_credentials() in
_seed_from_singletons(). Users who authenticate via 'qwen auth qwen-oauth'
store tokens in ~/.qwen/oauth_creds.json which the runtime resolver reads
but the credential pool couldn't detect — same gap pattern as copilot.

Uses refresh_if_expiring=False to avoid network calls during discovery.
2026-04-14 11:16:26 -07:00
Marvae
0bd3f521ae fix: detect copilot provider via gh auth token in /model picker
Seed copilot credentials from resolve_copilot_token() in the credential
pool's _seed_from_singletons(), alongside the existing anthropic and
openai-codex seeding logic. This makes copilot appear in the /model
provider picker when the user authenticates solely through gh auth token.

Cherry-picked from PR #9767 by Marvae.
2026-04-14 11:16:26 -07:00
Teknium
3e0bccc54c fix: update existing webhook tests to use _webhook_register_url
Follow-up for cherry-picked PR #9746 — three pre-existing tests used
adapter._webhook_url (bare URL) in mock data, but _register_webhook
and _unregister_webhook now compare against _webhook_register_url
(password-bearing URL). Updated to match.
2026-04-14 11:02:48 -07:00
cypres0099
326cbbe40e fix(gateway/bluebubbles): embed password in registered webhook URL for inbound auth
When BlueBubbles posts webhook events to the adapter, it uses the exact
URL registered via /api/v1/webhook — and BB's registration API does not
support custom headers. The adapter currently registers the bare URL
(no credentials), but then requires password auth on inbound POSTs,
rejecting every webhook with HTTP 401.

This is masked on fresh BB installs by a race condition: the webhook
might register once with a prior (possibly patched) URL and keep working
until the first restart. On v0.9.0, _unregister_webhook runs on clean
shutdown, so the next startup re-registers with the bare URL and the
401s begin. Users see the bot go silent with no obvious cause.

Root cause: there's no way to pass auth credentials from BB to the
webhook handler except via the URL itself. BB accepts query params and
preserves them on outbound POSTs.

## Fix

Introduce `_webhook_register_url` — the URL handed to BB's registration
API, with the configured password appended as a `?password=<value>`
query param. The existing webhook auth handler already accepts this
form (it reads `request.query.get("password")`), so no change to the
receive side is needed.

The bare `_webhook_url` is still used for logging and for binding the
local listener, so credentials don't leak into log output. Only the
registration/find/unregister paths use the password-bearing form.

## Notes

- Password is URL-encoded via urllib.parse.quote, handling special
  characters (&, *, @, etc.) that would otherwise break parsing.
- Storing the password in BB's webhook table is not a new disclosure:
  anyone with access to that table already has the BB admin password
  (same credential used for every other API call).
- If `self.password` is empty (no auth configured), the register URL
  is the bare URL — preserves current behavior for unauthenticated
  local-only setups.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:02:48 -07:00
cypres0099
8b52356849 fix(gateway/bluebubbles): fall back to data.chats[0].guid when chatGuid missing
BlueBubbles v1.9+ webhook payloads for new-message events do not always
include a top-level chatGuid field on the message data object. Instead,
the chat GUID is nested under data.chats[0].guid.

The adapter currently checks five top-level fallback locations (record and
payload, snake_case and camelCase, plus payload.guid) but never looks
inside the chats array. When none of those top-level fields contain the
GUID, the adapter falls through to using the sender's phone/email as the
session chat ID.

This causes two observable bugs when a user is a participant in both a DM
and a group chat with the bot:

1. DM and group sessions merge. Every message from that user ends up with
   the same session_chat_id (their own address), so the bot cannot
   distinguish which thread the message came from.

2. Outbound routing becomes ambiguous. _resolve_chat_guid() iterates all
   chats and returns the first one where the address appears as a
   participant; group chats typically sort ahead of DMs by activity, so
   replies and cron messages intended for the DM can land in a group.

This was observed in production: a user's morning brief cron delivered to
a group chat with his spouse instead of his DM thread.

The fix adds a single fallback that extracts chat_guid from
record["chats"][0]["guid"] when the top-level fields are empty. The chats
array is included in every new-message webhook payload in BB v1.9.9
(verified against a live server). It is backwards compatible: if a future
BB version starts including chatGuid at the top level, that still wins.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-14 11:02:48 -07:00
Teknium
99bcc2de5b
fix(security): harden dashboard API against unauthenticated access (#9800)
Addresses responsible disclosure from FuzzMind Security Lab (CVE pending).

The web dashboard API server had 36 endpoints, of which only 5 checked
the session token. The token itself was served from an unauthenticated
GET /api/auth/session-token endpoint, rendering the protection circular.
When bound to 0.0.0.0 (--host flag), all API keys, config, and cron
management were accessible to any machine on the network.

Changes:
- Add auth middleware requiring session token on ALL /api/ routes except
  a small public whitelist (status, config/defaults, config/schema,
  model/info)
- Remove GET /api/auth/session-token endpoint entirely; inject the token
  into index.html via a <script> tag at serve time instead
- Replace all inline token comparisons (!=) with hmac.compare_digest()
  to prevent timing side-channel attacks
- Block non-localhost binding by default; require --insecure flag to
  override (with warning log)
- Update frontend fetchJSON() to send Authorization header on all
  requests using the injected window.__HERMES_SESSION_TOKEN__

Credit: Callum (@0xca1x) and @migraine-sudo at FuzzMind Security Lab
2026-04-14 10:57:56 -07:00
asheriif
b583210c97 fix(gateway): fix regression causing display.streaming to override root streaming key 2026-04-14 10:52:23 -07:00
Teknium
90c98345c9 feat: gateway proxy mode — forward messages to remote API server
When GATEWAY_PROXY_URL (or gateway.proxy_url in config.yaml) is set,
the gateway becomes a thin relay: it handles platform I/O (encryption,
threading, media) and delegates all agent work to a remote Hermes API
server via POST /v1/chat/completions with SSE streaming.

This enables the primary use case of running a Matrix E2EE gateway in
Docker on Linux while the actual agent runs on the host (e.g. macOS)
with full access to local files, memory, skills, and a unified session
store. Works for any platform adapter, not just Matrix.

Configuration:
  - GATEWAY_PROXY_URL env var (Docker-friendly)
  - gateway.proxy_url in config.yaml
  - GATEWAY_PROXY_KEY env var for API auth (matches API_SERVER_KEY)
  - X-Hermes-Session-Id header for session continuity

Architecture:
  - _get_proxy_url() checks env var first, then config.yaml
  - _run_agent_via_proxy() handles HTTP forwarding with SSE streaming
  - _run_agent() delegates to proxy path when URL is configured
  - Platform streaming (GatewayStreamConsumer) works through proxy
  - Returns compatible result dict for session store recording

Files changed:
  - gateway/run.py: proxy mode implementation (~250 lines)
  - hermes_cli/config.py: GATEWAY_PROXY_URL + GATEWAY_PROXY_KEY env vars
  - tests/gateway/test_proxy_mode.py: 17 tests covering config
    resolution, dispatch, HTTP forwarding, error handling, message
    filtering, and result shape validation

Closes discussion from Cars29 re: Matrix gateway mixed-mode issue.
2026-04-14 10:49:48 -07:00
Disaster-Terminator
9bdfcd1b93 feat: sort tool search results by score and add corresponding unit test 2026-04-14 10:49:35 -07:00
Teknium
b867171291 fix: preserve profile name completion in dynamic shell completion
The dynamic parser walker from the contributor's commit lost the profile
name tab-completion that existed in the old static generators. This adds
it back for all three shells:

- Bash: _hermes_profiles() helper, -p/--profile completion, profile
  action→name completion (use/delete/show/alias/rename/export)
- Zsh: _hermes_profiles() function, -p/--profile argument spec, profile
  action case with name completion
- Fish: __hermes_profiles function, -s p -l profile flag, profile action
  completions

Also removes the dead fallback path in cmd_completion() that imported
the old static generators from profiles.py (parser is always available
via the lambda wiring) and adds 11 regression-prevention tests for
profile completion.
2026-04-14 10:45:42 -07:00
leozeli
a686dbdd26 feat(cli): add dynamic shell completion for bash, zsh, and fish
Replaces the hardcoded completion stubs in profiles.py with a dynamic
generator that walks the live argparse parser tree at runtime.

- New hermes_cli/completion.py: _walk() recursively extracts all
  subcommands and flags; generate_bash/zsh/fish() produce complete
  scripts with nested subcommand support
- cmd_completion now accepts the parser via closure so completions
  always reflect the actual registered commands (including plugin-
  registered ones like honcho)
- completion subcommand now accepts bash | zsh | fish (fish requested
  in issue comments)
- Fix _SUBCOMMANDS set: add honcho, claw, plugins, acp, webhook,
  memory, dump, debug, backup, import, completion, logs so that
  multi-word session names after -c/-r are not broken by these commands
- Add tests/hermes_cli/test_completion.py: 17 tests covering parser
  extraction, alias deduplication, bash/zsh/fish output content,
  bash syntax validation, fish syntax validation, and subcommand
  drift prevention

Tested on Linux (Arch). bash and fish completion verified live.
zsh script passes syntax check (zsh not installed on test machine).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:45:42 -07:00
N0nb0at
b21b3bfd68 feat(plugins): namespaced skill registration for plugin skill bundles
Add ctx.register_skill() API so plugins can ship SKILL.md files under
a 'plugin:skill' namespace, preventing name collisions with built-in
Hermes skills. skill_view() detects the ':' separator and routes to
the plugin registry while bare names continue through the existing
flat-tree scan unchanged.

Key additions:
- agent/skill_utils: parse_qualified_name(), is_valid_namespace()
- hermes_cli/plugins: PluginContext.register_skill(), PluginManager
  skill registry (find/list/remove)
- tools/skills_tool: qualified name dispatch in skill_view(),
  _serve_plugin_skill() with full guards (disabled, platform,
  injection scan), bundle context banner with sibling listing,
  stale registry self-heal
- Hoisted _INJECTION_PATTERNS to module level (dedup)
- Updated skill_view schema description

Based on PR #9334 by N0nb0at. Lean P1 salvage — omits autogen shim
(P2) for a simpler first merge.

Closes #8422
2026-04-14 10:42:58 -07:00
Dusk1e
4b47856f90 fix: load credentials from HERMES_HOME .env in trajectory_compressor 2026-04-14 10:24:19 -07:00
Teknium
8ea9ceb44c fix: guard reply_to_text against DeletedReferencedMessage
Use getattr() for resolved.content since discord.py's
DeletedReferencedMessage lacks a content attribute. Adds test
for the deleted-message edge case.
2026-04-14 10:22:11 -07:00
ChimingLiu
7636baf49c feat(discord): extract reply text from message references 2026-04-14 10:22:11 -07:00
Dusk1e
420d27098f fix(tools): keep memory tool available when fcntl is unavailable 2026-04-14 10:18:05 -07:00
Zhuofeng Wang
449c17e9a9 fix(gateway): support Telegram MarkdownV2 expandable blockquotes 2026-04-14 10:16:49 -07:00
shijianzhi
70611879de fix(cli): fix doctor checks for Kimi China credentials 2026-04-14 10:16:30 -07:00
Brooklyn Nicholson
9a3a2925ed feat: scroll aware sticky prompt 2026-04-14 11:49:32 -05:00
Teknium
7ad47ace51
fix: resolve remaining 4 CI test failures (#9543)
- test_auth_commands: suppress _seed_from_singletons auto-seeding that
  adds extra credentials from CI env (same pattern as nearby tests)
- test_interrupt: clear stale _interrupted_threads set to prevent
  thread ident reuse from prior tests in same xdist worker
- test_code_execution: add watch_patterns to _BLOCKED_TERMINAL_PARAMS
  to match production _TERMINAL_BLOCKED_PARAMS
2026-04-14 02:18:38 -07:00
Teknium
b4fcec6412
fix: prevent streaming cursor from appearing as standalone messages (#9538)
During rapid tool-calling, the model often emits 1-2 tokens before
switching to tool calls. The stream consumer would create a new message
with 'X ▉' (short text + cursor), and if the follow-up edit to strip
the cursor was rate-limited by the platform, the cursor remained as
a permanent standalone message — reported on Telegram as 'white box'
artifacts.

Add a minimum-content guard in _send_or_edit: when creating a new
standalone message (no existing message_id), require at least 4
visible characters alongside the cursor before sending. Shorter text
accumulates into the next streaming segment instead.

This prevents cursor-only 'tofu' messages across all platforms without
affecting normal streaming (edits to existing messages, final sends
without cursor, and messages with substantial text are all unaffected).

Reported by @michalkomar on X.
2026-04-14 01:52:42 -07:00
Teknium
2558d28a9b
fix: resolve CI test failures — add missing functions, fix stale tests (#9483)
Production fixes:
- Add clear_session_context() to hermes_logging.py (fixes 48 teardown errors)
- Add clear_session() to tools/approval.py (fixes 9 setup errors)
- Add SyncError M_UNKNOWN_TOKEN check to Matrix _sync_loop (bug fix)
- Fall back to inline api_key in named custom providers when key_env
  is absent (runtime_provider.py)

Test fixes:
- test_memory_user_id: use builtin+external provider pair, fix honcho
  peer_name override test to match production behavior
- test_display_config: remove TestHelpers for non-existent functions
- test_auxiliary_client: fix OAuth tokens to match _is_oauth_token
  patterns, replace get_vision_auxiliary_client with resolve_vision_provider_client
- test_cli_interrupt_subagent: add missing _execution_thread_id attr
- test_compress_focus: add model/provider/api_key/base_url/api_mode
  to mock compressor
- test_auth_provider_gate: add autouse fixture to clean Anthropic env
  vars that leak from CI secrets
- test_opencode_go_in_model_list: accept both 'built-in' and 'hermes'
  source (models.dev API unavailable in CI)
- test_email: verify email Platform enum membership instead of source
  inspection (build_channel_directory now uses dynamic enum loop)
- test_feishu: add bot_added/bot_deleted handler mocks to _Builder
- test_ws_auth_retry: add AsyncMock for sync_store.get_next_batch,
  add _pending_megolm and _joined_rooms to Matrix adapter mocks
- test_restart_drain: monkeypatch-delete INVOCATION_ID (systemd sets
  this in CI, changing the restart call signature)
- test_session_hygiene: add user_id to SessionSource
- test_session_env: use relative baseline for contextvar clear check
  (pytest-xdist workers share context)
2026-04-14 01:43:45 -07:00
Jiawen-lee
2cfd2dafc6 feat(gateway): add ignored_threads config for Telegram 2026-04-14 01:40:32 -07:00
Teknium
4654f75627 fix: QQBot missing integration points, timestamp parsing, test fix
- Add Platform.QQBOT to _UPDATE_ALLOWED_PLATFORMS (enables /update command)
- Add 'qqbot' to webhook cross-platform delivery routing
- Add 'qqbot' to hermes dump platform detection
- Fix test_name_property casing: 'QQBot' not 'QQBOT'
- Add _parse_qq_timestamp() for ISO 8601 + integer ms compatibility
  (QQ API changed timestamp format — from PR #2411 finding)
- Wire timestamp parsing into all 4 message handlers
2026-04-14 00:11:49 -07:00
walli
884cd920d4 feat(gateway): unify QQBot branding, add PLATFORM_HINTS, fix streaming, restore missing setup functions
- Rename platform from 'qq' to 'qqbot' across all integration points
  (Platform enum, toolset, config keys, import paths, file rename qq.py → qqbot.py)
- Add PLATFORM_HINTS for QQBot in prompt_builder (QQ supports markdown)
- Set SUPPORTS_MESSAGE_EDITING = False to skip streaming on QQ
  (prevents duplicate messages from non-editable partial + final sends)
- Add _send_qqbot() standalone send function for cron/send_message tool
- Add interactive _setup_qq() wizard in hermes_cli/setup.py
- Restore missing _setup_signal/email/sms/dingtalk/feishu/wecom/wecom_callback
  functions that were lost during the original merge
2026-04-14 00:11:49 -07:00
Junjun Zhang
87bfc28e70 feat: add QQ Bot platform adapter (Official API v2)
Add full QQ Bot integration via the Official QQ Bot API (v2):
- WebSocket gateway for inbound events (C2C, group, guild, DM)
- REST API for outbound text/markdown/media messages
- Voice transcription (Tencent ASR + configurable STT provider)
- Attachment processing (images, voice, files)
- User authorization (allowlist + allow-all + DM pairing)

Integration points:
- gateway: Platform.QQ enum, adapter factory, allowlist maps
- CLI: setup wizard, gateway config, status display, tools config
- tools: send_message cross-platform routing, toolsets
- cron: delivery platform support
- docs: QQ Bot setup guide
2026-04-14 00:11:49 -07:00
Greer Guthrie
c7e2fe655a fix: make tool registry reads thread-safe 2026-04-13 23:52:32 -07:00
Teknium
6dc8f8e9c0 feat(skin): add warm-lightmode skin from PR #4811
Add a second light-mode skin option with warm brown/parchment tones,
adapted from ygd58's contribution in PR #4811. Includes completion
menu and status bar color keys for full light-terminal support.

Co-authored-by: buray <78954051+ygd58@users.noreply.github.com>
2026-04-13 23:51:21 -07:00
Liu Chongwei
bc93641c4f feat(skins): add built-in daylight skin 2026-04-13 23:51:21 -07:00
Teknium
19199cd38d
fix: clamp 'minimal' reasoning effort to 'low' on Responses API (#9429)
GPT-5.4 supports none/low/medium/high/xhigh but not 'minimal'.
Users may configure 'minimal' via OpenRouter conventions, which would
cause a 400 on native OpenAI. Clamp to 'low' in the codex_responses
path before sending.
2026-04-13 23:11:13 -07:00
Teknium
38ad158b6b
fix: auto-correct close model name matches in /model validation (#9424)
* feat(skills): add fitness-nutrition skill to optional-skills

Cherry-picked from PR #9177 by @haileymarshall.

Adds a fitness and nutrition skill for gym-goers and health-conscious users:
- Exercise search via wger API (690+ exercises, free, no auth)
- Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback)
- Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %)
- Pure stdlib Python, no pip dependencies

Changes from original PR:
- Moved from skills/ to optional-skills/health/ (correct location)
- Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5)
- Fixed author attribution to match PR submitter
- Marked USDA_API_KEY as optional (DEMO_KEY works without signup)

Also adds optional env var support to the skill readiness checker:
- New 'optional: true' field in required_environment_variables entries
- Optional vars are preserved in metadata but don't block skill readiness
- Optional vars skip the CLI capture prompt flow
- Skills with only optional missing vars show as 'available' not 'setup_needed'

* fix: auto-correct close model name matches in /model validation

When a user types a model name with a minor typo (e.g. gpt5.3-codex instead
of gpt-5.3-codex), the validation now auto-corrects to the closest match
instead of accepting the wrong name with a warning.

Uses difflib get_close_matches with cutoff=0.9 to avoid false corrections
(e.g. gpt-5.3 should not silently become gpt-5.4). Applied consistently
across all three validation paths: codex provider, custom endpoints, and
generic API-probed providers.

The validate_requested_model() return dict gains an optional corrected_model
key that switch_model() applies before building the result.

Reported by Discord user — /model gpt5.3-codex was accepted with a warning
but would fail at the API level.

---------

Co-authored-by: haileymarshall <haileymarshall@users.noreply.github.com>
2026-04-13 23:09:39 -07:00
Teknium
d631431872
feat: prompt for display name when adding custom providers (#9420)
During custom endpoint setup, users are now asked for a display name
with the auto-generated name as the default. Typing 'Ollama' or
'LM Studio' replaces the generic 'Local (localhost:11434)' in the
provider menu.

Extracts _auto_provider_name() for reuse and adds a name= parameter
to _save_custom_provider() so the caller can pass through the
user-chosen label.
2026-04-13 22:41:00 -07:00
Kenny Xie
cdd44817f2 fix(anthropic): send fast mode speed via extra_body 2026-04-13 22:32:39 -07:00
Teknium
3de2b98503 fix(streaming): filter <think> blocks from gateway stream consumer
Models like MiniMax emit inline <think>...</think> reasoning blocks in
their content field. The CLI already suppresses these via a state machine
in _stream_delta, but the gateway's GatewayStreamConsumer had no
equivalent filtering — raw think blocks were streamed directly to
Discord/Telegram/Slack.

The fix adds a _filter_and_accumulate() method that mirrors the CLI's
approach: a state machine tracks whether we're inside a reasoning block
and silently discards the content. Includes the same block-boundary
check (tag must appear at line start or after whitespace-only prefix)
to avoid false positives when models mention <think> in prose.

Handles all tag variants: <think>, <thinking>, <THINKING>, <thought>,
<reasoning>, <REASONING_SCRATCHPAD>.

Also handles edge cases:
- Tags split across streaming deltas (partial tag buffering)
- Unclosed blocks (content suppressed until stream ends)
- Multiple consecutive blocks
- _flush_think_buffer on stream end for held-back partial tags

Adds 22 unit tests + 1 integration test covering all scenarios.
2026-04-13 22:16:20 -07:00
helix4u
e08590888a fix: honor interrupts during MCP tool waits 2026-04-13 22:14:55 -07:00
Teknium
62fb6b2cd8 fix: guard zero context length display + add 19 tests for model info
- ModelInfoCard: hide card when effective_context_length <= 0 instead
  of showing 'Context Window: 0 auto-detected'
- Add tests for _normalize_config_for_web model_context_length extraction
- Add tests for _denormalize_config_from_web round-trip (write back,
  remove on zero, upgrade bare string to dict, coerce string input)
- Add tests for CONFIG_SCHEMA ordering (model_context_length after model)
- Add tests for GET /api/model/info endpoint (dict config, bare string,
  empty model, capabilities, graceful error handling)
2026-04-13 22:04:35 -07:00
Gianfranco Piana
eabc0a2f66 feat(plugins): let pre_tool_call hooks block tool execution
Plugins can now return {"action": "block", "message": "reason"} from
their pre_tool_call hook to prevent a tool from executing. The error
message is returned to the model as a tool result so it can adjust.

Covers both execution paths: handle_function_call (model_tools.py) and
agent-level tools (run_agent.py _invoke_tool + sequential/concurrent).
Blocked tools skip all side effects (counter resets, checkpoints,
callbacks, read-loop tracker).

Adds skip_pre_tool_call_hook flag to avoid double-firing the hook when
run_agent.py already checked and then calls handle_function_call.

Salvaged from PR #5385 (gianfrancopiana) and PR #4610 (oredsecurity).
2026-04-13 22:01:49 -07:00
Teknium
5621fc449a
chore: rename AI Gateway → Vercel AI Gateway, move Xiaomi to #5 (#9326)
- Rename 'AI Gateway' to 'Vercel AI Gateway' across auth, models,
  doctor, setup, and tests.
- Move Xiaomi MiMo to position #5 in the provider picker.
2026-04-13 19:51:54 -07:00
Brooklyn Nicholson
6bbac046a7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 21:46:11 -05:00
Teknium
0cc7f79016
fix(streaming): prevent duplicate Telegram replies when stream task is cancelled (#9319)
When the 5-second stream_task timeout in gateway/run.py expires (due to
slow Telegram API calls from rate limiting after several messages), the
stream consumer is cancelled via asyncio.CancelledError. The
CancelledError handler did a best-effort final edit but never set
final_response_sent, so the gateway fell through to the normal send path
and delivered the full response again as a reply — causing a duplicate.

The fix: in the CancelledError handler, set final_response_sent = True
when already_sent is True (i.e., the stream consumer had already
delivered content to the user). This tells the gateway's already_sent
check that the response was delivered, preventing the duplicate send.

Adds two tests verifying the cancellation behavior:
- Cancelled with already_sent=True → final_response_sent=True (no dup)
- Cancelled with already_sent=False → final_response_sent=False (normal
  send path proceeds)

Reported by community user hume on Discord.
2026-04-13 19:22:43 -07:00
Brooklyn Nicholson
1b573b7b21 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 21:17:41 -05:00
Teknium
f324222b79
fix: add vLLM/local server error patterns + MCP initial connection retry (#9281)
Port two improvements inspired by Kilo-Org/kilocode analysis:

1. Error classifier: add context overflow patterns for vLLM, Ollama,
   and llama.cpp/llama-server. These local inference servers return
   different error formats than cloud providers (e.g., 'exceeds the
   max_model_len', 'context length exceeded', 'slot context'). Without
   these patterns, context overflow errors from local servers are
   misclassified as format errors, causing infinite retries instead
   of triggering compression.

2. MCP initial connection retry: previously, if the very first
   connection attempt to an MCP server failed (e.g., transient DNS
   blip at startup), the server was permanently marked as failed with
   no retry. Post-connect reconnection had 5 retries with exponential
   backoff, but initial connection had zero. Now initial connections
   retry up to 3 times with backoff before giving up, matching the
   resilience of post-connect reconnection.
   (Inspired by Kilo Code's MCP server disappearing fix in v1.3.3)

Tests: 6 new error classifier tests, 4 new MCP retry tests, 1
updated existing test. All 276 affected tests pass.
2026-04-13 18:46:14 -07:00
arthurbr11
0a4cf5b3e1 feat(providers): add Arcee AI as direct API provider
Adds Arcee AI as a standard direct provider (ARCEEAI_API_KEY) with
Trinity models: trinity-large-thinking, trinity-large-preview, trinity-mini.

Standard OpenAI-compatible provider checklist: auth.py, config.py,
models.py, main.py, providers.py, doctor.py, model_normalize.py,
model_metadata.py, setup.py, trajectory_compressor.py.

Based on PR #9274 by arthurbr11, simplified to a standard direct
provider without dual-endpoint OpenRouter routing.
2026-04-13 18:40:06 -07:00
Teknium
ac80bd61ad test: add regression tests for custom_providers multi-model dedup and grouping
Tests for salvaged PRs #9233 and #8011.
2026-04-13 16:41:30 -07:00
Brooklyn Nicholson
7e4dd6ea02 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 18:32:13 -05:00
Teknium
32cea0c08d
fix: dashboard shows Nous Portal as 'not connected' despite active auth (#9261)
The dashboard device-code flow (_nous_poller in web_server.py) saved
credentials to the credential pool only, while get_nous_auth_status()
only checked the auth store (auth.json). This caused the Keys tab to
show 'not connected' even when the backend was fully authenticated.

Two fixes:
1. get_nous_auth_status() now checks the credential pool first (like
   get_codex_auth_status() already does), then falls back to the auth
   store.
2. _nous_poller now also persists to the auth store after saving to
   the credential pool, matching the CLI flow (_login_nous).

Adds 3 tests covering pool-only, auth-store-fallback, and empty-state
scenarios.
2026-04-13 16:32:11 -07:00
Teknium
8d023e43ed
refactor: remove dead code — 1,784 lines across 77 files (#9180)
Deep scan with vulture, pyflakes, and manual cross-referencing identified:
- 41 dead functions/methods (zero callers in production)
- 7 production-dead functions (only test callers, tests deleted)
- 5 dead constants/variables
- ~35 unused imports across agent/, hermes_cli/, tools/, gateway/

Categories of dead code removed:
- Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection,
  rebuild_lookups, clear_session_context, get_logs_dir, clear_session
- Unused API surface: search_models_dev, get_pricing, skills_categories,
  get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list
- Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob
- Stale debug helpers: get_debug_session_info copies in 4 tool files
  (centralized version in debug_helpers.py already exists)
- Dead gateway methods: send_emote, send_notice (matrix), send_reaction
  (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history
  (matrix), _start_typing_indicator (signal), parse_feishu_post_content
- Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION,
  FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR
- Unused UI code: _interactive_provider_selection,
  _interactive_model_selection (superseded by prompt_toolkit picker)

Test suite verified: 609 tests covering affected files all pass.
Tests for removed functions deleted. Tests using removed utilities
(clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.
2026-04-13 16:32:04 -07:00
helix4u
f94f53cc22 fix(matrix): disable streaming cursor decoration on Matrix 2026-04-13 16:31:02 -07:00
helix4u
0ffb6f2dae fix(matrix): skip cursor-only stream placeholder messages 2026-04-13 16:31:02 -07:00
Brooklyn Nicholson
aeb53131f3 fix(ui-tui): harden TUI error handling, model validation, command UX parity, and gateway lifecycle 2026-04-13 18:29:24 -05:00
helix4u
8680f61f8b fix(copilot-acp): keep acp runtime off responses path 2026-04-13 16:17:43 -07:00
Teknium
063244bb16 test: add coverage for plugin context engine init (#9071)
Verify that plugin context engines receive update_model() with correct
context_length during AIAgent init — regression test for the ctx -- bug.
2026-04-13 15:00:57 -07:00
Teknium
952a885fbf
fix(gateway): /stop no longer resets the session (#9224)
/stop was calling suspend_session() which marked the session for auto-reset
on the next message. This meant users lost their conversation history every
time they stopped a running agent — especially painful for untitled sessions
that can't be resumed by name.

Now /stop just interrupts the agent and cleans the session lock. The session
stays intact so users can continue the conversation.

The suspend behavior was introduced in #7536 to break stuck session resume
loops on gateway restart. That case is already handled by
suspend_recently_active() which runs at gateway startup, so removing it from
/stop doesn't regress the original fix.
2026-04-13 14:59:05 -07:00
Brooklyn Nicholson
ebe3270430 fix: fake models 2026-04-13 14:57:42 -05:00
yongtenglei
2773b18b56 fix(run_agent): refresh activity during streaming responses
Previously, long-running streamed responses could be incorrectly treated
as idle by the gateway/cron inactivity timeout even while tokens were
actively arriving. The _touch_activity() call (which feeds
get_activity_summary() polled by the external timeout) was either called
only on the first chunk (chat completions) or not at all (Anthropic,
Codex, Codex fallback).

Add _touch_activity() on every chunk/event in all four streaming paths
so the inactivity monitor knows data is still flowing.

Fixes #8760
2026-04-13 10:55:51 -07:00
墨綠BG
c449cd1af5 fix(config): restore custom providers after v11→v12 migration
The v11→v12 migration converts custom_providers (list) into providers
(dict), then deletes the list. But all runtime resolvers read from
custom_providers — after migration, named custom endpoints silently stop
resolving and fallback chains fail with AuthError.

Add get_compatible_custom_providers() that reads from both config schemas
(legacy custom_providers list + v12+ providers dict), normalizes entries,
deduplicates, and returns a unified list. Update ALL consumers:

- hermes_cli/runtime_provider.py: _get_named_custom_provider() + key_env
- hermes_cli/auth_commands.py: credential pool provider names
- hermes_cli/main.py: model picker + _model_flow_named_custom()
- agent/auxiliary_client.py: key_env + custom_entry model fallback
- agent/credential_pool.py: _iter_custom_providers()
- cli.py + gateway/run.py: /model switch custom_providers passthrough
- run_agent.py + gateway/run.py: per-model context_length lookup

Also: use config.pop() instead of del for safer migration, fix stale
_config_version assertions in tests, add pool mock to codex test.

Co-authored-by: 墨綠BG <s5460703@gmail.com>
Closes #8776, salvaged from PR #8814
2026-04-13 10:50:52 -07:00
Teknium
0dd26c9495
fix(tests): fix 78 CI test failures and remove dead test (#9036)
Production fixes:
- voice_mode.py: add is_recording property to AudioRecorder (parity with TermuxAudioRecorder)
- cronjob_tools.py: add sms example to deliver description

Test fixes:
- test_real_interrupt_subagent: add missing _execution_thread_id (fixes 19 cascading failures from leaked _build_system_prompt patch)
- test_anthropic_error_handling: add _FakeMessages, override _interruptible_streaming_api_call (6 fixes)
- test_ctx_halving_fix: add missing request_overrides attribute (4 fixes)
- test_context_token_tracking: set _disable_streaming=True for non-streaming test path (4 fixes)
- test_dict_tool_call_args: set _disable_streaming=True (1 fix)
- test_provider_parity: add model='gpt-4o' for AIGateway tests to meet 64K minimum context (4 fixes)
- test_session_race_guard: add user_id to SessionSource (5 fixes)
- test_restart_drain/helpers: add user_id to SessionSource (2 fixes)
- test_telegram_photo_interrupts: add user_id to SessionSource
- test_interrupt: target thread_id for per-thread interrupt system (2 fixes)
- test_zombie_process_cleanup: rewrite with object.__new__ for refactored GatewayRunner.stop() (1 fix)
- test_browser_camofox_state: update config version 15->17 (1 fix)
- test_trajectory_compressor_async: widen lookback window 10->20 for line-shifted AsyncOpenAI (1 fix)
- test_voice_mode: fixed by production is_recording addition (5 fixes)
- test_voice_cli_integration: add _attached_images to CLI stub (2 fixes)
- test_hermes_logging: explicit propagation/level reset for cross-test pollution defense (1 fix)
- test_run_agent: add base_url for OpenRouter detection tests (2 fixes)

Deleted:
- test_inline_think_blocks_reasoning_only_accepted: tested unimplemented inline <think> handling
2026-04-13 10:50:24 -07:00
kimsr96
b909a9efef fix: extend ASCII-locale UnicodeEncodeError recovery to full request payload
The existing ASCII codec handler only sanitized conversation messages,
leaving tool schemas, system prompts, ephemeral prompts, prefill messages,
and HTTP headers as unhandled sources of non-ASCII content. On systems
with LANG=C or non-UTF-8 locale, Unicode symbols in tool descriptions
(e.g. arrows, em-dashes from prompt_builder) and system prompt content
would cause UnicodeEncodeError that fell through to the error path.

Changes:
- Add _sanitize_structure_non_ascii() generic recursive walker for
  nested dict/list payloads
- Add _sanitize_tools_non_ascii() thin wrapper for tool schemas
- Add _force_ascii_payload flag: once ASCII locale is detected, all
  subsequent API calls get proactively sanitized (prevents recurring
  failures from new tool results bringing fresh Unicode each turn)
- Extend the ASCII codec error handler to sanitize: prefill_messages,
  tool schemas (self.tools), system prompt, ephemeral system prompt,
  and default HTTP headers
- Update stale comment that acknowledged the gap

Cherry-picked from PR #8834 (credential pool changes dropped as
separate concern).
2026-04-13 05:16:35 -07:00
Geoff
76eecf3819 fix(model): Support providers: dict for custom endpoints in /model
Two fixes for user-defined providers in config.yaml:

1. list_authenticated_providers() - now includes full models list from
   providers.*.models array, not just default_model. This fixes /model
   showing only one model when multiple are configured.

2. _get_named_custom_provider() - now checks providers: dict (new-style)
   in addition to custom_providers: list (legacy). This fixes credential
   resolution errors when switching models via /model command.

Both changes are backwards compatible with existing custom_providers list format.

Fixes: Only one model appears for custom providers in /model selection
2026-04-13 05:16:21 -07:00
konsisumer
311dac1971 fix(file_tools): block /private/etc writes on macOS symlink bypass
On macOS, /etc is a symlink to /private/etc, so os.path.realpath()
resolves /etc/hosts to /private/etc/hosts. The sensitive path check
only matched /etc/ prefixes against the resolved path, allowing
writes to system files on macOS.

- Add /private/etc/ and /private/var/ to _SENSITIVE_PATH_PREFIXES
- Check both realpath-resolved and normpath-normalized paths
- Add regression tests for macOS symlink bypass

Closes #8734
Co-authored-by: ElhamDevelopmentStudio (PR #8829)
2026-04-13 05:15:05 -07:00
Teknium
587eeb56b9 chore: remove duplicate dead _try_gh_cli_token / _gh_cli_candidates from auth.py
These functions were duplicated between auth.py and copilot_auth.py.
The auth.py copies had zero production callers — only copilot_auth.py's
versions are used. Redirect the test import to the live copy and update
monkeypatch targets accordingly.
2026-04-13 05:12:36 -07:00
luyao618
8ec1608642 fix(agent): propagate api_mode to vision provider resolution
resolve_vision_provider_client() computed resolved_api_mode from config
but never passed it to downstream resolve_provider_client() or
_get_cached_client() calls, causing custom providers with
api_mode: anthropic_messages to crash when used for vision tasks.

Also remove the for_vision special case in _normalize_aux_provider()
that incorrectly discarded named custom provider identifiers.

Fixes #8857

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-13 05:02:54 -07:00
Teknium
e3ffe5b75f
fix: remove legacy compression.summary_* config and env var fallbacks (#8992)
Remove the backward-compat code paths that read compression provider/model
settings from legacy config keys and env vars, which caused silent failures
when auto-detection resolved to incompatible backends.

What changed:
- Remove compression.summary_model, summary_provider, summary_base_url from
  DEFAULT_CONFIG and cli.py defaults
- Remove backward-compat block in _resolve_task_provider_model() that read
  from the legacy compression section
- Remove _get_auxiliary_provider() and _get_auxiliary_env_override() helper
  functions (AUXILIARY_*/CONTEXT_* env var readers)
- Remove env var fallback chain for per-task overrides
- Update hermes config show to read from auxiliary.compression
- Add config migration (v16→17) that moves non-empty legacy values to
  auxiliary.compression and strips the old keys
- Update example config and openclaw migration script
- Remove/update tests for deleted code paths

Compression model/provider is now configured exclusively via:
  auxiliary.compression.provider / auxiliary.compression.model

Closes #8923
2026-04-13 04:59:26 -07:00
WorldInnovationsDepartment
c1809e85e7 fix(gateway): handle stale lock files in acquire_scoped_lock
Updated the acquire_scoped_lock function to treat empty or corrupt lock files as stale. This change ensures that if a lock file exists but is invalid, it will be removed to prevent issues with stale locks. Added tests to verify recovery from both empty and corrupt lock files.
2026-04-13 04:59:25 -07:00
Teknium
a5bd56eae3
fix: eliminate provider hang dead zones in retry/timeout architecture (#8985)
Three targeted changes to close the gaps between retry layers that
caused users to experience 'No response from provider for 580s' and
'No activity for 15 minutes' despite having 5 layers of retry:

1. Remove non-streaming fallback from streaming path

   Previously, when all 3 stream retries exhausted, the code fell back
   to _interruptible_api_call() which had no stale detection and no
   activity tracking — a black hole that could hang for up to 1800s.
   Now errors propagate to the main retry loop which has richer recovery
   (credential rotation, provider fallback, backoff).

   For 'stream not supported' errors, sets _disable_streaming flag so
   the main retry loop automatically switches to non-streaming on the
   next attempt.

2. Add _touch_activity to recovery dead zones

   The gateway inactivity monitor relies on _touch_activity() to know
   the agent is alive, but activity was never touched during:
   - Stale stream detection/kill cycles (180-300s gaps)
   - Stream retry connection rebuilds
   - Main retry backoff sleeps (up to 120s)
   - Error recovery classification

   Now all these paths touch activity every ~30s, keeping the gateway
   informed during recovery cycles.

3. Add stale-call detector to non-streaming path

   _interruptible_api_call() now has the same stale detection pattern
   as the streaming path: kills hung connections after 300s (default,
   configurable via HERMES_API_CALL_STALE_TIMEOUT), scaled for large
   contexts (450s for 50K+ tokens, 600s for 100K+ tokens), disabled
   for local providers.

   Also touches activity every ~30s during the wait so the gateway
   monitor stays informed.

Env vars:
- HERMES_API_CALL_STALE_TIMEOUT: non-streaming stale timeout (default 300s)
- HERMES_STREAM_STALE_TIMEOUT: unchanged (default 180s)

Before: worst case ~2+ hours of sequential retries with no feedback
After: worst case bounded by gateway inactivity timeout (default 1800s)
with continuous activity reporting
2026-04-13 04:55:20 -07:00
Teknium
acdff020b7 test: add multi-word query tests for truncation match strategy
Tests phrase matching, proximity co-occurrence, and sliding window
coverage maximisation — the three new tiers from the truncation fix.
2026-04-13 04:54:42 -07:00
landy
dbed40f39b fix: reopen resumed gateway sessions in sqlite 2026-04-13 04:54:07 -07:00
twilwa
3a64348772 fix(discord): voice session continuity and signal handler thread safety
- Store source metadata on /voice channel join so voice input shares the
  same session as the linked text channel conversation
- Treat voice-linked text channels as free-response (skip @mention and
  auto-thread) while voice is active
- Scope the voice-linked exemption to the exact bound channel, not
  sibling threads
- Guard signal handler registration in start_gateway() for non-main
  threads (prevents RuntimeError when gateway runs in a daemon thread)
- Clean up _voice_sources on leave_voice_channel

Salvaged from PR #3475 by twilwa (Modal runtime portions excluded).
2026-04-13 04:49:21 -07:00
Teknium
381810ad50
feat: fix SQLite safety in hermes backup + add --quick snapshots + /snapshot command (#8971)
Three changes consolidated into the existing backup system:

1. Fix: hermes backup now uses sqlite3.Connection.backup() for .db files
   instead of raw file copy. Raw copy of a WAL-mode database can produce
   a corrupted backup — the backup() API handles this correctly.

2. hermes backup --quick: fast snapshot of just critical state files
   (config.yaml, state.db, .env, auth.json, cron/jobs.json, etc.)
   stored in ~/.hermes/state-snapshots/. Auto-prunes to 20 snapshots.

3. /snapshot slash command (alias /snap): in-session interface for
   quick state snapshots. create/list/restore/prune subcommands.
   Restore by ID or number. Powered by the same backup module.

No new modules — everything lives in hermes_cli/backup.py alongside
the existing full backup/import code.

No hooks in run_agent.py — purely on-demand, zero runtime overhead.

Closes the use case from PRs #8406 and #7813 with ~200 lines of new
logic instead of a 1090-line content-addressed storage engine.
2026-04-13 04:46:13 -07:00
Teknium
8dfee98d06 fix: clean up description escaping, add string-data tests
Follow-up for cherry-picked PR #8918.
2026-04-13 04:45:07 -07:00
Teknium
cea34dc7ef fix: follow-up for salvaged PR #8939
- Move test file to tests/hermes_cli/ (consistent with test layout)
- Remove unused imports (os, pytest) from test file
- Update _sanitize_env_lines docstring: now used on read + write paths
2026-04-13 04:35:37 -07:00
Mil Wang (from Dev Box)
e469f3f3db fix: sanitize .env before loading to prevent token duplication (#8908)
When .env files become corrupted (e.g. concatenated KEY=VALUE pairs on
a single line due to concurrent writes or encoding issues), both
python-dotenv and load_env() would parse the entire concatenated string
as a single value. This caused bot tokens to appear duplicated up to 8×,
triggering InvalidToken errors from the Telegram API.

Root cause: _sanitize_env_lines() — which correctly splits concatenated
lines — was only called during save_env_value() writes, not during reads.

Fix:
- load_env() now calls _sanitize_env_lines() before parsing
- env_loader.load_hermes_dotenv() sanitizes the .env file on disk
  before python-dotenv reads it, so os.getenv() also returns clean values
- Added tests reproducing the exact corruption pattern from #8908

Closes #8908
2026-04-13 04:35:37 -07:00
ismell0992-afk
e77f135ed8 fix(cli): narrow Nous Hermes non-agentic warning to actual hermes-3/-4 models
The startup warning that Nous Research Hermes 3 & 4 models are not agentic
fired on any model whose name contained "hermes" anywhere, via a plain
substring check. That false-positived on unrelated local Modelfiles such
as `hermes-brain:qwen3-14b-ctx16k` — a tool-capable Qwen3 wrapper that
happens to live under a custom "hermes" tag namespace — making the warning
noise for legitimate setups.

Replace the substring check with a narrow regex anchored on `^`, `/`, or
`:` boundaries that only matches the real Hermes-3 / Hermes-4 chat family
(e.g. `NousResearch/Hermes-3-Llama-3.1-70B`, `hermes-4-405b`,
`openrouter/hermes3:70b`). Consolidate into a single helper
`is_nous_hermes_non_agentic()` in `hermes_cli.model_switch` so the CLI
and the canonical check don't drift, and route the duplicate inline site
in `cli.HermesCLI._print_warnings()` through the helper.

Add a parametrized test covering positive matches (real Hermes-3/-4
names) and a broad set of negatives (custom Modelfiles, Qwen/Claude/GPT,
older Nous-Hermes-2 families, bare "hermes", empty string, and the
"brain-hermes-3-impostor" boundary case).
2026-04-13 04:33:52 -07:00
ismell0992-afk
3e99964789 fix(agent): prefer Ollama Modelfile num_ctx over GGUF training max
_query_local_context_length was checking model_info.context_length
(the GGUF training max) before num_ctx (the Modelfile runtime override),
inverse to query_ollama_num_ctx. The two helpers therefore disagreed on
the same model:

  hermes-brain:qwen3-14b-ctx32k     # Modelfile: num_ctx 32768
  underlying qwen3:14b GGUF         # qwen3.context_length: 40960

query_ollama_num_ctx correctly returned 32768 (the value Ollama will
actually allocate KV cache for). _query_local_context_length returned
40960, which let ContextCompressor grow conversations past 32768 before
triggering compression — at which point Ollama silently truncated the
prefix, corrupting context.

Swap the order so num_ctx is checked first, matching query_ollama_num_ctx.
Adds a parametrized test that seeds both values and asserts num_ctx wins.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-13 04:24:07 -07:00
Teknium
397eae5d93 fix: recover partial streamed content on connection failure
When streaming fails after partial content delivery (e.g. OpenRouter
timeout kills connection mid-response), the stub response now carries
the accumulated streamed text instead of content=None.

Two fixes:
1. The partial-stream stub response includes recovered content from
   _current_streamed_assistant_text — the text that was already
   delivered to the user via stream callbacks before the connection
   died.

2. The empty response recovery chain now checks for partial stream
   content BEFORE falling back to _last_content_with_tools (prior
   turn content) or wasting API calls on retries. This prevents:
   - Showing wrong content from a prior turn
   - Burning 3+ unnecessary retry API calls
   - Falling through to '(empty)' when the user already saw content

The root cause: OpenRouter has a ~125s inactivity timeout. When
Anthropic's SSE stream goes silent during extended reasoning, the
proxy kills the connection. The model's text was already partially
streamed but the stub discarded it, triggering the empty recovery
chain which would show stale prior-turn content or waste retries.
2026-04-13 02:12:01 -07:00
Teknium
276d20e62c
fix(gateway): /restart uses service restart under systemd instead of detached subprocess
The detached bash subprocess spawned by /restart gets killed by
systemd's KillMode=mixed cgroup cleanup, leaving the gateway dead.

Under systemd (detected via INVOCATION_ID env var), /restart now uses
via_service=True which exits with code 75 — RestartForceExitStatus=75
in the unit file makes systemd auto-restart the service. The detached
subprocess approach is preserved as fallback for non-systemd
environments (Docker, tmux, foreground mode).
2026-04-12 22:32:19 -07:00
Teknium
e2a9b5369f
feat: web UI dashboard for managing Hermes Agent (#8756)
* feat: web UI dashboard for managing Hermes Agent (salvage of #8204/#7621)

Adds an embedded web UI dashboard accessible via `hermes web`:
- Status page: agent version, active sessions, gateway status, connected platforms
- Config editor: schema-driven form with tabbed categories, import/export, reset
- API Keys page: set, clear, and view redacted values with category grouping
- Sessions, Skills, Cron, Logs, and Analytics pages

Backend:
- hermes_cli/web_server.py: FastAPI server with REST endpoints
- hermes_cli/config.py: reload_env() utility for hot-reloading .env
- hermes_cli/main.py: `hermes web` subcommand (--port, --host, --no-open)
- cli.py / commands.py: /reload slash command for .env hot-reload
- pyproject.toml: [web] optional dependency extra (fastapi + uvicorn)
- Both update paths (git + zip) auto-build web frontend when npm available

Frontend:
- Vite + React + TypeScript + Tailwind v4 SPA in web/
- shadcn/ui-style components, Nous design language
- Auto-refresh status page, toast notifications, masked password inputs

Security:
- Path traversal guard (resolve().is_relative_to()) on SPA file serving
- CORS localhost-only via allow_origin_regex
- Generic error messages (no internal leak), SessionDB handles closed properly

Tests: 47 tests covering reload_env, redact_key, API endpoints, schema
generation, path traversal, category merging, internal key stripping,
and full config round-trip.

Original work by @austinpickett (PR #1813), salvaged by @kshitijk4poor
(PR #7621#8204), re-salvaged onto current main with stale-branch
regressions removed.

* fix(web): clean up status page cards, always rebuild on `hermes web`

- Remove config version migration alert banner from status page
- Remove config version card (internal noise, not surfaced in TUI)
- Reorder status cards: Agent → Gateway → Active Sessions (3-col grid)
- `hermes web` now always rebuilds from source before serving,
  preventing stale web_dist when editing frontend files

* feat(web): full-text search across session messages

- Add GET /api/sessions/search endpoint backed by FTS5
- Auto-append prefix wildcards so partial words match (e.g. 'nimb' → 'nimby')
- Debounced search (300ms) with spinner in the search icon slot
- Search results show FTS5 snippets with highlighted match delimiters
- Expanding a search hit auto-scrolls to the first matching message
- Matching messages get a warning ring + 'match' badge
- Inline term highlighting within Markdown (text, bold, italic, headings, lists)
- Clear button (x) on search input for quick reset

---------

Co-authored-by: emozilla <emozilla@nousresearch.com>
2026-04-12 22:26:28 -07:00
Dusk1e
c052cf0eea fix(security): validate domain/service params in ha_call_service to prevent path traversal 2026-04-12 22:26:15 -07:00
Teknium
8a64f3e368 feat(gateway): notify /restart requester when gateway comes back online
When a user sends /restart, the gateway now persists their routing info
(platform, chat_id, thread_id) to .restart_notify.json. After the new
gateway process starts and adapters connect, it reads the file, sends a
'Gateway restarted successfully' message to that specific chat, and
cleans up the file.

This follows the same pattern as _send_update_notification (used by
/update). Thread IDs are preserved so the notification lands in the
correct Telegram topic or Discord thread.

Previously, after /restart the user had no feedback that the gateway was
back — they had to send a message to find out. Now they get a proactive
notification and know their session continues.
2026-04-12 22:23:48 -07:00
Teknium
83ca0844f7
fix: preserve dots in model names for OpenCode Zen and ZAI providers (#8794)
OpenCode Zen was in _DOT_TO_HYPHEN_PROVIDERS, causing all dotted model
names (minimax-m2.5-free, gpt-5.4, glm-5.1) to be mangled. The fix:

Layer 1 (model_normalize.py): Remove opencode-zen from the blanket
dot-to-hyphen set. Add an explicit block that preserves dots for
non-Claude models while keeping Claude hyphenated (Zen's Claude
endpoint uses anthropic_messages mode which expects hyphens).

Layer 2 (run_agent.py _anthropic_preserve_dots): Add opencode-zen and
zai to the provider allowlist. Broaden URL check from opencode.ai/zen/go
to opencode.ai/zen/ to cover both Go and Zen endpoints. Add bigmodel.cn
for ZAI URL detection.

Also adds glm-5.1 to ZAI model lists in models.py and setup.py.

Closes #7710

Salvaged from contributions by:
- konsisumer (PR #7739, #7719)
- DomGrieco (PR #8708)
- Esashiero (PR #7296)
- sharziki (PR #7497)
- XiaoYingGee (PR #8750)
- APTX4869-maker (PR #8752)
- kagura-agent (PR #7157)
2026-04-12 21:22:59 -07:00
Teknium
a0cd2c5338
fix(gateway): verbose tool progress no longer truncates args when tool_preview_length is 0 (#8735)
When tool_preview_length is 0 (default for platforms without a tier
default, like Session), verbose mode was truncating args JSON to 200
characters.  Since the user explicitly opted into verbose mode, they
expect full tool call detail — the 200-char cap defeated the purpose.

Now: tool_preview_length=0 means no truncation in verbose mode.
Positive values still cap as before.  Platform message-length limits
handle overflow naturally.
2026-04-12 20:05:12 -07:00
Teknium
15b1a3aa69
fix: improve WhatsApp UX — chunking, formatting, streaming (#8723)
Three changes that address the poor WhatsApp experience reported by users:

1. Reclassify WhatsApp from TIER_LOW to TIER_MEDIUM in display_config.py
   — enables streaming and tool progress via the existing Baileys /edit
   bridge endpoint. Users now see progressive responses instead of
   minutes of silence followed by a wall of text.

2. Lower MAX_MESSAGE_LENGTH from 65536 to 4096 and add proper chunking
   — send() now calls format_message() and truncate_message() before
   sending, then loops through chunks with a small delay between them.
   The base class truncate_message() already handles code block boundary
   detection (closes/reopens fences at chunk boundaries). reply_to is
   only set on the first chunk.

3. Override format_message() with WhatsApp-specific markdown conversion
   — converts **bold** to *bold*, ~~strike~~ to ~strike~, headers to
   bold text, and [links](url) to text (url). Code blocks and inline
   code are protected from conversion via placeholder substitution.

Together these fix the two user complaints:
- 'sends the whole code all the time' → now chunked at 4K with proper
  formatting
- 'terminal gets interrupted and gets cooked' → streaming + tool progress
  give visual feedback so users don't accidentally interrupt with
  follow-up messages
2026-04-12 19:20:13 -07:00
Teknium
5fae356a85
fix: show full last assistant response when resuming a session (#8724)
When resuming a session with --resume or -c, the last assistant response
was truncated to 200 chars / 3 lines just like older messages in the recap.
This forced users to waste tokens re-asking for the response.

Now the last assistant message in the recap is shown in full with non-dim
styling, so users can see exactly where they left off. Earlier messages
remain truncated for compact display.

Changes:
- Track un-truncated text for the last assistant entry during collection
- Replace last entry with full text after history trimming
- Render last assistant entry with bold (non-dim) styling
- Update existing truncation tests to use multi-message histories
- Add new tests for full last response display (char + multiline)
2026-04-12 19:07:14 -07:00
Teknium
9e992df8ae
fix(telegram): use UTF-16 code units for message length splitting (#8725)
Port from nearai/ironclaw#2304: Telegram's 4096 character limit is
measured in UTF-16 code units, not Unicode codepoints. Characters
outside the Basic Multilingual Plane (emoji like 😀, CJK Extension B,
musical symbols) are surrogate pairs: 1 Python char but 2 UTF-16 units.

Previously, truncate_message() used Python's len() which counts
codepoints. This could produce chunks exceeding Telegram's actual limit
when messages contain many astral-plane characters.

Changes:
- Add utf16_len() helper and _prefix_within_utf16_limit() for
  UTF-16-aware string measurement and truncation
- Add _custom_unit_to_cp() binary-search helper that maps a custom-unit
  budget to the largest safe codepoint slice position
- Update truncate_message() to accept optional len_fn parameter
- Telegram adapter now passes len_fn=utf16_len when splitting messages
- Fix fallback truncation in Telegram error handler to use
  _prefix_within_utf16_limit instead of codepoint slicing
- Update send_message_tool.py to use utf16_len for Telegram platform
- Add comprehensive tests: utf16_len, _prefix_within_utf16_limit,
  truncate_message with len_fn (emoji splitting, content preservation,
  code block handling)
- Update mock lambdas in reply_mode tests to accept **kw for len_fn
2026-04-12 19:06:20 -07:00
Teknium
f724079d3b fix(gateway): reject known-weak placeholder credentials at startup
Port from openclaw/openclaw#64586: users who copy .env.example without
changing placeholder values now get a clear error at startup instead of
a confusing auth failure from the platform API. Also rejects placeholder
API_SERVER_KEY when binding to a network-accessible address.

Cherry-picked from PR #8677.
2026-04-12 18:05:41 -07:00
Teknium
c7d8d109ff fix(matrix): trust m.mentions.user_ids as authoritative mention signal
Port from openclaw/openclaw#64796: Per MSC3952 / Matrix v1.7, the
m.mentions.user_ids field is the authoritative mention signal. Clients
that populate m.mentions but don't duplicate @bot in the body text
were being silently dropped when MATRIX_REQUIRE_MENTION=true.

Cherry-picked from PR #8673.
2026-04-12 18:05:41 -07:00
Teknium
88a12af58c
feat: add hermes debug share — upload debug report to pastebin (#8681)
* feat: add `hermes debug share` — upload debug report to pastebin

Adds a new `hermes debug share` command that collects system info
(via hermes dump), recent logs (agent.log, errors.log, gateway.log),
and uploads the combined report to a paste service (paste.rs primary,
dpaste.com fallback). Returns a shareable URL for support.

Options:
  --lines N    Number of log lines per file (default: 200)
  --expire N   Paste expiry in days (default: 7, dpaste.com only)
  --local      Print report locally without uploading

Files:
  hermes_cli/debug.py           - New module: paste upload + report collection
  hermes_cli/main.py            - Wire cmd_debug + argparse subparser
  tests/hermes_cli/test_debug.py - 19 tests covering upload, collection, CLI

* feat: upload full agent.log and gateway.log as separate pastes

hermes debug share now uploads up to 3 pastes:
  1. Summary report (system info + log tails) — always
  2. Full agent.log (last ~500KB) — if file exists
  3. Full gateway.log (last ~500KB) — if file exists

Each paste uploads independently; log upload failures are noted
but don't block the main report. Output shows all links aligned:

  Report     https://paste.rs/abc
  agent.log  https://paste.rs/def
  gateway.log https://paste.rs/ghi

Also adds _read_full_log() with size-capped tail reading to stay
within paste service limits (~512KB per file).

* feat: prepend hermes dump to each log paste for self-contained context

Each paste (agent.log, gateway.log) now starts with the hermes dump
output so clicking any single link gives full system context without
needing to cross-reference the summary report.

Refactored dump capture into _capture_dump() — called once and
reused across the summary report and each log paste.

* fix: fall back to .1 rotated log when primary log is missing or empty

When gateway.log (or agent.log) doesn't exist or is empty, the debug
share now checks for the .1 rotation file. This is common — the
gateway rotates logs and the primary file may not exist yet.

Extracted _resolve_log_path() to centralize the fallback logic for
both _read_log_tail() and _read_full_log().

* chore: remove unused display_hermes_home import
2026-04-12 18:05:14 -07:00
Teknium
bcad679799 fix(api_server): normalize array-based content parts in chat completions
Some OpenAI-compatible clients (Open WebUI, LobeChat, etc.) send
message content as an array of typed parts instead of a plain string:

    [{"type": "text", "text": "hello"}]

The agent pipeline expects strings, so these array payloads caused
silent failures or empty messages.

Add _normalize_chat_content() with defensive limits (recursion depth,
list size, output length) and apply it to both the Chat Completions
and Responses API endpoints. The Responses path had inline
normalization that only handled input_text/output_text — the shared
function also handles the standard 'text' type.

Salvaged from PR #7980 (ikelvingo) — only the content normalization;
the SSE and Weixin changes in that PR were regressions and are not
included.

Co-authored-by: ikelvingo <ikelvingo@users.noreply.github.com>
2026-04-12 18:03:16 -07:00
Teknium
bc4e2744c3 test: add tests for compression config_context_length passthrough
- Test that auxiliary.compression.context_length from config is forwarded
  to get_model_context_length (positive case)
- Test that invalid/non-integer config values are silently ignored
- Fix _make_agent() to set config=None (cherry-picked code reads self.config)
2026-04-12 17:52:34 -07:00
Teknium
0d0d27d45e test(tts): add speed config tests for Edge, OpenAI, and MiniMax
12 tests covering:
- Provider-specific speed overrides global speed
- Global speed used as fallback
- Default (no speed) preserves existing behavior
- Edge SSML rate string conversion (positive/negative)
- OpenAI speed clamping to 0.25-4.0 range
2026-04-12 16:46:18 -07:00
Teknium
a266238e1e
fix(weixin): streaming cursor, media uploads, markdown links, blank messages (#8665)
Four fixes for the Weixin/WeChat adapter, synthesized from the best
aspects of community PRs #8407, #8521, #8360, #7695, #8308, #8525,
#7531, #8144, #8251.

1. Streaming cursor (▉) stuck permanently — WeChat doesn't support
   message editing, so the cursor appended during streaming can never
   be removed.  Add SUPPORTS_MESSAGE_EDITING = False to WeixinAdapter
   and check it in gateway/run.py to use an empty cursor for non-edit
   platforms.  (Fixes #8307, #8326)

2. Media upload failures — two bugs in _send_file():
   a) upload_full_url path used PUT (404 on WeChat CDN); now uses POST.
   b) aes_key was base64(raw_bytes) but the iLink API expects
      base64(hex_string); images showed as grey boxes.  (Fixes #8352, #7529)
   Also: unified both upload paths into _upload_ciphertext(), preferring
   upload_full_url.  Added send_video/send_voice methods and voice_item
   media builder for audio/.silk files.  Added video_md5 field.

3. Markdown links stripped — WeChat can't render [text](url), so
   format_message() now converts them to 'text (url)' plaintext.
   Code blocks are preserved.  (Fixes #7617)

4. Blank message prevention — three guards:
   a) _split_text_for_weixin_delivery('') returns [] not ['']
   b) send() filters empty/whitespace chunks before _send_text_chunk
   c) _send_message() raises ValueError for empty text as safety net

Community credit: joei4cm (#8407), lyonDan (#8521), SKFDJKLDG (#8360),
tomqiaozc (#7695), joshleeeeee (#8308), luoxiao6645(#8525),
longsizhuo (#7531), Astral-Yang (#8144), QingWei-Li (#8251).
2026-04-12 16:43:25 -07:00
Teknium
c83674dd77 fix: unify OpenClaw detection, add isatty guard, fix print_warning import
Combines detection from both PRs into _detect_openclaw_processes():
- Cross-platform process scan (pgrep/tasklist/PowerShell) from PR #8102
- systemd service check from PR #8555
- Returns list[str] with details about what's found

Fixes in cleanup warning (from PR #8555):
- print_warning -> print_error/print_info (print_warning not in import chain)
- Added isatty() guard for non-interactive sessions
- Removed duplicate _check_openclaw_running() in favor of shared function

Updated all tests to match new API.
2026-04-12 16:40:37 -07:00
dirtyfancy
9fb36738a7 fix(claw): address Copilot review on Windows detection and non-interactive prompt
- Use PowerShell to inspect node.exe command lines on Windows,
  since tasklist output does not include them.
- Also check for dedicated openclaw.exe/clawd.exe processes.
- Skip the interactive prompt in non-interactive sessions so the
  preview-only behavior is preserved.
- Update tests accordingly.

Relates to #7907
2026-04-12 16:40:37 -07:00
dirtyfancy
5af9614f6d fix(claw): warn if OpenClaw is running before migration
Add _is_openclaw_running() and _warn_if_openclaw_running() to detect
OpenClaw processes (via pgrep/tasklist) before hermes claw migrate.
Warns the user that messaging platforms only allow one active session
per bot token, and lets them cancel or continue.

Fixes #7907
2026-04-12 16:40:37 -07:00
alt-glitch
5e1197a42e fix(gateway): harden Docker/container gateway pathway
Centralize container detection in hermes_constants.is_container() with
process-lifetime caching, matching existing is_wsl()/is_termux() patterns.
Dedup _is_inside_container() in config.py to delegate to the new function.

Add _run_systemctl() wrapper that converts FileNotFoundError to RuntimeError
for defense-in-depth — all 10 bare subprocess.run(_systemctl_cmd(...)) call
sites now route through it.

Make supports_systemd_services() return False in containers and when
systemctl binary is absent (shutil.which check).

Add Docker-specific guidance in gateway_command() for install/uninstall/start
subcommands — exit 0 with helpful instructions instead of crashing.

Make 'hermes status' show 'Manager: docker (foreground)' and 'hermes dump'
show 'running (docker, pid N)' inside containers.

Fix setup_gateway() to use supports_systemd instead of _is_linux for all
systemd-related branches, and show Docker restart policy instructions in
containers.

Replace inline /.dockerenv check in voice_mode.py with is_container().

Fixes #7420

Co-authored-by: teknium1 <teknium1@users.noreply.github.com>
2026-04-12 16:36:11 -07:00
sprmn24
18ab5c99d1 fix(backup): correct marker filenames in _validate_backup_zip
The backup validation checked for 'hermes_state.db' and 'memory_store.db'
as telltale markers of a valid Hermes backup zip. Neither name exists in a
real Hermes installation — the actual database file is 'state.db'
(hermes_state.py: DEFAULT_DB_PATH = get_hermes_home() / 'state.db').

A fresh Hermes installation produces:
  ~/.hermes/state.db        (actual name)
  ~/.hermes/config.yaml
  ~/.hermes/.env

Because the marker set never matched 'state.db', a backup zip containing
only 'state.db' plus 'config.yaml' would fail validation with:
  'zip does not appear to be a Hermes backup'
and the import would exit with sys.exit(1), silently rejecting a valid backup.

Fix: replace the wrong marker names with the correct filename.

Adds TestValidateBackupZip with three cases:
- state.db is accepted as a valid marker
- old wrong names (hermes_state.db, memory_store.db) alone are rejected
- config.yaml continues to pass (existing behaviour preserved)
2026-04-12 16:35:56 -07:00
Teknium
d6785dc4d4
fix: empty response recovery for reasoning models (mimo, qwen, GLM) (#8609)
Three fixes for the (empty) response bug affecting open reasoning models:

1. Allow retries after prefill exhaustion — models like mimo-v2-pro always
   populate reasoning fields via OpenRouter, so the old 'not _has_structured'
   guard on the retry path blocked retries for EVERY reasoning model after
   the 2 prefill attempts.  Now: 2 prefills + 3 retries = 6 total attempts
   before (empty).

2. Reset prefill/retry counters on tool-call recovery — the counters
   accumulated across the entire conversation, never resetting during
   tool-calling turns.  A model cycling empty→prefill→tools→empty burned
   both prefill attempts and the third empty got zero recovery.  Now
   counters reset when prefill succeeds with tool calls.

3. Strip think blocks before _truly_empty check — inline <think> content
   made the string non-empty, skipping both retry paths.

Reported by users on Telegram with xiaomi/mimo-v2-pro and qwen3.5 models.
Reproduced: qwen3.5-9b emits tool calls as XML in reasoning field instead
of proper function calls, causing content=None + tool_calls=None + reasoning
with embedded <tool_call> XML.  Prefill recovery works but counter
accumulation caused permanent (empty) in long sessions.
2026-04-12 15:38:11 -07:00
Teknium
1179918746 fix: salvage follow-ups for Feishu QR onboarding (#7706)
- Remove duplicate _setup_feishu() definition (old 3-line version left
  behind by cherry-pick — Python picked the new one but dead code
  remained)
- Remove misleading 'Disable direct messages' DM option — the Feishu
  adapter has no DM policy mechanism, so 'disable' produced identical
  env vars to 'pairing'. Users who chose 'disable' would still see
  pairing prompts. Reduced to 3 options: pairing, allow-all, allowlist.
- Fix test_probe_returns_bot_info_on_success and
  test_probe_returns_none_on_failure: patch FEISHU_AVAILABLE=True so
  probe_bot() takes the SDK path when lark_oapi is not installed
2026-04-12 13:05:56 -07:00
Shuo
d7785f4d5b feat(feishu): add scan-to-create onboarding for Feishu / Lark
Add a QR-based onboarding flow to `hermes gateway setup` for Feishu / Lark.
Users scan a QR code with their phone and the platform creates a fully
configured bot application automatically — matching the existing WeChat
QR login experience.

Setup flow:
- Choose between QR scan-to-create (new app) or manual credential input (existing app)
- Connection mode selection (WebSocket / Webhook)
- DM security policy (pairing / open / allowlist / disabled)
- Group chat policy (open with @mention / disabled)

Implementation:
- Onboard functions (init/begin/poll/QR/probe) in gateway/platforms/feishu.py
- _setup_feishu() in hermes_cli/gateway.py with manual fallback
- probe_bot uses lark_oapi SDK when available, raw HTTP fallback otherwise
- qr_register() catches expected errors (network/protocol), propagates bugs
- Poll handles HTTP 4xx JSON responses and feishu/lark domain auto-detection

Tests:
- 25 tests for onboard module (registration, QR, probe, contract, negative paths)
- 16 tests for setup flow (credentials, connection mode, DM policy, group policy,
  adapter integration verifying env vars produce valid FeishuAdapterSettings)

Change-Id: I720591ee84755f32dda95fbac4b26dc82cbcf823
2026-04-12 13:05:56 -07:00
Teknium
400fe9b2a1 fix: add <thought> stripping to auxiliary_client + tests
auxiliary_client.py had its own regex mirroring _strip_think_blocks
but was missing the <thought> variant. Also adds test coverage for
<thought> paired and orphaned tags.
2026-04-12 12:44:49 -07:00
Teknium
06a17c57ae
fix: improve profile creation UX — seed SOUL.md + credential warning (#8553)
Fresh profiles (created without --clone) now:
- Auto-seed a default SOUL.md immediately, so users have a file to
  customize right away instead of discovering it only after first use
- Print a clear warning that the profile has no API keys and will
  inherit from the shell environment unless configured separately
- Show the SOUL.md path for personality customization

Previously, fresh profiles started with no SOUL.md (only seeded on
first use via ensure_hermes_home), no mention of credential isolation,
and no guidance about customizing personality. Users reported confusion
about profiles using the wrong model/plan tokens and SOUL.md not
being read — both traced to operational gaps in the creation UX.

Closes #8093 (investigated: code correctly loads SOUL.md from profile
HERMES_HOME; issue was operational, not a code bug).
2026-04-12 12:22:34 -07:00
Brooklyn Nicholson
2aea75e91e Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-12 13:18:55 -05:00
Teknium
4eecaf06e4
fix: prevent duplicate update prompt spam in gateway watcher (#8343)
The _watch_update_progress() poll loop never deleted .update_prompt.json
after forwarding the prompt to the user, causing the same prompt to be
re-sent every poll cycle (2s). Two fixes:

1. Delete .update_prompt.json after forwarding — the update process only
   polls for .update_response, it doesn't need the prompt file to persist.
2. Guard re-sends with _update_prompt_pending check — belt-and-suspenders
   to prevent duplicates even under race conditions.

Add regression test asserting the prompt is sent exactly once.
2026-04-12 04:52:59 -07:00
Teknium
45e60904c6
fix: fall back to provider's default model when model config is empty (#8303)
When a user configures a provider (e.g. `hermes auth add openai-codex`)
but never selects a model via `hermes model`, the gateway and CLI would
pass an empty model string to the API, causing:
  'Codex Responses request model must be a non-empty string'

Now both gateway (_resolve_session_agent_runtime) and CLI
(_ensure_runtime_credentials) detect an empty model and fill it from
the provider's first catalog entry in _PROVIDER_MODELS. This covers
all providers that have a static model list (openai-codex, anthropic,
gemini, copilot, etc.).

The fix is conservative: it only triggers when model is truly empty
and a known provider was resolved. Explicit model choices are never
overridden.
2026-04-12 03:53:30 -07:00
Teknium
b6b6b02f0f
fix: prevent unwanted session auto-reset after graceful gateway restarts (#8299)
When the gateway shuts down gracefully (hermes update, gateway restart,
/restart), it now writes a .clean_shutdown marker file. On the next
startup, if this marker exists, suspend_recently_active() is skipped
and the marker is cleaned up.

Previously, suspend_recently_active() fired on EVERY startup —
including planned restarts from hermes update or hermes gateway restart.
This caused users to lose their conversation history unexpectedly: the
session would be marked as suspended, and the next message would
trigger an auto-reset with a notification the user never asked for.

The original purpose of suspend_recently_active() is crash recovery —
preventing stuck sessions that were mid-processing when the gateway
died unexpectedly. Graceful shutdowns already drain active agents via
_drain_active_agents(), so there is no stuck-session risk. After a
crash (no marker written), suspension still fires as before.

Fixes the scenario where a user asks the agent to run hermes update,
the gateway restarts, and the user's next message gets an unwanted
'Session automatically reset' notification with their history cleared.
2026-04-12 03:03:07 -07:00
Teknium
56e3ee2440
fix: write update exit code before gateway restart (cgroup kill race) (#8288)
When /update runs via Telegram, hermes update --gateway is spawned inside
the gateway's systemd cgroup.  The update process itself calls
systemctl restart hermes-gateway, which tears down the cgroup with
KillMode=mixed — SIGKILL to all remaining processes.  The wrapping bash
shell is killed before it can execute the exit-code epilogue, so
.update_exit_code is never created.  The new gateway's update watcher
then polls for 30 minutes and sends a spurious timeout message.

Fix: write .update_exit_code from Python inside cmd_update() immediately
after the git pull + pip install succeed ("Update complete!"), before
attempting the gateway restart.  The shell epilogue still writes it too
(idempotent overwrite), but now the marker exists even when the process
is killed mid-restart.
2026-04-12 02:33:21 -07:00
Teknium
b321330362
feat: add WSL environment hint to system prompt (#8285)
When running inside WSL (Windows Subsystem for Linux), inject a hint into
the system prompt explaining that the Windows host filesystem is mounted
at /mnt/c/, /mnt/d/, etc. This lets the agent naturally translate Windows
paths (Desktop, Documents) to their /mnt/ equivalents without the user
needing to configure anything.

Uses the existing is_wsl() detection from hermes_constants (cached,
checks /proc/version for 'microsoft'). Adds build_environment_hints()
in prompt_builder.py — extensible for Termux, Docker, etc. later.

Closes the UX gap where WSL users had to manually explain path
translation to the agent every session.
2026-04-12 02:26:28 -07:00
Teknium
95fa78eb6c
fix: write refreshed Codex tokens back to ~/.codex/auth.json (#8277)
OpenAI OAuth refresh tokens are single-use and rotate on every refresh.
When Hermes refreshes a Codex token, it consumed the old refresh_token
but never wrote the new pair back to ~/.codex/auth.json. This caused
Codex CLI and VS Code to fail with 'refresh_token_reused' on their
next refresh attempt.

This mirrors the existing Anthropic write-back pattern where refreshed
tokens are written to ~/.claude/.credentials.json via
_write_claude_code_credentials().

Changes:
- Add _write_codex_cli_tokens() in hermes_cli/auth.py (parallel to
  _write_claude_code_credentials in anthropic_adapter.py)
- Call it from _refresh_codex_auth_tokens() (non-pool refresh path)
- Call it from credential_pool._refresh_entry() (pool happy path + retry)
- Add tests for the new write-back behavior
- Update existing test docstring to clarify _save_codex_tokens vs
  _write_codex_cli_tokens separation

Fixes refresh token conflict reported by @ec12edfae2cb221
2026-04-12 02:05:20 -07:00
Teknium
ae6820a45a
fix(setup): validate base URL input in hermes model flow (#8264)
Reject non-URL values (e.g. shell commands typed by mistake) in the
base URL prompt during provider setup. Previously any string was saved
as-is to .env, breaking connectivity when the garbage value was used
as the API endpoint.

Adds http:// / https:// prefix check with a clear error message.
The custom-endpoint flow already had this validation (line 1620);
this brings the generic API-key provider flow to parity.

Triggered by a user support case where 'nano ~/.hermes/.env' was
accidentally entered as GLM_BASE_URL during Z.AI setup.
2026-04-12 01:51:57 -07:00
Teknium
078dba015d
fix: three provider-related bugs (#8161, #8181, #8147) (#8243)
- Add openai/openai-codex -> openai mapping to PROVIDER_TO_MODELS_DEV
  so context-length lookups use models.dev data instead of 128k fallback.
  Fixes #8161.

- Set api_mode from custom_providers entry when switching via hermes model,
  and clear stale api_mode when the entry has none. Also extract api_mode
  in _named_custom_provider_map(). Fixes #8181.

- Convert OpenAI image_url content blocks to Anthropic image blocks when
  the endpoint is Anthropic-compatible (MiniMax, MiniMax-CN, or any URL
  containing /anthropic). Fixes #8147.
2026-04-12 01:44:18 -07:00
Harish Kukreja
b1f13a8c5f fix(agent): route compression aux through live session runtime 2026-04-12 01:34:52 -07:00
bravohenry
81ac62c0e9 fix(weixin): split chatty short replies into separate bubbles, keep structured content together
Add content-aware splitting to compact mode: short chat-like exchanges
(2-6 short lines without headings/lists/quotes) get separate message
bubbles for a natural chat feel, while structured content (tables,
headings with body, numbered lists) stays in a single message.

Cherry-picked from PR #7587 by bravohenry, adapted to the compact/legacy
split_per_line architecture from #7903.
2026-04-12 00:38:07 -07:00
Teknium
f53a5a7fe1
fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228)
When the agent calls process(action='wait') or process(action='poll')
and gets the exited status, the completion_queue notification is
redundant — the agent already has the output from the tool return.
Previously, the drain loops in CLI and gateway would still inject
the [SYSTEM: Background process completed] message, causing the
agent to receive the same information twice.

Fix: track session IDs in _completion_consumed set when wait/poll/log
returns an exited process. Drain loops in cli.py and gateway watcher
skip completion events for consumed sessions. Watch pattern events
are never suppressed (they have independent semantics).

Adds 4 tests covering wait/poll/log marking and running-process
negative case.
2026-04-12 00:36:22 -07:00
Teknium
fdf55e0fe9
feat(cli): show random tip on new session start (#8225)
Add a 'tip of the day' feature that displays a random one-liner about
Hermes Agent features on every new session — CLI startup, /clear, /new,
and gateway /new across all messaging platforms.

- New hermes_cli/tips.py module with 210 curated tips covering slash
  commands, keybindings, CLI flags, config options, tools, gateway
  platforms, profiles, sessions, memory, skills, cron, voice, security,
  and more
- CLI: tips display in skin-aware dim gold color after the welcome line
- Gateway: tips append to the /new and /reset response on all platforms
- Fully wrapped in try/except — tips are non-critical and never break
  startup or reset

Display format (CLI):
  ✦ Tip: /btw <question> asks a quick side question without tools or history.

Display format (gateway):
   Session reset! Starting fresh.
  ✦ Tip: hermes -c resumes your most recent CLI session.
2026-04-12 00:34:01 -07:00
opriz
36f57dbc51 fix(migration): don't auto-archive OpenClaw source directory
Remove auto-archival from hermes claw migrate — not its
responsibility (hermes claw cleanup is still there for that).

Skip MESSAGING_CWD when it points inside the OpenClaw source
directory, which was the actual root cause of agent confusion
after migration. Use Path.is_relative_to() for robust path
containment check.

Salvaged from PR #8192 by opriz.
Co-authored-by: opriz <opriz@users.noreply.github.com>
2026-04-12 00:33:54 -07:00
Teknium
1871227198 feat: rebrand OpenClaw references to Hermes during migration
- Add rebrand_text() that replaces OpenClaw, Open Claw, Open-Claw,
  ClawdBot, and MoltBot with Hermes (case-insensitive, word-boundary)
- Apply rebranding to memory entries (MEMORY.md, USER.md, daily memory)
- Apply rebranding to SOUL.md and workspace instructions via new
  transform parameter on copy_file()
- Fix moldbot -> moltbot typo across codebase (claw.py, migration
  script, docs, tests)
- Add unit tests for rebrand_text and integration tests for memory
  and soul migration rebranding
2026-04-12 00:33:54 -07:00
Teknium
eb2a49f95a
fix: openai-codex and anthropic not appearing in /model picker for external credentials (#8224)
Users whose credentials exist only in external files — OpenAI Codex
OAuth tokens in ~/.codex/auth.json or Anthropic Claude Code credentials
in ~/.claude/.credentials.json — would not see those providers in the
/model picker, even though hermes auth and hermes model detected them.

Root cause: list_authenticated_providers() only checked the raw Hermes
auth store and env vars. External credential file fallbacks (Codex CLI
import, Claude Code file discovery) were never triggered.

Fix (three parts):
1. _seed_from_singletons() in credential_pool.py: openai-codex now
   imports from ~/.codex/auth.json when the Hermes auth store is empty,
   mirroring resolve_codex_runtime_credentials().
2. list_authenticated_providers() in model_switch.py: auth store + pool
   checks now run for ALL providers (not just OAuth auth_type), catching
   providers like anthropic that support both API key and OAuth.
3. list_authenticated_providers(): direct check for anthropic external
   credential files (Claude Code, Hermes PKCE). The credential pool
   intentionally gates anthropic behind is_provider_explicitly_configured()
   to prevent auxiliary tasks from silently consuming tokens. The /model
   picker bypasses this gate since it is discovery-oriented.
2026-04-12 00:33:42 -07:00
Teknium
4cadfef8e3
fix(cli): restore stacked tool progress scrollback in TUI (#8201)
The TUI transition (4970705, f83e86d) replaced stacked per-tool history
lines with a single live-updating spinner widget. While the spinner
provides a nice live timer, it removed the scrollback history that
users relied on to see what the agent did during a session.

This restores stacked tool progress lines in 'all' and 'new' modes by
printing persistent scrollback lines via _cprint() when tools complete,
in addition to the existing live spinner display.

Behavior per mode:
- off: no scrollback lines, no spinner (unchanged)
- new: scrollback line on completion, skipping consecutive same-tool repeats
- all: scrollback line on every tool completion
- verbose: no scrollback (run_agent.py handles verbose output directly)

Implementation:
- Store function_args from tool.started events in _pending_tool_info
- On tool.completed, pop stored args and format via get_cute_tool_message()
- FIFO queue per function_name handles concurrent tool execution
- 'new' mode tracks _last_scrollback_tool for dedup
- State cleared at end of agent run

Reported by community user Mr.D — the stacked history provides
transparency into what the agent is doing, which builds trust.

Addresses user report from Discord about lost tool call visibility.
2026-04-11 23:22:34 -07:00
Teknium
1ca9b19750
feat: add network.force_ipv4 config to fix IPv6 timeout issues (#8196)
On servers with broken or unreachable IPv6, Python's socket.getaddrinfo
returns AAAA records first. urllib/httpx/requests all try IPv6 connections
first and hang for the full TCP timeout before falling back to IPv4. This
affects web_extract, web_search, the OpenAI SDK, and all HTTP tools.

Adds network.force_ipv4 config option (default: false) that monkey-patches
socket.getaddrinfo to resolve as AF_INET when the caller didn't specify a
family. Falls back to full resolution if no A record exists, so pure-IPv6
hosts still work.

Applied early at all three entry points (CLI, gateway, cron scheduler)
before any HTTP clients are created.

Reported by user @29n — Chinese Ubuntu server with unreachable IPv6 causing
timeouts on lobste.rs and other IPv6-enabled sites while Google/GitHub
worked fine (IPv4-only resolution).
2026-04-11 23:12:11 -07:00
Tom Qiao
8a48c58bd3 fix(gateway): add missing RedactingFormatter import
The gateway startup path references RedactingFormatter without
importing it, causing a NameError crash when launched with a
verbosity flag (e.g. via launchd --replace).

Fixes #8044

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 19:38:05 -07:00
Teknium
a0a02c1bc0
feat: /compress <focus> — guided compression with focus topic (#8017)
Adds an optional focus topic to /compress: `/compress database schema`
guides the summariser to preserve information related to the focus topic
(60-70% of summary budget) while compressing everything else more aggressively.
Inspired by Claude Code's /compact <focus>.

Changes:
- context_compressor.py: focus_topic parameter on _generate_summary() and
  compress(); appends FOCUS TOPIC guidance block to the LLM prompt
- run_agent.py: focus_topic parameter on _compress_context(), passed through
  to the compressor
- cli.py: _manual_compress() extracts focus topic from command string,
  preserves existing manual_compression_feedback integration (no regression)
- gateway/run.py: _handle_compress_command() extracts focus from event args
  and passes through — full gateway parity
- commands.py: args_hint="[focus topic]" on /compress CommandDef

Salvaged from PR #7459 (CLI /compress focus only — /context command deferred).
15 new tests across CLI, compressor, and gateway.
2026-04-11 19:23:29 -07:00
helix4u
cfbfc4c3f1 fix(discord): decouple readiness from slash sync 2026-04-11 19:22:14 -07:00
Teknium
fa7cd44b92
feat: add hermes backup and hermes import commands (#7997)
* feat: add `hermes backup` and `hermes import` commands

hermes backup — creates a zip of ~/.hermes/ (config, skills, sessions,
profiles, memories, skins, cron jobs, etc.) excluding the hermes-agent
codebase, __pycache__, and runtime PID files. Defaults to
~/hermes-backup-<timestamp>.zip, customizable with -o.

hermes import <zipfile> — restores from a backup zip, validating it
looks like a hermes backup before extracting. Handles .hermes/ prefix
stripping, path traversal protection, and confirmation prompts (skip
with --force).

29 tests covering exclusion rules, backup creation, import validation,
prefix detection, path traversal blocking, confirmation flow, and a
full round-trip test.

* test: improve backup/import coverage to 97%

Add 17 additional tests covering:
- _format_size helper (bytes through terabytes)
- Nonexistent hermes home error exit
- Output path is a directory (auto-names inside it)
- Output without .zip suffix (auto-appends)
- Empty hermes home (all files excluded)
- Permission errors during backup and import
- Output zip inside hermes root (skips itself)
- Not-a-zip file rejection
- EOFError and KeyboardInterrupt during confirmation
- 500+ file progress display
- Directory-only zip prefix detection

Remove dead code branch in _detect_prefix (unreachable guard).

* feat: auto-restore profile wrapper scripts on import

After extracting backup files, hermes import now scans profiles/ for
subdirectories with config.yaml or .env and recreates the ~/.local/bin
wrapper scripts so profile aliases (e.g. 'coder chat') work immediately.

Also prints guidance for re-installing gateway services per profile.

Handles edge cases:
- Skips profile dirs without config (not real profiles)
- Skips aliases that collide with existing commands
- Gracefully degrades if hermes_cli.profiles isn't available (fresh install)
- Shows PATH hint if ~/.local/bin isn't in PATH

3 new profile restoration tests (49 total).
2026-04-11 19:15:50 -07:00
Siddharth Balyan
50d86b3c71
fix(matrix): replace pickle crypto store with SQLite, fix E2EE decryption (#7981)
Fixes #7952 — Matrix E2EE completely broken after mautrix migration.

- Replace MemoryCryptoStore + pickle/HMAC persistence with mautrix's
  PgCryptoStore backed by SQLite via aiosqlite. Crypto state now
  persists reliably across restarts without fragile serialization.

- Add handle_sync() call on initial sync response so to-device events
  (queued Megolm key shares) are dispatched to OlmMachine instead of
  being silently dropped.

- Add _verify_device_keys_on_server() after loading crypto state.
  Detects missing keys (re-uploads), stale keys from migration
  (attempts re-upload), and corrupted state (refuses E2EE).

- Add _CryptoStateStore adapter wrapping MemoryStateStore to satisfy
  mautrix crypto's StateStore interface (is_encrypted,
  get_encryption_info, find_shared_rooms).

- Remove redundant share_keys() call from sync loop — OlmMachine
  already handles this via DEVICE_OTK_COUNT event handler.

- Fix datetime vs float TypeError in session.py suspend_recently_active()
  that crashed gateway startup.

- Add aiosqlite and asyncpg to [matrix] extra in pyproject.toml.

- Update test mocks for PgCryptoStore/Database and add query_keys mock
  for key verification. 174 tests pass.

- Add E2EE upgrade/migration docs to Matrix user guide.
2026-04-12 07:24:46 +05:30
Siddharth Balyan
27eeea0555
perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive (#8014)
* perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive

SSH: symlink-staging + tar -ch piped over SSH in a single TCP stream.
Eliminates per-file scp round-trips. Handles timeout (kills both
processes), SSH Popen failure (kills tar), and tar create failure.

Modal: in-memory gzipped tar archive, base64-encoded, decoded+extracted
in one exec call. Checks exit code and raises on failure.

Both backends use shared helpers extracted into file_sync.py:
- quoted_mkdir_command() — mirrors existing quoted_rm_command()
- unique_parent_dirs() — deduplicates parent dirs from file pairs

Migrates _ensure_remote_dirs to use the new helpers.

28 new tests (21 SSH + 7 Modal), all passing.

Closes #7465
Closes #7467

* fix(modal): pipe stdin to avoid ARG_MAX, clean up review findings

- Modal bulk upload: stream base64 payload through proc.stdin in 1MB
  chunks instead of embedding in command string (Modal SDK enforces
  64KB ARG_MAX_BYTES — typical payloads are ~4.3MB)
- Modal single-file upload: same stdin fix, add exit code checking
- Remove what-narrating comments in ssh.py and modal.py (keep WHY
  comments: symlink staging rationale, SIGPIPE, deadlock avoidance)
- Remove unnecessary `sandbox = self._sandbox` alias in modal bulk
- Daytona: use shared helpers (unique_parent_dirs, quoted_mkdir_command)
  instead of inlined duplicates

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-04-12 06:18:05 +05:30
Teknium
fd73937ec8
feat: component-separated logging with session context and filtering (#7991)
* feat: component-separated logging with session context and filtering

Phase 1 — Gateway log isolation:
- gateway.log now only receives records from gateway.* loggers
  (platform adapters, session management, slash commands, delivery)
- agent.log remains the catch-all (all components)
- errors.log remains WARNING+ catch-all
- Moved gateway.log handler creation from gateway/run.py into
  hermes_logging.setup_logging(mode='gateway') with _ComponentFilter

Phase 2 — Session ID injection:
- Added set_session_context(session_id) / clear_session_context() API
  using threading.local() for per-thread session tracking
- _SessionFilter enriches every log record with session_tag attribute
- Log format: '2026-04-11 10:23:45 INFO [session_id] logger.name: msg'
- Session context set at start of run_conversation() in run_agent.py
- Thread-isolated: gateway conversations on different threads don't leak

Phase 3 — Component filtering in hermes logs:
- Added --component flag: hermes logs --component gateway|agent|tools|cli|cron
- COMPONENT_PREFIXES maps component names to logger name prefixes
- Works with all existing filters (--level, --session, --since, -f)
- Logger name extraction handles both old and new log formats

Files changed:
- hermes_logging.py: _SessionFilter, _ComponentFilter, COMPONENT_PREFIXES,
  set/clear_session_context(), gateway.log creation in setup_logging()
- gateway/run.py: removed redundant gateway.log handler (now in hermes_logging)
- run_agent.py: set_session_context() at start of run_conversation()
- hermes_cli/logs.py: --component filter, logger name extraction
- hermes_cli/main.py: --component argument on logs subparser

Addresses community request for component-separated, filterable logging.
Zero changes to existing logger names — __name__ already provides hierarchy.

* fix: use LogRecord factory instead of per-handler _SessionFilter

The _SessionFilter approach required attaching a filter to every handler
we create. Any handler created outside our _add_rotating_handler (like
the gateway stderr handler, or third-party handlers) would crash with
KeyError: 'session_tag' if it used our format string.

Replace with logging.setLogRecordFactory() which injects session_tag
into every LogRecord at creation time — process-global, zero per-handler
wiring needed. The factory is installed at import time (before
setup_logging) so session_tag is available from the moment hermes_logging
is imported.

- Idempotent: marker attribute prevents double-wrapping on module reload
- Chains with existing factory: won't break third-party record factories
- Removes _SessionFilter from _add_rotating_handler and setup_verbose_logging
- Adds tests: record factory injection, idempotency, arbitrary handler compat
2026-04-11 17:23:36 -07:00
Teknium
723b5bec85
feat: per-platform display verbosity configuration (#8006)
Add display.platforms section to config.yaml for per-platform overrides of
display settings (tool_progress, show_reasoning, streaming, tool_preview_length).

Each platform gets sensible built-in defaults based on capability tier:
- High (telegram, discord): tool_progress=all, streaming follows global
- Medium (slack, mattermost, matrix, feishu): tool_progress=new
- Low (signal, whatsapp, bluebubbles, wecom, etc.): tool_progress=off, streaming=false
- Minimal (email, sms, webhook, homeassistant): tool_progress=off, streaming=false

Example config:
  display:
    platforms:
      telegram:
        tool_progress: all
        show_reasoning: true
      slack:
        tool_progress: off

Resolution order: platform override > global setting > built-in platform default.

Changes:
- New gateway/display_config.py: resolver module with tier-based platform defaults
- gateway/run.py: tool_progress, tool_preview_length, streaming, show_reasoning
  all resolve per-platform via the new resolver
- /verbose command: now cycles tool_progress per-platform (saves to
  display.platforms.<platform>.tool_progress instead of global)
- /reasoning show|hide: now saves show_reasoning per-platform
- Config version 15 -> 16: migrates tool_progress_overrides into display.platforms
- Backward compat: legacy tool_progress_overrides still read as fallback
- 27 new tests for resolver, normalization, migration, backward compat
- Updated verbose command tests for per-platform behavior

Addresses community request for per-channel verbosity control (Guillaume Meyer,
Nathan Danielsen) — high verbosity on backchannel Telegram, low on customer-facing
Slack, none on email.
2026-04-11 17:20:34 -07:00
Teknium
14ccd32cee
refactor(terminal): remove check_interval parameter (#8001)
The check_interval parameter on terminal_tool sent periodic output
updates to the gateway chat, but these were display-only — the agent
couldn't see or act on them. This added schema bloat and introduced
a bug where notify_on_complete=True was silently dropped when
check_interval was also set (the not-check_interval guard skipped
fast-watcher registration, and the check_interval watcher dict
was missing the notify_on_complete key).

Removing check_interval entirely:
- Eliminates the notify_on_complete interaction bug
- Reduces tool schema size (one fewer parameter for the model)
- Simplifies the watcher registration path
- notify_on_complete (agent wake-on-completion) still works
- watch_patterns (output alerting) still works
- process(action='poll') covers manual status checking

Closes #7947 (root cause eliminated rather than patched).
2026-04-11 17:16:11 -07:00
Mateus Scheuer Macedo
06f862fa1b feat(cli): add native /model picker modal for provider → model selection
When /model is called with no arguments in the interactive CLI, open a
two-step prompt_toolkit modal instead of the previous text-only listing:

1. Provider selection — curses_single_select with all authenticated providers
2. Model selection — live API fetch with curated fallback

Also fixes:
- OpenAI Codex model normalization (openai/gpt-5.4 → gpt-5.4)
- Dedicated Codex validation path using provider_model_ids()

Preserves curses_radiolist (used by setup, tools, plugins) alongside the
new curses_single_select. Retains tool elapsed timer in spinner.

Cherry-picked from PR #7438 by MestreY0d4-Uninter.
2026-04-11 17:16:06 -07:00
Teknium
39cd57083a
refactor: remove budget warning injection system (dead code)
The _get_budget_warning() method already returned None unconditionally —
the entire budget warning system was disabled. Remove all dead code:

- _BUDGET_WARNING_RE regex
- _strip_budget_warnings_from_history() function and its call site
- Both injection blocks (concurrent + sequential tool execution)
- _get_budget_warning() method
- 7 tests for the removed functions

The budget exhaustion grace call system (_budget_exhausted_injected,
_budget_grace_call) is a separate recovery mechanism and is preserved.
2026-04-11 16:56:33 -07:00
Siddharth Balyan
cab814af15
feat(nix): container-aware CLI — auto-route into managed container (#7543)
* feat(nix): container-aware CLI — auto-route all subcommands into managed container

When container.enable = true, the host `hermes` CLI transparently execs
every subcommand into the managed Docker/Podman container. A symlink
bridge (~/.hermes -> /var/lib/hermes/.hermes) unifies state between host
and container so sessions, config, and memories are shared.

CLI changes:
- Global routing before subcommand dispatch (all commands forwarded)
- docker exec with -u exec_user, env passthrough (TERM, COLORTERM,
  LANG, LC_ALL), TTY-aware flags
- Retry with spinner on failure (TTY: 5s, non-TTY: 10s silent)
- Hard fail instead of silent fallback
- HERMES_DEV=1 env var bypasses routing for development
- No routing messages (invisible to user)

NixOS module changes:
- container.hostUsers option: lists users who get ~/.hermes symlink
  and automatic hermes group membership
- Activation script creates symlink bridge (with backup of existing
  ~/.hermes dirs), writes exec_user to .container-mode
- Cleanup on disable: removes symlinks + .container-mode + stops service
- Warning when hostUsers set without addToSystemPackages

* fix: address review — reuse sudo var, add chown -h on symlink update

- hermes_cli/main.py: reuse the existing `sudo` variable instead of
  redundant `shutil.which("sudo")` call that could return None
- nix/nixosModules.nix: add missing `chown -h` when updating an
  existing symlink target so ownership stays consistent with the
  fresh-create and backup-replace branches

* fix: address remaining review items from cursor bugbot

- hermes_cli/main.py: move container routing BEFORE parse_args() so
  --help, unrecognised flags, and all subcommands are forwarded
  transparently into the container instead of being intercepted by
  argparse on the host (high severity)

- nix/nixosModules.nix: resolve home dirs via
  config.users.users.${user}.home instead of hardcoding /home/${user},
  supporting users with custom home directories (medium severity)

- nix/nixosModules.nix: gate hostUsers group membership on
  container.enable so setting hostUsers without container mode doesn't
  silently add users to the hermes group (low severity)

* fix: simplify container routing — execvp, no retries, let it crash

- Replace subprocess.run retry loop with os.execvp (no idle parent process)
- Extract _probe_container helper for sudo detection with 15s timeout
- Narrow exception handling: FileNotFoundError only in get_container_exec_info,
  catch TimeoutExpired specifically, remove silent except Exception: pass
- Collapse needs_sudo + sudo into single sudo_path variable
- Simplify NixOS symlink creation from 4 branches to 2
- Gate NixOS sudoers hint with "On NixOS:" prefix
- Full test rewrite: 18 tests covering execvp, sudo probe, timeout, permissions

---------

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-04-12 05:17:46 +05:30
Teknium
5c2ecdec49
fix: use ceiling division for token estimation, deduplicate inline formula
Switch estimate_tokens_rough(), estimate_messages_tokens_rough(), and
estimate_request_tokens_rough() from floor division (len // 4) to
ceiling division ((len + 3) // 4). Short texts (1-3 chars) previously
estimated as 0 tokens, causing the compressor and pre-flight checks to
systematically undercount when many short tool results are present.

Also replaced the inline duplicate formula in run_conversation()
(total_chars // 4) with a call to the shared
estimate_messages_tokens_rough() function.

Updated 4 tests that hardcoded floor-division expected values.

Related: issue #6217, PR #6629
2026-04-11 16:33:40 -07:00
Brooklyn Nicholson
a1d2a0c0fd feat: self update npm deps on hermes update 2026-04-11 18:29:18 -05:00
WAXLYY
6d272ba477 fix(tools): enforce ID uniqueness in TODO store during replace operations
Deduplicate todo items by ID before writing to the store, keeping the
last occurrence. Prevents ghost entries when the model sends duplicate
IDs in a single write() call, which corrupts subsequent merge operations.

Co-authored-by: WAXLYY <WAXLYY@users.noreply.github.com>
2026-04-11 16:22:50 -07:00
asheriif
97b0cd51ee feat(gateway): surface natural mid-turn assistant messages in chat platforms
Add display.interim_assistant_messages config (enabled by default) that
forwards completed assistant commentary between tool calls to the user
as separate chat messages. Models already emit useful status text like
'I'll inspect the repo first.' — this surfaces it on Telegram, Discord,
and other messaging platforms instead of swallowing it.

Independent from tool_progress and gateway streaming. Disabled for
webhooks. Uses GatewayStreamConsumer when available, falls back to
direct adapter send. Tracks response_previewed to prevent double-delivery
when interim message matches the final response.

Also fixes: cursor not stripped from fallback prefix in stream consumer
(affected continuation calculation on no-edit platforms like Signal).

Cherry-picked from PR #7885 by asheriif, default changed to enabled.
Fixes #5016
2026-04-11 16:21:39 -07:00
Teknium
c8aff74632
fix: prevent agent from stopping mid-task — compression floor, budget overhaul, activity tracking
Three root causes of the 'agent stops mid-task' gateway bug:

1. Compression threshold floor (64K tokens minimum)
   - The 50% threshold on a 100K-context model fired at 50K tokens,
     causing premature compression that made models lose track of
     multi-step plans.  Now threshold_tokens = max(50% * context, 64K).
   - Models with <64K context are rejected at startup with a clear error.

2. Budget warning removal — grace call instead
   - Removed the 70%/90% iteration budget warnings entirely.  These
     injected '[BUDGET WARNING: Provide your final response NOW]' into
     tool results, causing models to abandon complex tasks prematurely.
   - Now: no warnings during normal execution.  When the budget is
     actually exhausted (90/90), inject a user message asking the model
     to summarise, allow one grace API call, and only then fall back
     to _handle_max_iterations.

3. Activity touches during long terminal execution
   - _wait_for_process polls every 0.2s but never reported activity.
     The gateway's inactivity timeout (default 1800s) would fire during
     long-running commands that appeared 'idle.'
   - Now: thread-local activity callback fires every 10s during the
     poll loop, keeping the gateway's activity tracker alive.
   - Agent wires _touch_activity into the callback before each tool call.

Also: docs update noting 64K minimum context requirement.

Closes #7915 (root cause was agent-loop termination, not Weixin delivery limits).
2026-04-11 16:18:57 -07:00
Koichi Tsutsumi
fc417ed049 fix(cli): add ChatConsole.status for /skills search 2026-04-11 15:38:43 -07:00
0xbyt4
32519066dc fix(gateway): add HERMES_SESSION_KEY to session_context contextvars
Complete the contextvars migration by adding HERMES_SESSION_KEY to the
unified _VAR_MAP in session_context.py. Without this, concurrent gateway
handlers race on os.environ["HERMES_SESSION_KEY"].

- Add _SESSION_KEY ContextVar to _VAR_MAP, set_session_vars(), clear_session_vars()
- Wire session_key through _set_session_env() from SessionContext
- Replace os.getenv fallback in tools/approval.py with get_session_env()
  (function-level import to avoid cross-layer coupling)
- Keep os.environ set as CLI/cron fallback

Cherry-picked from PR #7878 by 0xbyt4.
2026-04-11 15:35:04 -07:00
syaor4n
689c515090 feat: add --env and --preset support to hermes mcp add
- Add --env KEY=VALUE for passing environment variables to stdio MCP servers
- Add --preset for known MCP server templates (empty for now, extensible)
- Validate env var names, reject --env for HTTP servers
- Explicit --command/--url overrides preset defaults
- Remove unused getpass import

Based on PR #7936 by @syaor4n (stitch preset removed, generic infra kept).
2026-04-11 15:34:57 -07:00
chqchshj
5f0caf54d6 feat(gateway): add WeCom callback-mode adapter for self-built apps
Add a second WeCom integration mode for regular enterprise self-built
applications.  Unlike the existing bot/websocket adapter (wecom.py),
this handles WeCom's standard callback flow: WeCom POSTs encrypted XML
to an HTTP endpoint, the adapter decrypts, queues for the agent, and
immediately acknowledges.  The agent's reply is delivered proactively
via the message/send API.

Key design choice: always acknowledge immediately and use proactive
send — agent sessions take 3-30 minutes, so the 5-second inline reply
window is never useful.  The original PR's Future/pending-reply
machinery was removed in favour of this simpler architecture.

Features:
- AES-CBC encrypt/decrypt (BizMsgCrypt-compatible)
- Multi-app routing scoped by corp_id:user_id
- Legacy bare user_id fallback for backward compat
- Access-token management with auto-refresh
- WECOM_CALLBACK_* env var overrides
- Port-in-use pre-check before binding
- Health endpoint at /health

Salvaged from PR #7774 by @chqchshj.  Simplified by removing the
inline reply Future system and fixing: secrets.choice for nonce
generation, immediate plain-text acknowledgment (not encrypted XML
containing 'success'), and initial token refresh error handling.
2026-04-11 15:22:49 -07:00
Brooklyn Nicholson
ec553fdb49 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 17:15:41 -05:00
faishal
90352b2adf fix: normalize checkpoint manager home-relative paths
Adds _normalize_path() helper that calls expanduser().resolve() to
properly handle tilde paths (e.g. ~/.hermes, ~/.config).  Previously
Path.resolve() alone treated ~ as a literal directory name, producing
invalid paths like /root/~/.hermes.

Also improves _run_git() error handling to distinguish missing working
directories from missing git executable, and adds pre-flight directory
validation.

Cherry-picked from PR #7898 by faishal882.
Fixes #7807
2026-04-11 14:50:44 -07:00
Teknium
8c3935ebe8
fix: is_local_endpoint misses Docker/Podman DNS names (#7950)
* fix(tools): neutralize shell injection in _write_to_sandbox via path quoting

_write_to_sandbox interpolated storage_dir and remote_path directly into
a shell command passed to env.execute(). Paths containing shell
metacharacters (spaces, semicolons, $(), backticks) could trigger
arbitrary command execution inside the sandbox.

Fix: wrap both paths with shlex.quote(). Clean paths (alphanumeric +
slashes/hyphens/dots) are left unmodified by shlex.quote, so existing
behavior is unchanged. Paths with unsafe characters get single-quoted.

Tests added for spaces, $(command) substitution, and semicolon injection.

* fix: is_local_endpoint misses Docker/Podman DNS names

host.docker.internal, host.containers.internal, gateway.docker.internal,
and host.lima.internal are well-known DNS names that container runtimes
use to resolve the host machine. Users running Ollama on the host with
the agent in Docker/Podman hit the default 120s stream timeout instead
of the bumped 1800s because these hostnames weren't recognized as local.

Add _CONTAINER_LOCAL_SUFFIXES tuple and suffix check in
is_local_endpoint(). Tests cover all three runtime families plus a
negative case for domains that merely contain the suffix as a substring.
2026-04-11 14:46:18 -07:00
Teknium
d82580b25b fix: add all_profiles param + narrow exception handling
- add all_profiles=False to find_gateway_pids() and
  kill_gateway_processes() so hermes update and gateway stop --all
  can still discover processes across all profiles
- narrow bare 'except Exception' to (OSError, subprocess.TimeoutExpired)
- update test mocks to match new signatures
2026-04-11 14:44:29 -07:00
Dominic Grieco
b80e318168 fix: scope gateway status to the active profile 2026-04-11 14:44:29 -07:00
etcircle
72b345e068 fix(gateway): preserve queued voice events for STT 2026-04-11 14:43:53 -07:00
Teknium
8160d7a03d test: add dedup coverage for reasoning item ID deduplication
Adds two tests verifying that duplicate reasoning item IDs across
multi-turn Codex Responses conversations are correctly deduplicated
in both _chat_messages_to_responses_input() and
_preflight_codex_input_items().
2026-04-11 14:43:47 -07:00
Teknium
f2893fe51a
fix(tools): neutralize shell injection in _write_to_sandbox via path quoting (#7940)
_write_to_sandbox interpolated storage_dir and remote_path directly into
a shell command passed to env.execute(). Paths containing shell
metacharacters (spaces, semicolons, $(), backticks) could trigger
arbitrary command execution inside the sandbox.

Fix: wrap both paths with shlex.quote(). Clean paths (alphanumeric +
slashes/hyphens/dots) are left unmodified by shlex.quote, so existing
behavior is unchanged. Paths with unsafe characters get single-quoted.

Tests added for spaces, $(command) substitution, and semicolon injection.
2026-04-11 14:26:11 -07:00
Dusk1e
255f59de18 fix(tools): prevent command argument injection and path traversal in checkpoint manager
This commit addresses a security vulnerability where unsanitized user inputs for commit_hash and file_path were passed directly to git commands in CheckpointManager.restore() and diff(). It validates commit hashes to be strictly hexadecimal characters without leading dashes (preventing flag injection like '--patch') and enforces file paths to stay within the working directory via root resolution. Regression tests test_restore_rejects_argument_injection, test_restore_rejects_invalid_hex_chars, and test_restore_rejects_path_traversal were added.
2026-04-11 14:25:57 -07:00
Teknium
4bede272cf fix: propagate model through credential pool path + add tests
The cherry-picked fix from PR #7916 placed model propagation after
the credential pool early-return in _resolve_named_custom_runtime(),
making it dead code when a pool is active (which happens whenever
custom_providers has an api_key that auto-seeds the pool).

- Inject model into pool_result before returning
- Add 5 regression tests covering direct path, pool path, empty
  model, and absent model scenarios
- Add 'model' to _VALID_CUSTOM_PROVIDER_FIELDS for config validation
2026-04-11 14:09:40 -07:00
Teknium
b0892375cd fix: mock aiohttp server in startup guard tests to avoid port binding
The startup guard tests called connect() which bound a real aiohttp
server on port 8080 — flaky in any environment where the port is
in use. Mock AppRunner, TCPSite, and ClientSession instead.
2026-04-11 14:05:38 -07:00
Mariano Nicolini
0a922bf218 add new test covering edge case where both insecure_no_sig and _webhook_url are set 2026-04-11 14:05:38 -07:00
Mariano Nicolini
8ce6aaac23 change Twilio signature verification from opt-in to opt-out 2026-04-11 14:05:38 -07:00
Mariano Nicolini
ad1e8804a6 handle port variants in Twilio signatures 2026-04-11 14:05:38 -07:00
Mariano Nicolini
c22bffc92e add basic twilio signature checking and tests 2026-04-11 14:05:38 -07:00
Teknium
dfc820345d
fix: scope tool interrupt signal per-thread to prevent cross-session leaks (#7930)
The interrupt mechanism in tools/interrupt.py used a process-global
threading.Event. In the gateway, multiple agents run concurrently in
the same process via run_in_executor. When any agent was interrupted
(user sends a follow-up message), the global flag killed ALL agents'
running tools — terminal commands, browser ops, web requests — across
all sessions.

Changes:
- tools/interrupt.py: Replace single threading.Event with a set of
  interrupted thread IDs. set_interrupt() targets a specific thread;
  is_interrupted() checks the current thread. Includes a backward-
  compatible _ThreadAwareEventProxy for legacy _interrupt_event usage.
- run_agent.py: Store execution thread ID at start of run_conversation().
  interrupt() and clear_interrupt() pass it to set_interrupt() so only
  this agent's thread is affected.
- tools/code_execution_tool.py: Use is_interrupted() instead of
  directly checking _interrupt_event.is_set().
- tools/process_registry.py: Same — use is_interrupted().
- tests: Update interrupt tests for per-thread semantics. Add new
  TestPerThreadInterruptIsolation with two tests verifying cross-thread
  isolation.
2026-04-11 14:02:58 -07:00
Teknium
75380de430
fix: reap orphaned browser sessions on startup (#7931)
When a Python process exits uncleanly (SIGKILL, crash, gateway restart
via hermes update), in-memory _active_sessions tracking is lost but the
agent-browser node daemons and their Chromium child processes keep
running indefinitely. On a long-running system this causes unbounded
memory growth — 24 orphaned sessions consumed 7.6 GB on a production
machine over 9 days.

Add _reap_orphaned_browser_sessions() which scans the tmp directory for
agent-browser-{h_*,cdp_*} socket dirs on cleanup thread startup.  For
each dir not tracked by the current process, reads the daemon PID file
and sends SIGTERM if the daemon is still alive.  Handles edge cases:
dead PIDs, corrupt PID files, permission errors, foreign processes.

The reaper runs once on thread startup (not every 30s) to avoid races
with sessions being actively created by concurrent agents.
2026-04-11 14:02:46 -07:00
Markus Corazzione
885123d44b fix(weixin): add per-chunk retry with backoff for text delivery
When sending multi-chunk responses, individual chunks can fail due to
transient iLink API errors. Previously a single failure would abort the
entire message. Now each chunk is retried with linear backoff before
giving up, and the same client_id is reused across retries for
server-side deduplication.

Configurable via config.yaml (platforms.weixin.extra) or env vars:
- send_chunk_delay_seconds (default 0.35s) — pacing between chunks
- send_chunk_retries (default 2) — max retry attempts per chunk
- send_chunk_retry_delay_seconds (default 1.0s) — base retry delay

Replaces the hardcoded 0.3s inter-chunk delay from #7903.

Salvaged from PR #7899 by @corazzione. Fixes #7836.
2026-04-11 14:02:33 -07:00
Teknium
04c1c5d53f
refactor: extract shared helpers to deduplicate repeated code patterns (#7917)
* refactor: add shared helper modules for code deduplication

New modules:
- gateway/platforms/helpers.py: MessageDeduplicator, TextBatchAggregator,
  strip_markdown, ThreadParticipationTracker, redact_phone
- hermes_cli/cli_output.py: print_info/success/warning/error, prompt helpers
- tools/path_security.py: validate_within_dir, has_traversal_component
- utils.py additions: safe_json_loads, read_json_file, read_jsonl,
  append_jsonl, env_str/lower/int/bool helpers
- hermes_constants.py additions: get_config_path, get_skills_dir,
  get_logs_dir, get_env_path

* refactor: migrate gateway adapters to shared helpers

- MessageDeduplicator: discord, slack, dingtalk, wecom, weixin, mattermost
- strip_markdown: bluebubbles, feishu, sms
- redact_phone: sms, signal
- ThreadParticipationTracker: discord, matrix
- _acquire/_release_platform_lock: telegram, discord, slack, whatsapp,
  signal, weixin

Net -316 lines across 19 files.

* refactor: migrate CLI modules to shared helpers

- tools_config.py: use cli_output print/prompt + curses_radiolist (-117 lines)
- setup.py: use cli_output print helpers + curses_radiolist (-101 lines)
- mcp_config.py: use cli_output prompt (-15 lines)
- memory_setup.py: use curses_radiolist (-86 lines)

Net -263 lines across 5 files.

* refactor: migrate to shared utility helpers

- safe_json_loads: agent/display.py (4 sites)
- get_config_path: skill_utils.py, hermes_logging.py, hermes_time.py
- get_skills_dir: skill_utils.py, prompt_builder.py
- Token estimation dedup: skills_tool.py imports from model_metadata
- Path security: skills_tool, cronjob_tools, skill_manager_tool, credential_files
- Non-atomic YAML writes: doctor.py, config.py now use atomic_yaml_write
- Platform dict: new platforms.py, skills_config + tools_config derive from it
- Anthropic key: new get_anthropic_key() in auth.py, used by doctor/status/config/main

* test: update tests for shared helper migrations

- test_dingtalk: use _dedup.is_duplicate() instead of _is_duplicate()
- test_mattermost: use _dedup instead of _seen_posts/_prune_seen
- test_signal: import redact_phone from helpers instead of signal
- test_discord_connect: _platform_lock_identity instead of _token_lock_identity
- test_telegram_conflict: updated lock error message format
- test_skill_manager_tool: 'escapes' instead of 'boundary' in error msgs
2026-04-11 13:59:52 -07:00
WAXLYY
f4f4078ad9 fix(gateway/weixin): ensure atomic persistence for critical session state 2026-04-11 13:48:25 -07:00
Teknium
59e630a64d fix: update thinking-exhaustion test for think-tag gating
The test expected content=None to immediately trigger thinking-exhaustion,
but PR #7738 correctly gates that check on _has_think_tags. Without think
tags, the agent falls through to normal continuation retry (3 attempts).
2026-04-11 13:47:25 -07:00
Tom Qiao
5910412002 fix: detect truncated tool_calls when finish_reason is not length
When API routers rewrite finish_reason from "length" to "tool_calls",
truncated JSON arguments bypassed the length handler and wasted 3
retry attempts in the generic JSON validation loop. Now detects
truncation patterns in tool call arguments regardless of finish_reason.

Fixes #7680

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-11 13:47:25 -07:00
helix4u
39da23a129 fix(api-server): keep chat-completions SSE alive 2026-04-11 13:47:25 -07:00
Teknium
cac6178104 fix(gateway): propagate user identity through process watcher pipeline
Background process watchers (notify_on_complete, check_interval) created
synthetic SessionSource objects without user_id/user_name. While the
internal=True bypass (1d8d4f28) prevented false pairing for agent-
generated notifications, the missing identity caused:

- Garbage entries in pairing rate limiters (discord:None, telegram:None)
- 'User None' in approval messages and logs
- No user identity available for future code paths that need it

Additionally, platform messages arriving without from_user (Telegram
service messages, channel forwards, anonymous admin actions) could still
trigger false pairing because they are not internal events.

Fix:
1. Propagate user_id/user_name through the full watcher chain:
   session_context.py → gateway/run.py → terminal_tool.py →
   process_registry.py (including checkpoint persistence/recovery)

2. Add None user_id guard in _handle_message() — silently drop
   non-internal messages with no user identity instead of triggering
   the pairing flow.

Salvaged from PRs #7664 (kagura-agent, ContextVar approach),
#6540 (MestreY0d4-Uninter, tests), and #7709 (guang384, None guard).

Closes #6341, #6485, #7643
Relates to #6516, #7392
2026-04-11 13:46:16 -07:00
Brooklyn Nicholson
9ccb490cf3 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 15:30:23 -05:00
Brooklyn Nicholson
32302c37dd feat: fix types and add type checking plus lazybundle on launch andddd dev flag 2026-04-11 14:42:28 -05:00
Brooklyn Nicholson
e2ea8934d4 feat: ensure feature parity once again 2026-04-11 14:02:36 -05:00
Teknium
dafe443beb
feat: warn at session start when compression model context is too small (#7894)
Two-phase design so the warning fires before the user's first message
on every platform:

Phase 1 (__init__):
  _check_compression_model_feasibility() runs during agent construction.
  Resolves the auxiliary compression model (same chain as call_llm with
  task='compression'), compares its context length to the main model's
  compression threshold. If too small, emits via _emit_status() (prints
  for CLI) and stores the warning in _compression_warning.

Phase 2 (run_conversation, first call):
  _replay_compression_warning() re-sends the stored warning through
  status_callback — which the gateway wires AFTER construction. The
  warning is then cleared so it only fires once.

This ensures:
- CLI users see the warning immediately at startup (right after the
  context limit line)
- Gateway users (Telegram, Discord, Slack, WhatsApp, Signal, Matrix,
  Mattermost, Home Assistant, DingTalk, etc.) receive it via
  status_callback('lifecycle', ...) on their first message
- logger.warning() always hits agent.log regardless of platform

Also warns when no auxiliary LLM provider is configured at all.
Entire check wrapped in try/except — never blocks startup.

11 tests covering: core warning logic, boundary conditions, exception
safety, two-phase store+replay, gateway callback wiring, and
single-delivery guarantee.
2026-04-11 12:01:30 -07:00
Teknium
da9f96bf51
fix(weixin): keep multi-line messages in single bubble by default (#7903)
The Weixin adapter was splitting responses at every top-level newline,
causing notification spam (up to 70 API calls for a single long markdown
response). This salvages the best aspects of six contributor PRs:

Compact mode (new default):
- Messages under the 4000-char limit stay as a single bubble even with
  multiple lines, paragraphs, and code blocks
- Only oversized messages get split at logical markdown boundaries
- Inter-chunk delay (0.3s) between chunks prevents WeChat rate-limit drops

Legacy mode (opt-in):
- Set split_multiline_messages: true in platforms.weixin.extra config
- Or set WEIXIN_SPLIT_MULTILINE_MESSAGES=true env var
- Restores the old per-line splitting behavior

Salvaged from PRs #7797 (guantoubaozi), #7792 (luoxiao6645),
#7838 (qyx596), #7825 (weedge), #7784 (sherunlock03), #7773 (JnyRoad).
Core fix unanimous across all six; config toggle from #7838; inter-chunk
delay from #7825.
2026-04-11 12:00:05 -07:00
0xbyt4
3ec8809b78 fix(vision): preserve aspect ratio during auto-resize
Independent halving of width and height caused aspect ratio distortion
for extreme dimensions (e.g. 8000x200 panoramas). When one axis hit the
64px floor, the other kept shrinking — collapsing the ratio toward 1:1.

Use proportional scaling instead: when either dimension hits the floor,
derive the effective scale factor and apply it to both axes.

Add tests for extreme panorama (8000x200) and tall narrow (200x6000)
images to verify aspect ratio preservation.
2026-04-11 11:53:04 -07:00
Teknium
4e3e87b677 feat(migration): preview-then-confirm UX + docs updates
hermes claw migrate now always shows a full dry-run preview before
making any changes. The user reviews what would be imported, then
confirms to proceed. --dry-run stops after the preview. --yes skips
the confirmation prompt.

This matches the existing setup wizard flow (_offer_openclaw_migration)
which already did preview-then-confirm.

Docs updated across both docs/migration/openclaw.md and
website/docs/guides/migrate-from-openclaw.md to reflect:
- New preview-first UX flow
- workspace-main/ fallback paths
- accounts.default channel token layout
- TTS edge/microsoft rename
- openclaw.json env sub-object as API key source
- Hyphenated provider API types
- Matrix accessToken field
- SecretRef file/exec warnings
- Skills session restart note
- WhatsApp re-pairing note
- Archive cleanup step
2026-04-11 11:35:23 -07:00
Teknium
d4bb44d4b9 docs: add Xiaomi MiMo to all provider docs + fix MiMo-V2-Flash ctx len
- environment-variables.md: XIAOMI_API_KEY, XIAOMI_BASE_URL, provider list
- cli-commands.md: --provider choices
- integrations/providers.md: provider table, Chinese providers section,
  config example, base URL list, choosing table, fallback providers list
- fallback-providers.md: supported providers table, auto-detection chain
- Fix XiaomiMiMo/MiMo-V2-Flash context length 32768 → 256000 (OpenRouter entry)
2026-04-11 11:17:52 -07:00
kshitijk4poor
6693e2a497 feat(xiaomi): add Xiaomi MiMo as first-class provider
Cherry-picked from PR #7702 by kshitijk4poor.

Adds Xiaomi MiMo as a direct provider (XIAOMI_API_KEY) with models:
- mimo-v2-pro (1M context), mimo-v2-omni (256K, multimodal), mimo-v2-flash (256K, cheapest)

Standard OpenAI-compatible provider checklist: auth.py, config.py, models.py,
main.py, providers.py, doctor.py, model_normalize.py, model_metadata.py,
models_dev.py, auxiliary_client.py, .env.example, cli-config.yaml.example.

Follow-up: vision tasks use mimo-v2-omni (multimodal) instead of the user's
main model. Non-vision aux uses the user's selected model. Added
_PROVIDER_VISION_MODELS dict for provider-specific vision model overrides.
On failure, falls back to aggregators (gemini flash) via existing fallback chain.

Corrects pre-existing context lengths: mimo-v2-pro 1048576→1000000,
mimo-v2-omni 1048576→256000, adds mimo-v2-flash 256000.

36 tests covering registry, aliases, auto-detect, credentials, models.dev,
normalization, URL mapping, providers module, doctor, aux client, vision
model override, and agent init.
2026-04-11 11:17:52 -07:00
Brooklyn Nicholson
bf6af95ff5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 13:14:36 -05:00
Brooklyn Nicholson
3fd5cf6e3c feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
kshitijk4poor
50bb4fe010 fix(vision): auto-resize oversized images, increase default timeout, fix vision capability detection
Cherry-picked from PR #7749 by kshitijk4poor with modifications:

- Raise hard image limit from 5 MB to 20 MB (matches most restrictive provider)
- Send images at full resolution first; only auto-resize to 5 MB on API failure
- Add _is_image_size_error() helper to detect size-related API rejections
- Auto-resize uses Pillow (soft dep) with progressive downscale + JPEG quality reduction
- Fix get_model_capabilities() to check modalities.input for vision support
- Increase default vision timeout from 30s to 120s (matches hardcoded fallback intent)
- Applied retry-with-resize to both vision_analyze_tool and browser_vision

Closes #7740
2026-04-11 11:12:50 -07:00
Teknium
06e1d9cdd4
fix: resolve three high-impact community bugs (#5819, #6893, #3388) (#7881)
Matrix gateway: fix sync loop never dispatching events (#5819)
- _sync_loop() called client.sync() but never called handle_sync()
  to dispatch events to registered callbacks — _on_room_message was
  registered but never fired for new messages
- Store next_batch token from initial sync and pass as since= to
  subsequent incremental syncs (was doing full initial sync every time)
- 17 comments, confirmed by multiple users on matrix.org

Feishu docs: add interactive card configuration for approvals (#6893)
- Error 200340 is a Feishu Developer Console configuration issue,
  not a code bug — users need to enable Interactive Card capability
  and configure Card Request URL
- Added required 3-step setup instructions to feishu.md
- Added troubleshooting entry for error 200340
- 17 comments from Feishu users

Copilot provider drift: detect GPT-5.x Responses API requirement (#3388)
- GPT-5.x models are rejected on /v1/chat/completions by both OpenAI
  and OpenRouter (unsupported_api_for_model error)
- Added _model_requires_responses_api() to detect models needing
  Responses API regardless of provider
- Applied in __init__ (covers OpenRouter primary users) and in
  _try_activate_fallback() (covers Copilot->OpenRouter drift)
- Fixed stale comment claiming gateway creates fresh agents per message
  (it caches them via _agent_cache since the caching was added)
- 7 comments, reported on Copilot+Telegram gateway
2026-04-11 11:12:20 -07:00
Siddharth Balyan
69f3aaa1d6
fix(matrix): pass required args to MemoryCryptoStore for mautrix ≥0.21 (#7848)
* fix(matrix): pass required args to MemoryCryptoStore for mautrix ≥0.21

MemoryCryptoStore.__init__() now requires account_id and pickle_key
positional arguments as of mautrix 0.21. The migration from matrix-nio
(commit 1850747) didn't account for this, causing E2EE initialization
to fail with:

  MemoryCryptoStore.__init__() missing 2 required positional arguments:
  'account_id' and 'pickle_key'

Pass self._user_id as account_id and derive pickle_key from the same
user_id:device_id pair already used for the on-disk HMAC signature.

Update the test stub to accept the new parameters.

Fixes #7803

* fix: use consistent fallback for pickle_key derivation

Address review: _pickle_key now uses _acct_id (which has the 'hermes'
fallback) instead of raw self._user_id, so both values stay consistent
when user_id is empty.

---------

Co-authored-by: Hermes Agent <hermes@nousresearch.com>
2026-04-11 10:43:49 -07:00
Brooklyn Nicholson
b04248f4d5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor
# Conflicts:
#	gateway/platforms/base.py
#	gateway/run.py
#	tests/gateway/test_command_bypass_active_session.py
2026-04-11 11:39:47 -05:00
kshitijk4poor
af9caec44f fix(qwen): correct context lengths for qwen3-coder models and send max_tokens to portal
Based on PR #7285 by @kshitijk4poor.

Two bugs affecting Qwen OAuth users:

1. Wrong context window — qwen3-coder-plus showed 128K instead of 1M.
   Added specific entries before the generic qwen catch-all:
   - qwen3-coder-plus: 1,000,000 (corrected from PR's 1,048,576 per
     official Alibaba Cloud docs and OpenRouter)
   - qwen3-coder: 262,144

2. Random stopping — max_tokens was suppressed for Qwen Portal, so the
   server applied its own low default. Reasoning models exhaust that on
   thinking tokens. Now: honor explicit max_tokens, default to 65536
   when unset.

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-04-11 03:29:31 -07:00
Teknium
f459214010
feat: background process monitoring — watch_patterns for real-time output alerts
* feat: add watch_patterns to background processes for output monitoring

Adds a new 'watch_patterns' parameter to terminal(background=true) that
lets the agent specify strings to watch for in process output. When a
matching line appears, a notification is queued and injected as a
synthetic message — triggering a new agent turn, similar to
notify_on_complete but mid-process.

Implementation:
- ProcessSession gets watch_patterns field + rate-limit state
- _check_watch_patterns() in ProcessRegistry scans new output chunks
  from all three reader threads (local, PTY, env-poller)
- Rate limited: max 8 notifications per 10s window
- Sustained overload (45s) permanently disables watching for that process
- watch_queue alongside completion_queue, same consumption pattern
- CLI drains watch_queue in both idle loop and post-turn drain
- Gateway drains after agent runs via _inject_watch_notification()
- Checkpoint persistence + crash recovery includes watch_patterns
- Blocked in execute_code sandbox (like other bg params)
- 20 new tests covering matching, rate limiting, overload kill,
  checkpoint persistence, schema, and handler passthrough

Usage:
  terminal(
      command='npm run dev',
      background=true,
      watch_patterns=['ERROR', 'WARN', 'listening on port']
  )

* refactor: merge watch_queue into completion_queue

Unified queue with 'type' field distinguishing 'completion',
'watch_match', and 'watch_disabled' events. Extracted
_format_process_notification() in CLI and gateway to handle
all event types in a single drain loop. Removes duplication
across both CLI drain sites and the gateway.
2026-04-11 03:13:23 -07:00
Hygaard
a2f9f04c06 fix: honor session-scoped gateway model overrides 2026-04-11 03:11:34 -07:00
luyao618
50ad66aee6 test(tools): add unit tests for budget_config module
Cover default constants, BudgetConfig defaults, frozen immutability,
custom construction, and the resolve_threshold() priority chain
(pinned > tool_overrides > registry > default). 20 tests total.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 02:58:48 -07:00
luyao618
80d82c2f5c test(tools): add unit tests for tool_backend_helpers module
Cover all public functions with 50 test cases:
- managed_nous_tools_enabled() feature flag toggling
- normalize_browser_cloud_provider() coercion and defaults
- coerce_modal_mode() / normalize_modal_mode() validation
- has_direct_modal_credentials() env vars and config file detection
- resolve_modal_backend_state() full backend selection matrix
- resolve_openai_audio_api_key() priority chain and edge cases

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 02:58:48 -07:00
Teknium
7241e6134b fix: remove stale test (missing pop_pending), add headers to FakeResponse
Follow-up fixes for cherry-pick conflicts:
- Removed test_context_keeps_pending_approval test that referenced
  pop_pending() which doesn't exist on current main
- Added headers attribute to FakeResponse in vision test (needed
  after #6949 added Content-Length check)
2026-04-11 02:03:20 -07:00
Kenny Xie
ae9a713a0a test(approval): clear leaked bypass state 2026-04-11 02:03:20 -07:00
Kenny Xie
eb8071bbc1 test(gateway): isolate blocking approval env 2026-04-11 02:03:20 -07:00
Kenny Xie
086d92a0e0 test(tools): isolate approval and audio gateway env 2026-04-11 02:03:20 -07:00
Tranquil-Flow
4e56eacdce fix(vision): reject oversized images before API call, handle file:// URIs, improve 400 errors
Three fixes for vision_analyze returning cryptic 400 "Invalid request data":

1. Pre-flight base64 size check — base64 inflates data ~33%, so a 3.8 MB
   file exceeds the 5 MB API limit. Reject early with a clear message
   instead of letting the provider return a generic 400.

2. Handle file:// URIs — strip the scheme and resolve as a local path.
   Previously file:///path/to/image.png fell through to the "invalid
   image source" error since it matched neither is_file() nor http(s).

3. Separate invalid_request errors from "does not support vision" errors
   so the user gets actionable guidance (resize/compress/retry) instead
   of a misleading "model does not support vision" message.

Closes #6677
2026-04-11 02:03:20 -07:00
kagura-agent
4d1f1dccf9 fix: normalize numeric MCP server names to str (fixes #6901)
YAML parses bare numeric keys (e.g. `12306:`) as int, causing
TypeError when sorted() is called on mixed int/str collections.

Changes:
- Normalize toolset_names entries to str in _get_platform_tools()
- Cast MCP server name to str(name) when building enabled_mcp_servers
- Add regression test
2026-04-11 02:03:20 -07:00
jjovalle99
640441b865 feat(tools): add Voxtral TTS provider (Mistral AI) 2026-04-11 01:56:55 -07:00
Teknium
424b62aa16 fix: update async fallback test mock to 5-tuple for api_mode 2026-04-11 01:52:58 -07:00
kshitijk4poor
c89719ad9c fix: warn and clear stale OPENAI_BASE_URL on provider switch (#5161) 2026-04-11 01:52:58 -07:00
kshitijk4poor
eeb8b4b00f fix(auxiliary): harden fallback behavior for non-OpenRouter users
Four fixes to auxiliary_client.py:

1. Respect explicit provider as hard constraint (#7559)
   When auxiliary.{task}.provider is explicitly set (not 'auto'),
   connection/payment errors no longer silently fallback to cloud
   providers. Local-only users (Ollama, vLLM) will no longer get
   unexpected OpenRouter billing from auxiliary tasks.

2. Eliminate model='default' sentinel (#7512)
   _resolve_api_key_provider() no longer sends literal 'default' as
   model name to APIs. Providers without a known aux model in
   _API_KEY_PROVIDER_AUX_MODELS are skipped instead of producing
   model_not_supported errors.

3. Add payment/connection fallback to async_call_llm (#7512)
   async_call_llm now mirrors sync call_llm's fallback logic for
   payment (402) and connection errors. Previously, async consumers
   (session_search, web_tools, vision) got hard failures with no
   recovery. Also fixes hardcoded 'openrouter' fallback to use the
   full auto-detection chain.

4. Use accurate error reason in fallback logs (#7512)
   _try_payment_fallback() now accepts a reason parameter and uses
   it in log messages. Connection timeouts are no longer misleadingly
   logged as 'payment error'.

Closes #7559
Closes #7512
2026-04-11 01:52:58 -07:00
kshitijk4poor
ffbd80f5fc fix(auxiliary): honor api_mode in auxiliary client (#6800)
The auxiliary client always calls client.chat.completions.create(),
ignoring the api_mode config flag. This breaks codex-family models
(e.g. gpt-5.3-codex) on direct OpenAI API keys, which need the
/v1/responses endpoint.

Changes:
- Expand _resolve_task_provider_model to return api_mode (5-tuple)
- Read api_mode from auxiliary.{task}.api_mode config and env vars
  (AUXILIARY_{TASK}_API_MODE)
- Pass api_mode through _get_cached_client to resolve_provider_client
- Add _needs_codex_wrap/_wrap_if_needed helpers that wrap plain OpenAI
  clients in CodexAuxiliaryClient when api_mode=codex_responses or
  when auto-detection finds api.openai.com + codex model pattern
- Apply wrapping at all custom endpoint, named custom provider, and
  API-key provider return paths
- Update test mocks for the new 5-tuple return format

Users can now set:
  auxiliary:
    compression:
      model: gpt-5.3-codex
      base_url: https://api.openai.com/v1
      api_mode: codex_responses

Closes #6800
2026-04-11 01:52:58 -07:00
jamesarch
704488b207 fix(setup): relaunch chat in a fresh process 2026-04-11 01:47:48 -07:00
konsisumer
b87e0f59cc fix(skills): read name from SKILL.md frontmatter in skills_sync
_discover_bundled_skills() used the directory name to identify skills,
but skills_tool.py and skills_hub.py use the `name:` field from SKILL.md
frontmatter.  This mismatch caused 9 builtin skills whose directory name
differs from their SKILL.md name to be written to .bundled_manifest
under the wrong key, so `hermes skills list` showed them as "local"
instead of "builtin".

Read the frontmatter name field (with directory-name fallback) so the
manifest keys match what the rest of the codebase expects.

Closes #6835
2026-04-11 01:21:20 -07:00
kshitijk4poor
d442f25a2f fix: align MiniMax provider with official API docs
Aligns MiniMax provider with official API documentation. Fixes 6 bugs:
transport mismatch (openai_chat -> anthropic_messages), credential leak
in switch_model(), prompt caching sent to non-Anthropic endpoints,
dot-to-hyphen model name corruption, trajectory compressor URL routing,
and stale doctor health check.

Also corrects context window (204,800), thinking support (manual mode),
max output (131,072), and model catalog (M2 family only on /anthropic).

Source: https://platform.minimax.io/docs/api-reference/text-anthropic-api

Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
2026-04-11 01:04:41 -07:00
Kathie1ee
d9f53dba4c
feat(honcho): add opt-in initOnSessionStart for tools mode and respect explicit peerName (#6995)
Two fixes for the honcho memory plugin: (1) initOnSessionStart — opt-in eager session init in tools mode so sync_turn() works from turn 1 (default false, non-breaking). (2) peerName fix — gateway user_id no longer silently overwrites an explicitly configured peerName. 11 new tests. Contributed by @Kathie-yu.
2026-04-11 00:43:27 -07:00
Teknium
caf371da18
fix: MiniMax/Alibaba incorrectly detected as Anthropic OAuth, causing mcp_ tool prefix (#7509)
_is_oauth_token() returned True for any key not starting with 'sk-ant-api',
which means MiniMax and Alibaba API keys were falsely treated as Anthropic
OAuth tokens. This triggered the Claude Code compatibility path:
- All tool names prefixed with mcp_ (e.g. mcp_terminal, mcp_web_search)
- System prompt injected with 'You are Claude Code' identity
- 'Hermes Agent' replaced with 'Claude Code' throughout

Fix: Make _is_oauth_token() positively identify Anthropic OAuth tokens by
their key format instead of using a broad catch-all:
- sk-ant-* (but not sk-ant-api-*) -> setup tokens, managed keys
- eyJ* -> JWTs from Anthropic OAuth flow
- Everything else -> False (MiniMax, Alibaba, etc.)

Reported by stefan171.
2026-04-11 00:43:01 -07:00
Kenny Xie
ecfae98152 fix(gateway): address restart review feedback 2026-04-10 21:18:34 -07:00
aquaright1
a55c044ca8 fix(gateway): self-request service restarts when invoked in-process 2026-04-10 21:18:34 -07:00
Kenny Xie
3163731289 fix(gateway): drain in-flight work before restart 2026-04-10 21:18:34 -07:00
Teknium
241032455c
fix: don't evict cached agent on failed runs — prevents MCP restart loop (#7539)
* fix: circuit breaker stops CPU-burning restart loops on persistent errors

When a gateway session hits a non-retryable error (e.g. invalid model
ID → HTTP 400), the agent fails and returns. But if the session keeps
receiving messages (or something periodically recreates agents), each
attempt spawns a new AIAgent — reinitializing MCP server connections,
burning CPU — only to hit the same 400 error again. On a 4-core server,
this pegs an entire core per stuck session and accumulates 300+ minutes
of CPU time over hours.

Fix: add a per-session consecutive failure counter in the gateway runner.

- Track consecutive non-retryable failures per session key
- After 3 consecutive failures (_MAX_CONSECUTIVE_FAILURES), block
  further agent creation for that session and notify the user:
  '⚠️ This session has failed N times in a row with a non-retryable
  error. Use /reset to start a new session.'
- Evict the cached agent when the circuit breaker engages to prevent
  stale state from accumulating
- Reset the counter on successful agent runs
- Clear the counter on /reset and /new so users can recover
- Uses getattr() pattern so bare GatewayRunner instances (common in
  tests using object.__new__) don't crash

Tests:
- 8 new tests in test_circuit_breaker.py covering counter behavior,
  threshold, reset, session isolation, and bare-runner safety

Addresses #7130.

* Revert "fix: circuit breaker stops CPU-burning restart loops on persistent errors"

This reverts commit d848ea7109d62a2fc4ba6da36fc4f0366b5ded94.

* fix: don't evict cached agent on failed runs — prevents MCP restart loop

When a run fails (e.g. invalid model ID → 400) and fallback activated,
the gateway was evicting the cached agent to 'retry primary next time.'
But evicting a failed agent forces a full AIAgent recreation on the next
message — reinitializing MCP server connections, spawning stdio
processes — only to hit the same 400 again. This created a CPU-burning
loop (91%+ for hours, #7130).

The fix: add `and not _run_failed` to the fallback-eviction check.
Failed runs keep the cached agent. The next message reuses it (no MCP
reinit), hits the same error, returns it to the user quickly. The user
can /reset or /model to fix their config.

Successful fallback runs still evict as before so the next message
retries the primary model.

Addresses #7130.
2026-04-10 21:16:56 -07:00
Kenny Xie
1ffd92cc94 fix(gateway): make manual compression feedback truthful 2026-04-10 21:16:53 -07:00
Kenny Xie
d6c2ad7e41 fix(gateway): make compress responses truthful 2026-04-10 21:16:53 -07:00
luyao618
fc06a0147e fix(tools): remove dead code in _is_likely_binary and harden _check_lint against brace paths
- Remove unreachable `if not content_sample` branch inside the truthy
  `if content_sample` block in `_is_likely_binary()` (dead code that
  could never execute).
- Replace `linter_cmd.format(file=...)` with `linter_cmd.replace("{file}", ...)`
  in `_check_lint()` so file paths containing curly braces (e.g.
  `src/{test}.py`) no longer raise KeyError/ValueError.
- Add 16 unit tests covering both fixes and edge cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:16:53 -07:00
hermes-agent-dhabibi
c1af614289 fix: wrap copilot Responses-API models in CodexAuxiliaryClient for auxiliary tasks
GPT-5+ models (except gpt-5-mini) are only accessible via the Responses
API on Copilot. When these models were configured as the compression
summary_model (or any auxiliary task), the plain OpenAI client sent them
to /chat/completions which returned a 400 error:

    model "gpt-5.4-mini" is not accessible via the /chat/completions endpoint

resolve_provider_client() now checks _should_use_copilot_responses_api()
for the copilot provider and wraps the client in CodexAuxiliaryClient
when needed, routing calls through responses.stream() transparently.

Adds tests for both the wrapping (gpt-5.4-mini) and non-wrapping
(gpt-4.1-mini) paths.
2026-04-10 21:16:53 -07:00
hermes-agent-dhabibi
718e8ad6fa feat(delegation): add configurable reasoning_effort for subagents
Add delegation.reasoning_effort config key so subagents can run at a
different thinking level than the parent agent. When set, overrides
the parent's reasoning_config; when empty, inherits as before.

Valid values: xhigh, high, medium, low, minimal, none (disables thinking).

Config path: delegation.reasoning_effort in config.yaml

Files changed:
- tools/delegate_tool.py: resolve override in _build_child_agent
- hermes_cli/config.py: add reasoning_effort to DEFAULT_CONFIG
- tests/tools/test_delegate.py: 4 new tests covering all cases
2026-04-10 21:16:53 -07:00
Teknium
be9198f1e1 fix: guard mautrix imports for gateway-safe fallback + fix test isolation
Follow-up fixes for the matrix-nio → mautrix migration:

1. Module-level mautrix.types import now wrapped in try/except with
   proper stub classes. Without this, importing gateway.platforms.matrix
   crashes the entire gateway when mautrix isn't installed — even for
   users who don't use Matrix. The stubs mirror mautrix's real attribute
   names so tests that exercise adapter methods (send, reactions, etc.)
   work without the real SDK.

2. Removed _ensure_mautrix_mock() from test_matrix_mention.py — it
   permanently installed MagicMock modules in sys.modules via setdefault(),
   polluting later tests in the suite. No longer needed since the module
   imports cleanly without mautrix.

3. Fixed thread persistence tests to use direct class reference in
   monkeypatch.setattr() instead of string-based paths, which broke
   when the module was reimported by other tests.

4. Moved the module-importability test to a subprocess to prevent it
   from polluting sys.modules (reimporting creates a second module object
   with different __dict__, breaking patch.object in subsequent tests).
2026-04-10 21:15:59 -07:00
alt-glitch
d5be23aed7 docs(matrix): update all references from matrix-nio to mautrix 2026-04-10 21:15:59 -07:00
alt-glitch
417e28f941 test(matrix): update all test mocks for mautrix-python API
Rewrite mock infrastructure across three test files:
- test_matrix.py: replace fake nio module with fake mautrix module tree,
  update all client method mocks to new API names and return types
- test_matrix_voice.py: update event construction, download/upload mocks,
  handler invocation (single event arg, no room object)
- test_matrix_mention.py: update mock module, event construction, DM
  detection via _dm_rooms cache instead of room.member_count

157 tests passing.
2026-04-10 21:15:59 -07:00
alt-glitch
1850747172 refactor(matrix): swap matrix-nio for mautrix-python dependency
matrix-nio pulls in peewee -> atomicwrites (sdist-only, archived,
missing build-system metadata) which breaks nix flake builds.
mautrix-python publishes wheels, has a leaner dep tree, and its
[encryption] extra uses the same python-olm without the problematic
transitive chain.
2026-04-10 21:15:59 -07:00
Teknium
a8fd7257b1
feat(gateway): WSL-aware gateway with smart systemd detection (#7510)
- Add shared is_wsl() to hermes_constants (like is_termux)
- Update supports_systemd_services() to verify systemd is actually
  running on WSL before returning True
- Add WSL-specific guidance in gateway install/start/setup/status
  for both cases: WSL+systemd and WSL without systemd
- Improve help strings: 'run' now says recommended for WSL/Docker,
  'start'/'install' now mention systemd/launchd explicitly
- Add WSL gateway FAQ section with tmux/nohup/Task Scheduler tips
- Update CLI commands docs with WSL tip
- Deduplicate _is_wsl() from clipboard.py to shared hermes_constants
- Fix clipboard tests to reset hermes_constants cache
- 20 new WSL-specific tests covering detection, systemd check,
  supports_systemd_services integration, and command output

Motivated by user feedback: took 1 hour to figure out run vs start
on WSL, Telegram bot kept disconnecting due to flaky WSL systemd.
2026-04-10 21:15:47 -07:00
Hermes Agent
97bb64dbbf test(file_sync): add tests for bulk_upload_fn callback
Cover the three key behaviors:
- bulk_upload_fn is called instead of per-file upload_fn
- Fallback to upload_fn when bulk_upload_fn is None
- Rollback on bulk upload failure retries all files
2026-04-10 21:14:32 -07:00
Teknium
436dfd5ab5 fix: no auto-activation + unified hermes plugins UI with provider categories
- Remove auto-activation: when context.engine is 'compressor' (default),
  plugin-registered engines are NOT used. Users must explicitly set
  context.engine to a plugin name to activate it.

- Add curses_radiolist() to curses_ui.py: single-select radio picker
  with keyboard nav + text fallback, matching curses_checklist pattern.

- Rewrite cmd_toggle() as composite plugins UI:
  Top section: general plugins with checkboxes (existing behavior)
  Bottom section: provider plugin categories (Memory Provider, Context Engine)
  with current selection shown inline. ENTER/SPACE on a category opens
  a radiolist sub-screen for single-select configuration.

- Add provider discovery helpers: _discover_memory_providers(),
  _discover_context_engines(), config read/save for memory.provider
  and context.engine.

- Add tests: radiolist non-TTY fallback, provider config save/load,
  discovery error handling, auto-activation removal verification.
2026-04-10 19:15:50 -07:00
Stephen Schoettler
92382fb00e feat: wire context engine plugin slot into agent and plugin system
- PluginContext.register_context_engine() lets plugins replace the
  built-in ContextCompressor with a custom ContextEngine implementation
- PluginManager stores the registered engine; only one allowed
- run_agent.py checks for a plugin engine at init before falling back
  to the default ContextCompressor
- reset_session_state() now calls engine.on_session_reset() instead of
  poking internal attributes directly
- ContextCompressor.on_session_reset() handles its own internals
  (_context_probed, _previous_summary, etc.)
- 19 new tests covering ABC contract, defaults, plugin slot registration,
  rejection of duplicates/non-engines, and compressor reset behavior
- All 34 existing compressor tests pass unchanged
2026-04-10 19:15:50 -07:00
Teknium
842e669a13
fix: activate fallback provider on repeated empty responses + user-visible status (#7505)
When models return empty responses (no content, no tool calls, no
reasoning), Hermes previously retried 3 times silently then fell through
to '(empty)' — without ever trying the fallback provider chain. Users on
GLM-4.5-Air and similar models experienced what appeared to be a
complete hang, especially in gateway (Telegram/Discord) contexts where
the silent retries produced zero feedback.

Changes:
- After exhausting 3 empty retries, attempt _try_activate_fallback()
  before giving up with '(empty)'. If fallback succeeds, reset retry
  counter and continue the conversation loop with the new provider.
- Replace all _vprint() calls in recovery paths with _emit_status(),
  which surfaces messages through both CLI (_vprint with force=True)
  and gateway (status_callback -> adapter.send). Users now see:
  * '⚠️ Empty response from model — retrying (N/3)' during retries
  * '⚠️ Model returning empty responses — switching to fallback...'
  * '↻ Switched to fallback: <model> (<provider>)' on success
  * ' Model returned no content after all retries [and fallback]'
- Add logger.warning() throughout empty response paths for log file
  visibility (model name, provider, retry counts).
- Upgrade _last_content_with_tools fallback from logger.debug to
  logger.info + _emit_status so recovery is visible.
- Upgrade thinking-only prefill continuation to use _emit_status.

Tests:
- test_empty_response_triggers_fallback_provider: verifies fallback
  activation after 3 empty retries produces content from fallback model
- test_empty_response_fallback_also_empty_returns_empty: verifies
  graceful degradation when fallback also returns empty
- test_empty_response_emits_status_for_gateway: verifies _emit_status
  is called during retries so gateway users see feedback

Addresses #7180.
2026-04-10 19:15:41 -07:00
Bartok Moltbot
992422910c fix(api): send tool progress as custom SSE event to prevent model corruption (#6972)
Tool progress markers (e.g. ` list`) were injected directly into
SSE delta.content chunks. OpenAI-compatible frontends (Open WebUI,
LobeChat, etc.) store delta.content verbatim as the assistant message
and send it back on subsequent requests. After enough turns, the model
learns to emit these markers as plain text instead of issuing real tool
calls — silently hallucinating tool results without ever running them.

Fix: Send tool progress as a custom `event: hermes.tool.progress` SSE
event instead of mixing it into delta.content. Per the SSE spec, clients
that don't understand a custom event type silently ignore it, so this is
backward-compatible. Frontends that want to render progress indicators
can listen for the custom event without persisting it to conversation
history.

The /v1/runs endpoint already uses structured events — this aligns the
/v1/chat/completions streaming path with the same principle.

Closes #6972
2026-04-10 18:55:26 -07:00
Siddharth Balyan
9a0c44f908
fix(nix): gate matrix extra to Linux in [all] profile (#7461)
* fix(nix): gate matrix extra to Linux in [all] profile

matrix-nio[e2e] depends on python-olm which is upstream-broken on modern
macOS (Clang 21+, archived libolm). Previously the [matrix] extra was
completely excluded from [all], meaning NixOS users (who install via [all])
had no Matrix support at all.

Add a sys_platform == 'linux' marker so [all] pulls in [matrix] on Linux
(where python-olm builds fine) while still skipping it on macOS. This
fixes the NixOS setup path without breaking macOS installs.

Update the regression test to verify the Linux-gated marker is present
rather than just checking matrix is absent from [all].

Fixes #4594

* chore: regenerate uv.lock with matrix-on-linux in [all]
2026-04-11 05:59:56 +05:30
0xFrank-eth
e8034e2f6a fix(gateway): replace os.environ session state with contextvars for concurrency safety
When two gateway messages arrived concurrently, _set_session_env wrote
HERMES_SESSION_PLATFORM/CHAT_ID/CHAT_NAME/THREAD_ID into the process-global
os.environ. Because asyncio tasks share the same process, Message B would
overwrite Message A's values mid-flight, causing background-task notifications
and tool calls to route to the wrong thread/chat.

Replace os.environ with Python's contextvars.ContextVar. Each asyncio task
(and any run_in_executor thread it spawns) gets its own copy, so concurrent
messages never interfere.

Changes:
- New gateway/session_context.py with ContextVar definitions, set/clear/get
  helpers, and os.environ fallback for CLI/cron/test backward compatibility
- gateway/run.py: _set_session_env returns reset tokens, _clear_session_env
  accepts them for proper cleanup in finally blocks
- All tool consumers updated: cronjob_tools, send_message_tool, skills_tool,
  terminal_tool (both notify_on_complete AND check_interval blocks), tts_tool,
  agent/skill_utils, agent/prompt_builder
- Tests updated for new contextvar-based API

Fixes #7358

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-04-10 17:04:38 -07:00
Dylan Socolobsky
dab5ec8245 test(e2e): add Slack to parametrized e2e platform tests 2026-04-10 16:51:44 -07:00
Dylan Socolobsky
79565630b0 refactor(e2e): unify Telegram and Discord e2e tests into parametrized platform fixtures 2026-04-10 16:51:44 -07:00
Dylan Socolobsky
7033dbf5d6 test(e2e): add Discord e2e integration tests 2026-04-10 16:51:44 -07:00
pefontana
8414f41856 test: add zombie process cleanup tests
Add 9 tests covering the full zombie process prevention chain:

- TestZombieReproduction: demonstrates that processes survive when
  references are dropped without explicit cleanup (the original bug)
- TestAgentCloseMethod: verifies close() calls all cleanup functions,
  is idempotent, propagates to children, and continues cleanup even
  when individual steps fail
- TestGatewayCleanupWiring: verifies stop() calls close() and that
  _evict_cached_agent() does NOT call close() (since it's also used
  for non-destructive cache refreshes)
- TestDelegationCleanup: calls the real _run_single_child function and
  verifies close() is called on the child agent

Ref: #7131
2026-04-10 16:51:44 -07:00
entropidelic
989b950fbc fix(security): enforce API_SERVER_KEY for non-loopback binding
Add is_network_accessible() helper using Python's ipaddress module to
robustly classify bind addresses (IPv4/IPv6 loopback, wildcards,
mapped addresses, hostname resolution with DNS-failure-fails-closed).

The API server connect() now refuses to start when the bind address is
network-accessible and no API_SERVER_KEY is set, preventing RCE from
other machines on the network.

Co-authored-by: entropidelic <entropidelic@users.noreply.github.com>
2026-04-10 16:51:44 -07:00
Teknium
a4fc38c5b1 test: remove dead TestResolveForcedProvider tests (function doesn't exist on main) 2026-04-10 16:47:44 -07:00
KUSH42
0e939af7c2 fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs
- Bug 1: replace read_file(limit=10000) with read_file_raw in _apply_update,
  preventing silent truncation of files >2000 lines and corruption of lines
  >2000 chars; add read_file_raw to FileOperations abstract interface and
  ShellFileOperations

- Bug 2: split apply_v4a_operations into validate-then-apply phases; if any
  hunk fails validation, zero writes occur (was: continue after failure,
  leaving filesystem partially modified)

- Bug 3: parse_v4a_patch now returns an error for begin-marker-with-no-ops,
  empty file paths, and moves missing a destination (was: always returned
  error=None)

- Bug 4: raise strategy 7 (block anchor) single-candidate similarity threshold
  from 0.10 to 0.50, eliminating false-positive matches in repetitive code

- Bug 5: add _strategy_unicode_normalized (new strategy 7) with position
  mapping via _build_orig_to_norm_map; smart quotes and em-dashes in
  LLM-generated patches now match via strategies 1-6 before falling through
  to fuzzy strategies

- Bug 6: extend fuzzy_find_and_replace to return 4-tuple (content, count,
  error, strategy); update all 5 call sites across patch_parser.py,
  file_operations.py, and skill_manager_tool.py

- Bug 7: guard in _apply_update returns error when addition-only context hint
  is ambiguous (>1 occurrences); validation phase errors on both 0 and >1

- Bug 8: _apply_delete returns error (not silent success) on missing file

- Bug 9: _validate_operations checks source existence and destination absence
  for MOVE operations before any write occurs
2026-04-10 16:47:44 -07:00
Billard
475cbce775 fix(aux): honor api_mode for custom auxiliary endpoints 2026-04-10 16:47:44 -07:00
Awsh1
6f63ba9c8f fix(mcp): fall back when SIGKILL is unavailable 2026-04-10 16:47:44 -07:00
Fran Fitzpatrick
3e24ba1656 feat(matrix): add MATRIX_DM_MENTION_THREADS env var
When enabled, @mentioning the bot in a DM creates a thread (default:
false). Supports both env var and YAML config (matrix.dm_mention_threads).
6 new tests, docs updated.

From #6957
2026-04-10 15:46:20 -07:00
Teknium
ea81aa2eec
fix: guard api_kwargs in except handler to prevent UnboundLocalError (#7376)
When _build_api_kwargs() throws an exception, the except handler in
the retry loop referenced api_kwargs before it was assigned. This
caused an UnboundLocalError that masked the real error, making
debugging impossible for the user.

Two _dump_api_request_debug() calls in the except block (non-retryable
client error path and max-retries-exhausted path) both accessed
api_kwargs without checking if it was assigned.

Fix: initialize api_kwargs = None before the retry loop and guard both
dump calls. Now the real error surfaces instead of the masking
UnboundLocalError.

Reported by Discord user gruman0.
2026-04-10 15:12:00 -07:00
Teknium
496e378b10
fix: resolve overlay provider slug mismatch in /model picker (#7373)
HERMES_OVERLAYS keys use models.dev IDs (e.g. 'github-copilot') but
_PROVIDER_MODELS curated lists and config.yaml use Hermes provider IDs
('copilot'). list_authenticated_providers() Section 2 was using the
overlay key directly for model lookups and is_current checks, causing:
- 0 models shown for copilot, kimi, kilo, opencode, vercel
- is_current never matching the config provider

Fix: build reverse mapping from PROVIDER_TO_MODELS_DEV to translate
overlay keys to Hermes slugs before curated list lookup and result
construction. Also adds 'kimi-for-coding' alias in auth.py so the
picker's returned slug resolves correctly in resolve_provider().

Fixes #5223. Based on work by HearthCore (#6492) and linxule (#6287).

Co-authored-by: HearthCore <HearthCore@users.noreply.github.com>
Co-authored-by: linxule <linxule@users.noreply.github.com>
2026-04-10 14:46:57 -07:00
Julien Talbot
8bcb8b8e87 feat(providers): add native xAI provider
Adds xAI as a first-class provider: ProviderConfig in auth.py,
HermesOverlay in providers.py, 11 curated Grok models, URL mapping
in model_metadata.py, aliases (x-ai, x.ai), and env var tests.
Uses standard OpenAI-compatible chat completions.

Closes #7050
2026-04-10 13:40:38 -07:00
Teknium
363d5d57be test: update schema assertion after maxItems removal 2026-04-10 13:38:14 -07:00
angelos
7ccdb74364 fix(delegate): make max_concurrent_children configurable + error on excess
`delegate_task` silently truncated batch tasks to 3 — the model sends
5 tasks, gets results for 3, never told 2 were dropped. Now returns a
clear tool_error explaining the limit and how to fix it.

The limit is configurable via:
  - delegation.max_concurrent_children in config.yaml (priority 1)
  - DELEGATION_MAX_CONCURRENT_CHILDREN env var (priority 2)
  - default: 3

Uses the same _load_config() path as the rest of delegate_task for
consistent config priority. Clamps to min 1, warns on non-integer
config values.

Also removes the hardcoded maxItems: 3 from the JSON schema — the
schema was blocking the model from even attempting >3 tasks before
the runtime check could fire. The runtime check gives a much more
actionable error message.

Backwards compatible: default remains 3, existing configs unchanged.
2026-04-10 13:38:14 -07:00
Teknium
4fb42d0193
fix: per-profile subprocess HOME isolation (#4426) (#7357)
Isolate system tool configs (git, ssh, gh, npm) per profile by injecting
a per-profile HOME into subprocess environments only.  The Python
process's own os.environ['HOME'] and Path.home() are never modified,
preserving all existing profile infrastructure.

Activation is directory-based: when {HERMES_HOME}/home/ exists on disk,
subprocesses see it as HOME.  The directory is created automatically for:
- Docker: entrypoint.sh bootstraps it inside the persistent volume
- Named profiles: added to _PROFILE_DIRS in profiles.py

Injection points (all three subprocess env builders):
- tools/environments/local.py _make_run_env() — foreground terminal
- tools/environments/local.py _sanitize_subprocess_env() — background procs
- tools/code_execution_tool.py child_env — execute_code sandbox

Single source of truth: hermes_constants.get_subprocess_home()

Closes #4426
2026-04-10 13:37:45 -07:00
Teknium
360b21ce95
fix(gateway): reject file paths in get_command() + file-drop tests (#7356)
Gateway get_command() now rejects paths containing /. Also adds 28 _detect_file_drop regression tests. From #6978 (@ygd58) and #6963 (@betamod).
2026-04-10 13:06:02 -07:00
kshitijk4poor
37a1c75716 fix(browser): hardening — dead code, caching, scroll perf, security, thread safety
Salvaged from PR #7276 (hardening-only subset; excluded 6 new tools
and unrelated scope additions from the contributor's commit).

- Remove dead DEFAULT_SESSION_TIMEOUT and unregistered browser_close schema
- Fix _camofox_eval wrong call signatures (_ensure_tab, _post args)
- Cache _find_agent_browser, _get_command_timeout, _discover_homebrew_node_dirs
- Replace 5x subprocess scroll loop with single pixel-arg call
- URL-decode before secret exfiltration check (bypass prevention)
- Protect _recording_sessions with _cleanup_lock (thread safety)
- Return failure on empty stdout instead of silent success
- Structure-aware _truncate_snapshot (cut at line boundaries)

Follow-up improvements over contributor's original:
- Move _EMPTY_OK_COMMANDS to module-level frozenset (avoid per-call allocation)
- Fix list+tuple concat in _run_browser_command PATH construction
- Update test_browser_homebrew_paths.py for tuple returns and cache fixtures

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Closes #7168, closes #7171, closes #7172, closes #7173
2026-04-10 13:05:44 -07:00
WAXLYY
c6e1add6f1 fix(agent): preserve quoted @file references with spaces 2026-04-10 13:05:01 -07:00
Hermes Audit
2c99b4e79b fix(unicode): sanitize surrogate metadata and allow two-pass retry 2026-04-10 13:05:01 -07:00