hermes-agent/tests/tools
Teknium 20f2258f34
fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace (#11907)
* fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace

interrupt() previously flagged only the agent's _execution_thread_id.
Tools dispatched by _execute_tool_calls_concurrent run on
ThreadPoolExecutor worker threads, whose tids differ from the agent's,
so is_interrupted() inside those tools returned False no matter how
many times the gateway called .interrupt(). Hung ssh, curl, and long
make builds therefore ran until their own timeouts.

Changes:
- run_agent.py: track concurrent-tool worker tids in a per-agent set,
  fan interrupt()/clear_interrupt() out to them, and handle the
  register-after-interrupt race at _run_tool entry.  getattr fallback
  for the tracker so test stubs built via object.__new__ keep working.
- tools/environments/base.py: opt-in _wait_for_process trace (ENTER,
  per-30s HEARTBEAT with interrupt+activity-cb state, INTERRUPT
  DETECTED, TIMEOUT, EXIT) behind HERMES_DEBUG_INTERRUPT=1.
- tools/interrupt.py: opt-in set_interrupt() trace (caller tid, target
  tid, set snapshot) behind the same env flag.
- tests: new regression test runs a polling tool on a concurrent worker
  and asserts is_interrupted() flips to True within ~1s of interrupt().
  Second new test guards clear_interrupt() clearing tracked worker bits.

Validation: tests/run_agent/ all 762 pass; tests/tools/ interrupt+env
subset 216 pass.
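
The fan-out described above can be sketched roughly as follows. This is
a minimal illustration, not the code in run_agent.py or
tools/interrupt.py: the Agent class, the _worker_tids and
_interrupt_requested attributes, the set_interrupt(active, tid)
signature, and polling_tool are all assumptions made for the example.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for tools/interrupt.py: a set of flagged thread ids.
_interrupted_tids = set()
_flag_lock = threading.Lock()


def set_interrupt(active, tid):
    """Set or clear the interrupt flag for one thread id (assumed signature)."""
    with _flag_lock:
        if active:
            _interrupted_tids.add(tid)
        else:
            _interrupted_tids.discard(tid)


def is_interrupted():
    """True if the calling thread has been flagged."""
    with _flag_lock:
        return threading.get_ident() in _interrupted_tids


class Agent:
    """Toy agent that tracks the tids of its concurrent-tool workers."""

    def __init__(self):
        self._execution_thread_id = threading.get_ident()
        self._worker_tids = set()          # hypothetical tracker name
        self._interrupt_requested = False  # hypothetical flag name
        self._tids_lock = threading.Lock()

    def _targets(self):
        return {self._execution_thread_id} | self._worker_tids

    def interrupt(self):
        with self._tids_lock:
            self._interrupt_requested = True
            targets = self._targets()
        for tid in targets:
            set_interrupt(True, tid)

    def clear_interrupt(self):
        with self._tids_lock:
            self._interrupt_requested = False
            targets = self._targets()
        for tid in targets:
            set_interrupt(False, tid)

    def _run_tool(self, tool):
        tid = threading.get_ident()
        with self._tids_lock:
            self._worker_tids.add(tid)
            late = self._interrupt_requested
        if late:
            # Register-after-interrupt race: interrupt() ran before this
            # worker was tracked, so flag the worker ourselves.
            set_interrupt(True, tid)
        try:
            return tool()
        finally:
            with self._tids_lock:
                self._worker_tids.discard(tid)
            # Pool threads are reused; do not leak a stale flag.
            set_interrupt(False, tid)


def polling_tool(started, timeout=5.0):
    """Tool body that polls is_interrupted() the way a wait loop would."""
    started.set()
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if is_interrupted():
            return "interrupted"
        time.sleep(0.01)
    return "timeout"


def demo():
    agent = Agent()
    started = threading.Event()
    with ThreadPoolExecutor(max_workers=2) as pool:
        fut = pool.submit(agent._run_tool, lambda: polling_tool(started))
        started.wait(2)
        agent.interrupt()
        result = fut.result(timeout=3)
    agent.clear_interrupt()
    return result
```

Calling demo() runs the polling tool on a pool worker and interrupts it
from the submitting thread. Without the worker tid in the tracked set,
the tool would only return once its own timeout elapsed.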

* fix(interrupt-debug): bypass quiet_mode logger filter so trace reaches agent.log

AIAgent.__init__ sets logging.getLogger('tools').setLevel(ERROR) when
quiet_mode=True (the CLI default). This would silently swallow every
INFO-level trace line from the HERMES_DEBUG_INTERRUPT=1 instrumentation
added in the parent commit — confirmed by running hermes chat -q with
the flag and finding zero trace lines in agent.log even though
_wait_for_process was clearly executing (subprocess pid existed).

Fix: when HERMES_DEBUG_INTERRUPT=1, each traced module explicitly sets
its own logger level to INFO at import time, overriding the 'tools'
parent-level filter. Scoped to the opt-in case only, so production
(quiet_mode default) logs stay quiet as designed.

Validation: hermes chat -q with HERMES_DEBUG_INTERRUPT=1 now writes
'_wait_for_process ENTER/EXIT' lines to agent.log as expected.
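
The logger behaviour behind this fix can be shown with the standard
logging module alone. In this sketch the logger names
tools.example_quiet and tools.example_traced and the helper
enable_trace_if_requested are illustrative assumptions; only the
'tools' parent logger and the HERMES_DEBUG_INTERRUPT flag come from the
commit.

```python
import logging
import os


def enable_trace_if_requested(logger_name, env=None):
    """Return the named logger, raised to INFO only when the flag is set.

    A logger with an explicit level stops inheriting the effective level
    of its parent, so this overrides an ERROR level on 'tools'.
    """
    env = os.environ if env is None else env
    logger = logging.getLogger(logger_name)
    if env.get("HERMES_DEBUG_INTERRUPT") == "1":
        logger.setLevel(logging.INFO)
    return logger


def demo_levels():
    # What quiet_mode does to the parent logger, per the commit message.
    logging.getLogger("tools").setLevel(logging.ERROR)
    quiet = enable_trace_if_requested("tools.example_quiet", env={})
    traced = enable_trace_if_requested(
        "tools.example_traced", env={"HERMES_DEBUG_INTERRUPT": "1"}
    )
    return quiet.isEnabledFor(logging.INFO), traced.isEnabledFor(logging.INFO)
```

The untraced child inherits ERROR from 'tools' and drops INFO records,
while the traced child accepts them. Handler levels still apply
downstream, so this only removes the logger-level filter.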

* fix(cli): SIGTERM/SIGHUP no longer orphans tool subprocesses

Tool subprocesses spawned by the local environment backend use
os.setsid so they run in their own process group. Before this fix,
SIGTERM/SIGHUP to the hermes CLI killed the main thread via
KeyboardInterrupt but the worker thread running _wait_for_process
never got a chance to call _kill_process — Python exited, the child
was reparented to init (PPID=1), and the subprocess ran to its
natural end (confirmed live: sleep 300 survived 4+ min after SIGTERM
to the agent until manual cleanup).

Changes:
- cli.py _signal_handler (interactive) + _signal_handler_q (-q mode):
  route SIGTERM/SIGHUP through agent.interrupt() so the worker's poll
  loop sees the per-thread interrupt flag and calls _kill_process
  (os.killpg) on the subprocess group. HERMES_SIGTERM_GRACE (default
  1.5s) gives the worker time to complete its SIGTERM+SIGKILL
  escalation before KeyboardInterrupt unwinds main.
- tools/environments/base.py _wait_for_process: wrap the poll loop in
  try/except (KeyboardInterrupt, SystemExit) so the cleanup fires
  even on paths the signal handlers don't cover (direct sys.exit,
  unhandled KI from nested code, etc.). Emits EXCEPTION_EXIT trace
  line when HERMES_DEBUG_INTERRUPT=1.
- New regression test: injects KeyboardInterrupt into a running
  _wait_for_process via PyThreadState_SetAsyncExc, verifies the
  subprocess process group is dead within 3s of the exception and
  that KeyboardInterrupt re-raises cleanly afterward.
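
The poll-loop cleanup described above might look roughly like this. It
is a POSIX-only sketch under stated assumptions, not the code in
tools/environments/base.py: wait_for_process, kill_process_group, the
grace and poll parameters, and the flaky_check callback are names
invented for the example, and start_new_session=True is used as the
subprocess-module equivalent of calling os.setsid in the child.

```python
import os
import signal
import subprocess
import sys
import time


def kill_process_group(proc, grace=1.0):
    """SIGTERM the child's process group, escalating to SIGKILL after grace."""
    try:
        pgid = os.getpgid(proc.pid)
        os.killpg(pgid, signal.SIGTERM)
    except ProcessLookupError:
        return  # group is already gone
    try:
        proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        try:
            os.killpg(pgid, signal.SIGKILL)
        except ProcessLookupError:
            pass
        proc.wait()


def wait_for_process(proc, is_interrupted, timeout=60.0, poll=0.05):
    """Poll until exit, interrupt, or timeout; always clean up the group."""
    deadline = time.monotonic() + timeout
    try:
        while proc.poll() is None:
            if is_interrupted() or time.monotonic() >= deadline:
                kill_process_group(proc)
                break
            time.sleep(poll)
    except (KeyboardInterrupt, SystemExit):
        # Paths the signal handlers do not cover: kill, then re-raise.
        kill_process_group(proc)
        raise
    return proc.returncode


def demo():
    """Raise KeyboardInterrupt inside the loop and report the child's status."""
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        start_new_session=True,  # child leads its own session and group
    )
    calls = {"n": 0}

    def flaky_check():
        calls["n"] += 1
        if calls["n"] >= 2:
            raise KeyboardInterrupt
        return False

    try:
        wait_for_process(proc, flaky_check)
    except KeyboardInterrupt:
        return proc.poll()  # not None once the child has been reaped
    return None
```

The regression test in the commit injects the exception with
PyThreadState_SetAsyncExc. The callback here is a simpler substitute
that exercises the same except branch.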

Validation:
| Before                                                   | After                                            |
|----------------------------------------------------------|--------------------------------------------------|
| sleep 300 survives 4+ min as PPID=1 orphan after SIGTERM | dies within 2 s                                  |
| No INTERRUPT DETECTED in trace                           | INTERRUPT DETECTED fires + killing process group |
| tests/tools/test_local_interrupt_cleanup                 | 1/1 pass                                         |
| tests/run_agent/test_concurrent_interrupt                | 4/4 pass                                         |
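
For the cli.py side of this change, a handler that routes
SIGTERM/SIGHUP through agent.interrupt() could be shaped like the
sketch below. FakeAgent, make_signal_handler, the injectable sleep
argument, and reading HERMES_SIGTERM_GRACE from the environment as a
float are assumptions for illustration. The commit names only the
setting and its 1.5s default.

```python
import os
import signal
import time


class FakeAgent:
    """Hypothetical stand-in exposing only interrupt()."""

    def __init__(self):
        self.interrupted = False

    def interrupt(self):
        self.interrupted = True


def grace_seconds(env=None):
    """Read the grace period, falling back to the documented 1.5s default."""
    env = os.environ if env is None else env
    return float(env.get("HERMES_SIGTERM_GRACE", "1.5"))


def make_signal_handler(agent, grace=None, sleep=time.sleep):
    """Build a handler that interrupts workers before unwinding main."""
    if grace is None:
        grace = grace_seconds()

    def handler(signum, frame):
        agent.interrupt()  # worker poll loops see their flag and kill the group
        sleep(grace)       # give workers time for SIGTERM+SIGKILL escalation
        raise KeyboardInterrupt

    return handler


def demo():
    agent = FakeAgent()
    handler = make_signal_handler(agent, grace=0.0)
    try:
        handler(signal.SIGTERM, None)  # invoked directly; nothing is installed
    except KeyboardInterrupt:
        return agent.interrupted
    return False
```

In a real CLI the handler would be installed from the main thread with
signal.signal(signal.SIGTERM, handler), and likewise for SIGHUP where
the platform defines it.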
2026-04-17 20:39:25 -07:00
__init__.py
test_accretion_caps.py fix(tools): bound _read_tracker sub-containers + prune _completion_consumed (#11839) 2026-04-17 15:53:57 -07:00
test_ansi_strip.py
test_approval_heartbeat.py fix(approval): heartbeat activity during gateway approval wait (#11245) 2026-04-16 14:48:50 -07:00
test_approval.py fix(kimi): cover remaining fixed-temperature bypasses 2026-04-17 20:25:42 -07:00
test_base_environment.py
test_browser_camofox_persistence.py docs: remove nonexistent CAMOFOX_PROFILE_DIR env var references (#10976) 2026-04-16 04:07:11 -07:00
test_browser_camofox_state.py feat(discord): add channel_prompts config 2026-04-15 16:31:28 -07:00
test_browser_camofox.py fix: /browser connect CDP override now takes priority over Camofox (#10523) 2026-04-15 14:11:18 -07:00
test_browser_cdp_override.py Support browser CDP URL from config 2026-04-17 16:05:04 -07:00
test_browser_cleanup.py
test_browser_cloud_fallback.py fix(browser): runtime fallback to local Chromium when cloud provider fails 2026-04-16 04:19:34 -07:00
test_browser_console.py
test_browser_content_none_guard.py
test_browser_hardening.py fix(browser): hardening — dead code, caching, scroll perf, security, thread safety 2026-04-10 13:05:44 -07:00
test_browser_homebrew_paths.py fix(browser): add termux PATH fallbacks 2026-04-14 16:55:55 -07:00
test_browser_orphan_reaper.py fix: two process leaks (agent-browser daemons, paste.rs sleepers) (#11843) 2026-04-17 18:46:30 -07:00
test_browser_secret_exfil.py
test_browser_ssrf_local.py
test_budget_config.py test(tools): add unit tests for budget_config module 2026-04-11 02:58:48 -07:00
test_checkpoint_manager.py fix(checkpoints): isolate shadow git repo from user's global config (#11261) 2026-04-16 16:06:49 -07:00
test_clarify_tool.py
test_clipboard.py feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
test_code_execution.py fix: follow-up for salvaged PR #10854 2026-04-16 06:42:45 -07:00
test_command_guards.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_config_null_guard.py
test_credential_files.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_cron_prompt_injection.py
test_cronjob_tools.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_daytona_environment.py
test_debug_helpers.py
test_delegate_toolset_scope.py
test_delegate.py feat(delegation): add configurable reasoning_effort for subagents 2026-04-10 21:16:53 -07:00
test_docker_environment.py fix(tests): fix several failing/flaky tests on main (#6777) 2026-04-09 13:17:06 -07:00
test_docker_find.py feat: entry-level Podman support — find_docker() + rootless entrypoint (#10066) 2026-04-14 21:20:37 -07:00
test_env_passthrough.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_feishu_tools.py feat: add Feishu document comment intelligent reply with 3-tier access control 2026-04-17 19:04:11 -07:00
test_file_operations_edge_cases.py fix(tools): remove dead code in _is_likely_binary and harden _check_lint against brace paths 2026-04-10 21:16:53 -07:00
test_file_operations.py fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs 2026-04-10 16:47:44 -07:00
test_file_ops_cwd_tracking.py fix(file-ops): follow terminal env's live cwd in _exec instead of init-time cached cwd (#11912) 2026-04-17 19:26:40 -07:00
test_file_read_guards.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_file_staleness.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_file_sync_back.py fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards 2026-04-16 19:39:21 -07:00
test_file_sync_perf.py test: add reproducible perf benchmark for file sync overhead 2026-04-10 03:01:46 -07:00
test_file_sync.py test(file_sync): add tests for bulk_upload_fn callback 2026-04-10 21:14:32 -07:00
test_file_tools_live.py
test_file_tools.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_file_write_safety.py fix(file_tools): block /private/etc writes on macOS symlink bypass 2026-04-13 05:15:05 -07:00
test_force_dangerous_override.py
test_fuzzy_match.py fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs 2026-04-10 16:47:44 -07:00
test_hidden_dir_filter.py
test_homeassistant_tool.py fix: clean up description escaping, add string-data tests 2026-04-13 04:45:07 -07:00
test_image_generation.py feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406) 2026-04-16 22:05:41 -07:00
test_interrupt.py fix: resolve remaining 4 CI test failures (#9543) 2026-04-14 02:18:38 -07:00
test_llm_content_none_guard.py
test_local_env_blocklist.py fix(providers): complete NVIDIA NIM parity with other providers 2026-04-17 13:47:46 -07:00
test_local_interrupt_cleanup.py fix(interrupt): propagate to concurrent-tool workers + opt-in debug trace (#11907) 2026-04-17 20:39:25 -07:00
test_local_tempdir.py fix(termux): honor temp dirs for local temp artifacts 2026-04-09 16:24:53 -07:00
test_managed_browserbase_and_modal.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_managed_media_gateways.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_managed_modal_environment.py fix: add activity heartbeats to prevent false gateway inactivity timeouts (#10501) 2026-04-15 13:29:05 -07:00
test_managed_server_tool_support.py fix(tests): fix several failing/flaky tests on main (#6777) 2026-04-09 13:17:06 -07:00
test_managed_tool_gateway.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_mcp_dynamic_discovery.py fix(mcp): make server aliases explicit 2026-04-14 17:19:20 -07:00
test_mcp_oauth_integration.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_oauth_manager.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_oauth.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_probe.py
test_mcp_reconnect_signal.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_stability.py fix: add vLLM/local server error patterns + MCP initial connection retry (#9281) 2026-04-13 18:46:14 -07:00
test_mcp_structured_content.py fix(mcp): combine content and structuredContent when both present (#7118) 2026-04-10 03:44:35 -07:00
test_mcp_tool_401_handling.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_tool_issue_948.py
test_mcp_tool.py fix(mcp): make server aliases explicit 2026-04-14 17:19:20 -07:00
test_memory_tool_import_fallback.py fix(tools): keep memory tool available when fcntl is unavailable 2026-04-14 10:18:05 -07:00
test_memory_tool.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_mixture_of_agents_tool.py
test_modal_bulk_upload.py perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive (#8014) 2026-04-12 06:18:05 +05:30
test_modal_sandbox_fixes.py
test_modal_snapshot_isolation.py fix(tests): update mocks for file sync changes 2026-04-10 03:01:46 -07:00
test_notify_on_complete.py fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228) 2026-04-12 00:36:22 -07:00
test_osv_check.py
test_parse_env_var.py
test_patch_parser.py fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs 2026-04-10 16:47:44 -07:00
test_process_registry.py fix(gateway): propagate user identity through process watcher pipeline 2026-04-11 13:46:16 -07:00
test_read_loop_detection.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_registry.py fix(tools): auto-discover built-in tool modules 2026-04-14 21:12:29 -07:00
test_rl_training_tool.py
test_search_hidden_dirs.py
test_send_message_missing_platforms.py fix(send_message): deliver Matrix media via adapter 2026-04-15 17:37:43 -07:00
test_send_message_tool.py fix(discord): forum channel media + polish 2026-04-17 20:25:48 -07:00
test_session_search.py fix(session_search): coerce limit to int to prevent TypeError with non-int values (#10522) 2026-04-15 14:11:05 -07:00
test_singularity_preflight.py
test_skill_env_passthrough.py fix: remove 115 verified dead code symbols across 46 production files 2026-04-10 03:44:43 -07:00
test_skill_improvements.py
test_skill_manager_tool.py refactor: extract shared helpers to deduplicate repeated code patterns (#7917) 2026-04-11 13:59:52 -07:00
test_skill_size_limits.py
test_skill_view_path_check.py
test_skill_view_traversal.py
test_skills_guard.py
test_skills_hub_clawhub.py
test_skills_hub.py fix: update 6 test files broken by dead code removal 2026-04-10 03:44:43 -07:00
test_skills_sync.py feat(skills): add 'hermes skills reset' to un-stick bundled skills (#11468) 2026-04-17 00:41:31 -07:00
test_skills_tool.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_ssh_bulk_upload.py perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive (#8014) 2026-04-12 06:18:05 +05:30
test_ssh_environment.py fix(tests): update mocks for file sync changes 2026-04-10 03:01:46 -07:00
test_symlink_prefix_confusion.py
test_sync_back_backends.py fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards 2026-04-16 19:39:21 -07:00
test_terminal_exit_semantics.py
test_terminal_foreground_timeout_cap.py fix: reject foreground timeout above cap instead of clamping 2026-04-10 02:58:54 -07:00
test_terminal_none_command_guard.py
test_terminal_requirements.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_terminal_timeout_output.py
test_terminal_tool_pty_fallback.py feat: add tested Termux install path and EOF-aware gh auth 2026-04-09 16:24:53 -07:00
test_terminal_tool_requirements.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_terminal_tool.py fix terminal workdir validation for Windows paths 2026-04-15 15:06:51 -07:00
test_threaded_process_handle.py
test_tirith_security.py
test_todo_tool.py fix(tools): enforce ID uniqueness in TODO store during replace operations 2026-04-11 16:22:50 -07:00
test_tool_backend_helpers.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_tool_call_parsers.py
test_tool_result_storage.py fix(tools): neutralize shell injection in _write_to_sandbox via path quoting (#7940) 2026-04-11 14:26:11 -07:00
test_transcription_tools.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_transcription.py
test_tts_gemini.py feat(tts): add Google Gemini TTS provider (#11229) 2026-04-16 14:23:16 -07:00
test_tts_mistral.py feat(tools): add Voxtral TTS provider (Mistral AI) 2026-04-11 01:56:55 -07:00
test_tts_speed.py test(tts): add speed config tests for Edge, OpenAI, and MiniMax 2026-04-12 16:46:18 -07:00
test_url_safety.py fix: allow trusted QQ CDN benchmark IP resolution 2026-04-17 04:22:40 -07:00
test_vision_tools.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
test_voice_cli_integration.py fix(tests): fix 78 CI test failures and remove dead test (#9036) 2026-04-13 10:50:24 -07:00
test_voice_mode.py fix(termux): tighten voice setup and mobile chat UX 2026-04-09 16:24:53 -07:00
test_watch_patterns.py fix(gateway): route synthetic background events by session 2026-04-15 11:16:01 -07:00
test_web_tools_config.py test: remove 169 change-detector tests across 21 files (#11472) 2026-04-17 01:05:09 -07:00
test_web_tools_tavily.py fix(tests): fix several failing/flaky tests on main (#6777) 2026-04-09 13:17:06 -07:00
test_website_policy.py
test_windows_compat.py
test_write_deny.py
test_yolo_mode.py fix(gateway): scope /yolo to the active session 2026-04-10 03:38:44 -07:00
test_zombie_process_cleanup.py fix(tests): fix 78 CI test failures and remove dead test (#9036) 2026-04-13 10:50:24 -07:00