hermes-agent/tests
Teknium 327b57da91
fix(gateway): kill tool subprocesses before adapter disconnect on drain timeout (#14728)
Closes #8202.

Root cause: stop() reclaimed tool-call bash/sleep children only at the
very end of the shutdown sequence — after a 60s drain, 5s interrupt
grace, and per-adapter disconnect. Under systemd (TimeoutStopSec bounded
by drain_timeout), that meant the cgroup SIGKILL escalation fired first,
and systemd reaped the bash/sleep children instead of us.

Fix:
- Extract tool-subprocess cleanup into a local helper
  _kill_tool_subprocesses() in _stop_impl().
- Invoke it eagerly right after _interrupt_running_agents() on the
  drain-timeout path, before adapter disconnect.
- Keep the existing catch-all call at the end for the graceful path
  and defense in depth against mid-teardown respawns.
- Bump generated systemd unit TimeoutStopSec to drain_timeout + 30s
  so cleanup + disconnect + DB close has headroom above the drain
  budget, matching the 'subprocess timeout > TimeoutStopSec + margin'
  rule from the skill.

Tests:
- New: test_gateway_stop_kills_tool_subprocesses_before_adapter_disconnect_on_timeout
  asserts kill_all() runs before disconnect() when drain times out.
- New: test_gateway_stop_kills_tool_subprocesses_on_graceful_path
  guards that the final catch-all still fires when drain succeeds
  (regression guard against accidental removal during refactor).
- Updated: existing systemd unit generator tests expect TimeoutStopSec=90
  (= 60s drain + 30s headroom) with explanatory comment.
2026-04-23 13:59:29 -07:00
..
acp
agent fix(auth): refuse to touch real auth.json during pytest; delete sandbox-escaping test (#14729) 2026-04-23 13:50:21 -07:00
cli test(approval): regression guards for thread-local callback contract 2026-04-21 14:29:08 -07:00
cron
e2e
environments/benchmarks
fakes
gateway fix(gateway): kill tool subprocesses before adapter disconnect on drain timeout (#14728) 2026-04-23 13:59:29 -07:00
hermes_cli fix(gateway): kill tool subprocesses before adapter disconnect on drain timeout (#14728) 2026-04-23 13:59:29 -07:00
honcho_plugin
integration
plugins feat(image_gen): add openai-codex plugin (gpt-image-2 via Codex OAuth) (#14317) 2026-04-22 20:43:21 -07:00
run_agent refactor: remove _nr_to_assistant_message shim + fix flush_memories guard 2026-04-23 02:30:05 -07:00
skills
tools fix(delegate): remove model-facing max_iterations override; config is authoritative (#14732) 2026-04-23 13:56:26 -07:00
tui_gateway Merge branch 'main' into fix/tui-provider-resolution 2026-04-22 11:47:49 -07:00
__init__.py
conftest.py
run_interrupt_test.py
test_account_usage.py
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_cli_file_drop.py
test_cli_skin_integration.py fix: align status bar skin tests with upstream main 2026-04-22 13:20:02 -07:00
test_ctx_halving_fix.py
test_empty_model_fallback.py
test_evidence_store.py
test_hermes_constants.py
test_hermes_logging.py
test_hermes_state.py feat(dashboard): track real API call count per session 2026-04-22 05:51:58 -07:00
test_honcho_client_config.py
test_ipv4_preference.py
test_mcp_serve.py
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools_async_bridge.py fix(core): ensure non-blocking executor shutdown on async timeout 2026-04-22 14:42:32 -07:00
test_model_tools.py
test_ollama_num_ctx.py
test_packaging_metadata.py
test_plugin_skills.py
test_project_metadata.py
test_retry_utils.py
test_sql_injection.py
test_subprocess_home_isolation.py
test_timezone.py
test_toolset_distributions.py
test_toolsets.py
test_trajectory_compressor_async.py
test_trajectory_compressor.py
test_transform_tool_result_hook.py
test_tui_gateway_server.py Merge pull request #14135 from helix4u/fix/tui-state-db-optional 2026-04-22 20:11:07 -05:00
test_utils_truthy_values.py