Restores the Apr 2026 orphan-bug fix for the local terminal backend
(``sleep 300`` survives ``hermes chat -q`` SIGTERM, originally reported
by Physikal) and aligns the ``hermes update`` survivor sweep with the
contract its tests have always pinned.
Three things move:
1. ``tools/environments/local.py:_kill_process``
   - Was: SIGTERM → wait up to 1s polling ``os.killpg(pgid, 0)`` →
     SIGKILL → wait up to 2s on the same probe.
   - Now: SIGKILL directly + ``proc.wait(timeout=0.5)`` to reap the
     wrapper.
   - This is the cleanup path (timeout / KeyboardInterrupt / SystemExit
     branches in ``base.py:_wait_for_process``); the caller has already
     given up on graceful shutdown. The previous shape blew tight test
     budgets under runner load and, more importantly, the post-kill
     liveness probe could not distinguish zombies from running
     processes. In containers without a PID-1 reaper (tini/dumb-init)
     it sat at its 2s ceiling waiting for kernel bookkeeping that
     would never happen, surfacing as the ``orphan bug regressed``
     false positive on
     ``test_wait_for_process_kills_subprocess_on_keyboardinterrupt``.
2. ``tests/tools/test_local_interrupt_cleanup.py``
   - ``_pgid_still_alive``: switch from ``os.killpg(pgid, 0)`` to
     ``ps -g STAT`` so zombies are not reported as alive.
   - ``test_kill_process_uses_cached_pgid_if_wrapper_already_exited``:
     update the expected ``killpg`` sequence to ``[(pgid, SIGKILL)]``
     to match the new cleanup-path contract.
3. ``hermes_cli/main.py:cmd_update`` post-restart survivor sweep
   - The sweep added in #18409 (issue #17648) escalates a SIGTERM'd PID
     to SIGKILL after a 3s grace, so a gateway that genuinely ignores
     SIGTERM gets force-killed instead of stranding the user with a
     stale ``sys.modules``. The fixture-mocked ``time.sleep`` in the
     update tests no-ops the grace, so the sweep races the
     SIGTERM/SIGUSR1 we just sent and produces a second ``os.kill``
     call. That breaks
     ``test_update_restarts_profile_manual_gateways`` (graceful drain
     succeeded → assertion: kill not called),
     ``test_update_profile_manual_gateway_falls_back_to_sigterm`` (one
     SIGTERM expected, two seen), and
     ``test_update_kills_manual_pid_but_not_service_pid`` (one SIGTERM
     expected, two seen).
   - Fix: gate the sweep on a real wall-clock grace. Sample
     ``time.monotonic()`` before and after the 3s sleep; if less than
     2.5s elapsed (test fixture, signal handler, etc.), skip the sweep
     entirely. Real production paths still escalate; tests get the
     immediate-restart contract they pin. Also probe each candidate
     PID with ``os.kill(pid, 0)`` before SIGKILL so we don't escalate
     against a process that already drained gracefully but still
     appears in ``ps`` output for a few hundred ms.
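The direct-SIGKILL cleanup path from item 1 can be sketched roughly as follows. This is a minimal illustration, not the code in ``local.py``: the function name ``kill_process_group`` and the ``cached_pgid`` parameter are hypothetical, and it assumes a POSIX host where the child was started in its own process group.

```python
import os
import signal
import subprocess
from typing import Optional


def kill_process_group(proc: subprocess.Popen,
                       cached_pgid: Optional[int] = None) -> None:
    """Cleanup-path kill: SIGKILL the whole group, then reap the wrapper.

    `cached_pgid` is a hypothetical parameter standing in for a pgid that
    was recorded while the wrapper was still alive.
    """
    pgid = cached_pgid
    if pgid is None:
        try:
            pgid = os.getpgid(proc.pid)
        except ProcessLookupError:
            return  # wrapper already gone and no cached pgid to fall back on

    try:
        # No SIGTERM first: the caller has already given up on graceful exit.
        os.killpg(pgid, signal.SIGKILL)
    except (ProcessLookupError, PermissionError):
        pass  # group is already empty, or is not ours to signal

    try:
        # Reap the wrapper so it does not linger as a zombie.
        proc.wait(timeout=0.5)
    except subprocess.TimeoutExpired:
        pass  # bounded wait: never block the cleanup path indefinitely
```

A caller would typically start the child with ``start_new_session=True`` so that the wrapper and its descendants share a process group that can be signalled as a unit.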
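The wall-clock gate described in item 3 might look something like the sketch below. It is an illustration under stated assumptions, not the code in ``cmd_update``: the names ``sweep_survivors`` and ``pid_alive`` and the keyword parameters are hypothetical, while the 3s grace and 2.5s threshold are taken from the description above.

```python
import os
import signal
import time
from typing import Iterable, List


def pid_alive(pid: int) -> bool:
    """Signal-0 probe: True if the PID still exists."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # exists, but is owned by another user
    return True


def sweep_survivors(candidate_pids: Iterable[int],
                    grace: float = 3.0,
                    min_elapsed: float = 2.5) -> List[int]:
    """Escalate to SIGKILL only if a real grace period actually elapsed."""
    started = time.monotonic()
    time.sleep(grace)
    if time.monotonic() - started < min_elapsed:
        # The sleep was mocked or cut short, so no real grace was given:
        # skip the sweep rather than race the signal that was just sent.
        return []

    killed: List[int] = []
    for pid in candidate_pids:
        if not pid_alive(pid):
            continue  # drained gracefully; nothing to escalate against
        try:
            os.kill(pid, signal.SIGKILL)
            killed.append(pid)
        except ProcessLookupError:
            pass  # exited between the probe and the kill
    return killed
```

Measuring elapsed time with ``time.monotonic()`` instead of trusting the sleep call is what lets a test fixture that patches ``time.sleep`` opt out of the escalation without any test-only flag in the production code.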
The Apr 2026 fix on branch ``fix/kill-process-direct-sigkill`` (commit
||
|---|---|---|
| browser_providers | ||
| environments | ||
| neutts_samples | ||
| __init__.py | ||
| ansi_strip.py | ||
| approval.py | ||
| binary_extensions.py | ||
| browser_camofox_state.py | ||
| browser_camofox.py | ||
| browser_cdp_tool.py | ||
| browser_dialog_tool.py | ||
| browser_supervisor.py | ||
| browser_tool.py | ||
| budget_config.py | ||
| checkpoint_manager.py | ||
| clarify_tool.py | ||
| code_execution_tool.py | ||
| credential_files.py | ||
| cronjob_tools.py | ||
| debug_helpers.py | ||
| delegate_tool.py | ||
| discord_tool.py | ||
| env_passthrough.py | ||
| feishu_doc_tool.py | ||
| feishu_drive_tool.py | ||
| file_operations.py | ||
| file_state.py | ||
| file_tools.py | ||
| fuzzy_match.py | ||
| homeassistant_tool.py | ||
| image_generation_tool.py | ||
| interrupt.py | ||
| kanban_tools.py | ||
| managed_tool_gateway.py | ||
| mcp_oauth_manager.py | ||
| mcp_oauth.py | ||
| mcp_tool.py | ||
| memory_tool.py | ||
| mixture_of_agents_tool.py | ||
| neutts_synth.py | ||
| openrouter_client.py | ||
| osv_check.py | ||
| patch_parser.py | ||
| path_security.py | ||
| process_registry.py | ||
| registry.py | ||
| rl_training_tool.py | ||
| schema_sanitizer.py | ||
| send_message_tool.py | ||
| session_search_tool.py | ||
| skill_manager_tool.py | ||
| skill_usage.py | ||
| skills_guard.py | ||
| skills_hub.py | ||
| skills_sync.py | ||
| skills_tool.py | ||
| slash_confirm.py | ||
| terminal_tool.py | ||
| tirith_security.py | ||
| todo_tool.py | ||
| tool_backend_helpers.py | ||
| tool_output_limits.py | ||
| tool_result_storage.py | ||
| transcription_tools.py | ||
| tts_tool.py | ||
| url_safety.py | ||
| vision_tools.py | ||
| voice_mode.py | ||
| web_tools.py | ||
| website_policy.py | ||
| xai_http.py | ||
| yuanbao_tools.py | ||