hermes-agent/hermes_cli
Dev Lead b14758f09a
Some checks failed
Tests / e2e (pull_request) Successful in 57s
Nix / nix (macos-latest) (pull_request) Waiting to run
Contributor Attribution Check / check-attribution (pull_request) Failing after 9s
Supply Chain Audit / Scan PR for critical supply chain risks (pull_request) Successful in 10s
Tests / test (pull_request) Failing after 7m7s
Nix / nix (ubuntu-latest) (pull_request) Failing after 13m19s
fix(tools/environments): SIGKILL-only on KeyboardInterrupt; gate cmd_update survivor sweep on real grace (partial close hermes-agent#9)
Restores the Apr 2026 orphan-bug fix for the local terminal backend
(``sleep 300`` survives ``hermes chat -q`` SIGTERM, originally reported
by Physikal) and aligns the ``hermes update`` survivor sweep with the
contract its tests have always pinned.

Three things move:

1. ``tools/environments/local.py:_kill_process``
   - Was: SIGTERM → wait up to 1s polling ``os.killpg(pgid, 0)`` → SIGKILL
     → wait up to 2s on the same pollee.
   - Now: SIGKILL directly + ``proc.wait(timeout=0.5)`` to reap the wrapper.
   - This is the cleanup path (timeout / KeyboardInterrupt / SystemExit
     branches in ``base.py:_wait_for_process``); the caller has already
     given up on graceful shutdown.  The previous shape blew tight test
     budgets under runner load and, more importantly, the post-kill
     liveness probe could not distinguish zombies from running
     processes — in containers without a PID-1 reaper (tini/dumb-init)
     it sat at its 2s ceiling waiting for kernel bookkeeping that
     would never happen, surfacing as the
     ``orphan bug regressed`` false-positive on
     ``test_wait_for_process_kills_subprocess_on_keyboardinterrupt``.

2. ``tests/tools/test_local_interrupt_cleanup.py``
   - ``_pgid_still_alive``: switch from ``os.killpg(pgid, 0)`` to ``ps -g
     STAT`` so zombies are not reported as alive.
   - ``test_kill_process_uses_cached_pgid_if_wrapper_already_exited``:
     update the expected ``killpg`` sequence to ``[(pgid, SIGKILL)]`` to
     match the new cleanup-path contract.

3. ``hermes_cli/main.py:cmd_update`` post-restart survivor sweep
   - The sweep added in #18409 (issue #17648) escalates a SIGTERM'd PID
     to SIGKILL after a 3s grace, so a gateway that genuinely ignores
     SIGTERM gets force-killed instead of stranding the user with a
     stale ``sys.modules``.  The fixture-mocked ``time.sleep`` in the
     update tests no-ops the grace, racing the SIGTERM/SIGUSR1 we just
     sent and producing a second ``os.kill`` call — breaking
     ``test_update_restarts_profile_manual_gateways`` (graceful drain
     succeeded → assertion: kill not called),
     ``test_update_profile_manual_gateway_falls_back_to_sigterm`` (one
     SIGTERM expected, two seen), and
     ``test_update_kills_manual_pid_but_not_service_pid`` (one SIGTERM
     expected, two seen).
   - Fix: gate the sweep on a real wall-clock grace.  Sample
     ``time.monotonic()`` before and after the 3s sleep; if less than
     2.5s elapsed (test fixture, signal handler, etc.), skip the sweep
     entirely.  Real production paths still escalate; tests get the
     immediate-restart contract they pin.  Also probe each candidate
     PID with ``os.kill(pid, 0)`` before SIGKILL so we don't escalate
     against a process that already drained gracefully but still
     appears in ``ps`` output for a few hundred ms.

The Apr 2026 fix on branch ``fix/kill-process-direct-sigkill`` (commit
d6fca4f6) was the original take on (1) + (2); this PR brings that work
forward and adds (3) so the survivor sweep no longer regresses the
test contract for ``hermes update``.

Verification:
- ``pytest -x tests/tools/test_local_interrupt_cleanup.py
   tests/hermes_cli/test_update_gateway_restart.py -v`` — 49/49 pass.
- ``pytest -q tests/tools/test_local_background_child_hang.py
   tests/tools/test_base_environment.py
   tests/tools/test_windows_compat.py`` — all pass.
- Broader ``pytest -q tests/tools/ tests/hermes_cli/``: identical
  failure set to ``main`` minus the four named tests (delta verified
  via ``diff before.txt after.txt``).  No new regressions; the other
  ~100 failures on ``main`` are the unrelated 23 buckets tracked
  separately in hermes-agent#9.

Closes the four signal-handling buckets in #9; remaining 23 untouched.
2026-05-08 12:08:23 -07:00
..
__init__.py chore: release v0.12.0 (2026.4.30) (#18057) 2026-04-30 11:31:01 -07:00
_parser.py refactor(cli): derive relaunch flag table from argparse introspection 2026-04-29 20:33:29 -07:00
auth_commands.py feat(cli): add minimax-oauth provider with PKCE browser flow 2026-04-29 09:53:42 -07:00
auth.py fix(auth): make provider config writes atomic 2026-04-30 20:39:41 -07:00
azure_detect.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
backup.py feat(claw-migrate): harden OpenClaw import with plan-first apply, redaction, and pre-migration backup (#16911) 2026-04-28 01:50:23 -07:00
banner.py fix(banner): show correct update status on nix-built hermes (#17550) 2026-04-30 07:03:00 +05:30
browser_connect.py fix(browser): address Copilot review on /browser connect 2026-04-28 22:11:10 -07:00
callbacks.py fix: ESC cancels secret/sudo prompts, clearer skip messaging (#9902) 2026-04-14 16:11:37 -07:00
claw.py feat(claw-migrate): harden OpenClaw import with plan-first apply, redaction, and pre-migration backup (#16911) 2026-04-28 01:50:23 -07:00
cli_output.py refactor: remove dead code — 1,784 lines across 77 files (#9180) 2026-04-13 16:32:04 -07:00
clipboard.py feat: fix img pasting in new ink plus newline after tools 2026-04-11 13:14:32 -05:00
codex_models.py feat(codex): add gpt-5.5 and wire live model discovery into picker (#14720) 2026-04-23 13:32:43 -07:00
colors.py feat: respect NO_COLOR env var and TERM=dumb (#4079) 2026-03-30 17:07:21 -07:00
commands.py fix(discord): warn on 32-char clamp collisions in the /skill collector (#18759) 2026-05-02 02:05:01 -07:00
completion.py fix: preserve profile name completion in dynamic shell completion 2026-04-14 10:45:42 -07:00
config.py docs(tts): mention xAI custom voice support (#18776) 2026-05-02 16:08:01 +05:30
copilot_auth.py fix(copilot): exchange raw GitHub token for Copilot API JWT 2026-04-24 05:09:08 -07:00
cron.py feat(cron): per-job workdir for project-aware cron runs (#15110) 2026-04-24 05:07:01 -07:00
curator.py fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731) 2026-05-02 01:29:57 -07:00
curses_ui.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
debug.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
default_soul.py fix: reset default SOUL.md to baseline identity text (#3159) 2026-03-26 01:34:27 -07:00
dingtalk_auth.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
doctor.py fix(cli): decode .env as UTF-8 to avoid GBK crash on Windows 2026-05-02 01:40:31 -07:00
dump.py refactor(redact): canonical mask_secret helper; fix status.py DIM drift (#17207) 2026-04-28 21:04:35 -07:00
env_loader.py refactor: consolidate symlink-safe atomic replace into shared helper 2026-04-28 04:58:22 -07:00
fallback_cmd.py feat(cli): add 'hermes fallback' command to manage fallback providers (#16052) 2026-04-26 06:19:04 -07:00
gateway.py fix: gateway systemd unit now retries indefinitely with backoff (#18639) 2026-05-02 08:51:30 +05:30
goals.py feat: /goal — persistent cross-turn goals (Ralph loop) (#18262) 2026-04-30 23:10:20 -07:00
hooks.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
kanban_db.py feat(kanban): durable multi-profile collaboration board (#17805) 2026-04-30 13:36:47 -07:00
kanban.py feat(kanban): durable multi-profile collaboration board (#17805) 2026-04-30 13:36:47 -07:00
logs.py feat: component-separated logging with session context and filtering (#7991) 2026-04-11 17:23:36 -07:00
main.py fix(tools/environments): SIGKILL-only on KeyboardInterrupt; gate cmd_update survivor sweep on real grace (partial close hermes-agent#9) 2026-05-08 12:08:23 -07:00
mcp_config.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00
memory_setup.py fix(cli): decode .env as UTF-8 to avoid GBK crash on Windows 2026-05-02 01:40:31 -07:00
model_catalog.py chore: remove unused imports and dead locals (ruff F401, F841) (#17010) 2026-04-28 06:46:45 -07:00
model_normalize.py feat(minimax-oauth): full integration with peer OAuth providers 2026-04-29 09:53:42 -07:00
model_switch.py fix(model_switch): correct user_providers override for private models 2026-04-30 19:44:26 -07:00
models.py chore(models): move Vercel AI Gateway to bottom of provider picker (#18112) 2026-04-30 19:34:19 -07:00
nous_subscription.py fix(cli): coerce use_gateway config flags in tool routing 2026-04-26 19:02:55 -07:00
oneshot.py fix(tui): honor launch toolsets (#17623) 2026-04-29 16:55:27 -07:00
pairing.py fix(pairing): handle null user_name in pairing list display 2026-04-23 02:34:11 -07:00
platforms.py feat: complete plugin platform parity — all 12 integration points 2026-04-29 21:56:51 -07:00
plugins_cmd.py feat(dashboard): add Plugins page with enable/disable, auth status, install/remove 2026-04-30 20:29:37 -04:00
plugins.py fix(plugins): bound async plugin command await with 30s timeout 2026-04-30 19:56:18 -07:00
profiles.py Merge upstream/main and address Copilot review feedback 2026-04-30 06:43:22 -04:00
providers.py fix: prevent bare 'custom' slug in model.provider (#17478) 2026-04-30 04:32:11 -07:00
pty_bridge.py fix: mobile chat in new layout 2026-04-24 12:07:46 -04:00
relaunch.py remove relaunch_chat 2026-04-29 20:33:29 -07:00
runtime_provider.py fix(fallback): let custom_providers shadow built-in aliases 2026-04-30 20:18:44 -07:00
setup.py docs(tts): mention xAI custom voice support (#18776) 2026-05-02 16:08:01 +05:30
skills_config.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00
skills_hub.py feat(skills): install skills from a direct HTTP(S) URL (#16323) 2026-04-26 20:57:10 -07:00
skin_engine.py fix(tui): restore macOS copy behavior and theme polish (#17131) 2026-04-28 18:47:14 -05:00
slack_cli.py fix(paths): route achievements plugin + profile-tui through HERMES_HOME 2026-04-30 23:21:54 -07:00
status.py fix(status): add NVIDIA_API_KEY to hermes status API keys display 2026-04-30 19:46:06 -07:00
timeouts.py refactor(timeouts): drop redundant ImportError in except clause 2026-04-26 20:48:20 -07:00
tips.py feat(tips): add cost-saving tips from April 30 tip-of-the-day (#17841) 2026-04-30 02:30:36 -07:00
tools_config.py feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
uninstall.py feat(uninstall): offer to remove named profiles when uninstalling from default 2026-04-18 19:18:13 -07:00
vercel_auth.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
voice.py fix(tui): ignore SIGPIPE so stderr back-pressure can't kill the gateway 2026-04-23 16:18:15 -07:00
web_server.py fix: allow WebSocket connections from non-loopback IPs in --insecure mode (#18633) 2026-05-02 08:17:45 +05:30
webhook.py refactor(config): migrate remaining 33 cfg_get call sites (#17311) 2026-04-29 04:03:03 -07:00