Commit Graph

140 Commits

Author SHA1 Message Date
Teknium
0ea7d0ec80
fix(terminal): log disk warning check failures at debug level (salvage #2372) (#2394)
* fix(terminal): log disk warning check failures at debug level

* fix(terminal): guard _check_disk_usage_warning by moving scratch_dir into try

---------

Co-authored-by: aydnOktay <xaydinoktay@gmail.com>
2026-03-21 17:10:17 -07:00
Teknium
d87655afff
fix(gateway): persist watcher metadata in checkpoint for crash recovery (#1706)
Salvaged from PR #1573 by @eren-karakus0. Cherry-picked with authorship preserved.

Fixes #1143 — background process notifications resume after gateway restart.

Co-authored-by: Muhammet Eren Karakuş <erenkar950@gmail.com>
2026-03-17 03:52:15 -07:00
Teknium
556e0f4b43 fix(docker): add explicit env allowlist for container credentials (#1436)
Docker terminal sessions are secret-dark by default. This adds
terminal.docker_forward_env as an explicit allowlist for env vars
that may be forwarded into Docker containers.

Values resolve from the current shell first, then fall back to
~/.hermes/.env. Only variables the user explicitly lists are
forwarded — nothing is auto-exposed.

Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto
current main.

Fixes #1436
Supersedes #1439
2026-03-17 02:34:35 -07:00
teknium1
780ddd102b fix(docker): gate cwd workspace mount behind config
Keep Docker sandboxes isolated by default. Add an explicit terminal.docker_mount_cwd_to_workspace opt-in, thread it through terminal/file environment creation, and document the security tradeoff and config.yaml workflow clearly.
2026-03-16 05:20:56 -07:00
Bartok9
8cdbbcaaa2 fix(docker): auto-mount host CWD to /workspace
Fixes #1445 — When using Docker backend, the user's current working
directory is now automatically bind-mounted to /workspace inside the
container. This allows users to run `cd my-project && hermes` and have
their project files accessible to the agent without manual volume config.

Changes:
- Add host_cwd and auto_mount_cwd parameters to DockerEnvironment
- Capture original host CWD in _get_env_config() before container fallback
- Pass host_cwd through _create_environment() to Docker backend
- Add TERMINAL_DOCKER_NO_AUTO_MOUNT env var to disable if needed
- Skip auto-mount when /workspace is already explicitly mounted
- Add tests for auto-mount behavior
- Add documentation for the new feature

The auto-mount is skipped when:
1. TERMINAL_DOCKER_NO_AUTO_MOUNT=true is set
2. User configured docker_volumes with :/workspace
3. persistent_filesystem=true (persistent sandbox mode)

This makes the Docker backend behave more intuitively — the agent
operates on the user's actual project directory by default.
2026-03-16 05:20:21 -07:00
Teknium
4298c6fd9a
fix: route background process watcher notifications to Telegram forum topics (#1481)
Salvaged from PR #1146 by spanishflu-est1918.

Background process progress/completion messages were sent with only
chat_id, landing in the general topic instead of the originating forum
topic. Thread the thread_id from HERMES_SESSION_THREAD_ID through the
watcher payload and pass it as metadata to adapter.send() so Telegram
routes notifications to the correct topic.

The env var export (HERMES_SESSION_THREAD_ID in _set_session_env /
_clear_session_env) already existed on main — this commit adds the
missing watcher plumbing.

Co-authored-by: spanishflu-est1918 <spanishflu-est1918@users.noreply.github.com>
2026-03-15 23:01:57 -07:00
teknium1
01e62c067b merge: resolve conflicts with origin/main (SSH preflight check) 2026-03-15 21:13:40 -07:00
teknium1
210d5ade1e feat(tools): centralize tool emoji metadata in registry + skin integration
- Add 'emoji' field to ToolEntry and 'get_emoji()' to ToolRegistry
- Add emoji= to all 50+ registry.register() calls across tool files
- Add get_tool_emoji() helper in agent/display.py with 3-tier resolution:
  skin override → registry default → hardcoded fallback
- Replace hardcoded emoji maps in run_agent.py, delegate_tool.py, and
  gateway/run.py with centralized get_tool_emoji() calls
- Add 'tool_emojis' field to SkinConfig so skins can override per-tool
  emojis (e.g. ares skin could use swords instead of wrenches)
- Add 11 tests (5 registry emoji, 6 display/skin integration)
- Update AGENTS.md skin docs table

Based on the approach from PR #1061 by ForgingAlex (emoji centralization
in registry). This salvage fixes several issues from the original:
- Does NOT split the cronjob tool (which would crash on missing schemas)
- Does NOT change image_generate toolset/requires_env/is_async
- Does NOT delete existing tests
- Completes the centralization (gateway/run.py was missed)
- Hooks into the skin system for full customizability
2026-03-15 20:21:21 -07:00
teknium1
33ebedc76d feat: enable persistent shell by default for SSH, add config option
SSH persistent shell now defaults to true — non-local backends benefit
most from state persistence across execute() calls. Local backend
remains opt-in via TERMINAL_LOCAL_PERSISTENT env var.

New config.yaml option: terminal.persistent_shell (default: true)
Controls the default for non-local backends. Users can disable with:
  hermes config set terminal.persistent_shell false

Precedence: per-backend env var > TERMINAL_PERSISTENT_SHELL > default.

Wired through cli.py, gateway/run.py, and hermes_cli/config.py so the
config.yaml value reaches terminal_tool via env var bridge.
2026-03-15 20:17:13 -07:00
alt-glitch
7be314c456 pass configs to file_tools for r+w over ssh.
pass TERM env.
default to ~ to in local and ssh backends.
ssh backend.
2026-03-15 02:26:39 +05:30
balyan.sid@gmail.com
9001b34146 simplify docstrings, fix some bugs 2026-03-15 01:20:42 +05:30
balyan.sid@gmail.com
861202b56c wip: add persistent shell to ssh and local terminal backends 2026-03-15 01:20:42 +05:30
balyan.sid@gmail.com
9d63dcc3f9 add persistent ssh backend 2026-03-15 01:19:38 +05:30
Oktay Aydin
00a0f18544 fix: clearer terminal backend requirement errors
Salvaged from PR #979 onto current main.

Preserve the current terminal backend checks while surfacing actionable
preflight errors for unknown TERMINAL_ENV values, missing SSH host/user
configuration, and missing Modal credentials/config. Tighten the modal
regression test so it deterministically exercises the config-missing
path.
2026-03-14 06:04:39 -07:00
sheeki003
375ce8a881 feat(security): add tirith pre-exec command scanning
Integrate tirith as a pre-execution security scanner that detects
homograph URLs, pipe-to-interpreter patterns, terminal injection,
zero-width Unicode, and environment variable manipulation — threats
the existing 50-pattern dangerous command detector doesn't cover.

Architecture: gather-then-decide — both tirith and the dangerous
command detector run before any approval prompt, preventing gateway
force=True replay from bypassing one check when only the other was
shown to the user.

New files:
- tools/tirith_security.py: subprocess wrapper with auto-installer,
  mandatory cosign provenance verification, non-blocking background
  download, disk-persistent failure markers with retryable-cause
  tracking (cosign_missing auto-clears when cosign appears on PATH)
- tests/tools/test_tirith_security.py: 62 tests covering exit code
  mapping, fail_open, cosign verification, background install,
  HERMES_HOME isolation, and failure recovery
- tests/tools/test_command_guards.py: 21 integration tests for the
  combined guard orchestration

Modified files:
- tools/approval.py: add check_all_command_guards() orchestrator,
  add allow_permanent parameter to prompt_dangerous_approval()
- tools/terminal_tool.py: replace _check_dangerous_command with
  consolidated check_all_command_guards
- cli.py: update _approval_callback for allow_permanent kwarg,
  call ensure_installed() at startup
- gateway/run.py: iterate pattern_keys list on replay approval,
  call ensure_installed() at startup
- hermes_cli/config.py: add security config defaults, split
  commented sections for independent fallback
- cli-config.yaml.example: document tirith security config
2026-03-14 00:11:27 -07:00
Teknium
a20d373945
fix: worktree-aware minisweagent path discovery + clean up requirements check (#1248)
Salvage of PR #1246 by ChatGPT (teknium1 session), resolved against
current main which already includes #1239.

Changes:
- Add minisweagent_path.py: worktree-aware helper that finds
  mini-swe-agent/src from either the current checkout or the main
  checkout behind a git worktree
- Use the helper in tools/terminal_tool.py and mini_swe_runner.py
  instead of naive path-relative lookup that fails in worktrees
- Clean up check_terminal_requirements():
  - local: return True (no minisweagent dep, per #1239)
  - singularity/ssh: remove unnecessary minisweagent imports
  - docker/modal: use importlib.util.find_spec with clear error
- Add regression tests for worktree path discovery and tool resolution
2026-03-13 23:39:51 -07:00
teknium1
329f83ff2d fix: stop local terminal warning without minisweagent 2026-03-13 22:00:36 -07:00
dmahan93
d7f4db53f5 fix: Modal sandbox eval infra (9 fixes for TBLite baseline)
Fixes discovered while running TBLite baseline evaluation:

1. ephemeral_disk param not supported in modal 1.3.5 - check before passing
2. Modal legacy image builder requires working pip - add ensurepip fix via
   setup_dockerfile_commands to handle task images with broken pip
3. Host cwd leaked into Modal sandbox - add /home/ to host prefix check
4. Tilde ~ not expanded by subprocess.run(cwd=) in sandboxes - use /root
5. install_pipx must stay True for swerex-remote to be available

Dependencies also needed (not in this commit):
- git submodule update --init mini-swe-agent
- uv pip install swe-rex boto3
2026-03-11 06:51:42 -07:00
alireza78a
4523cc09cf fix(terminal): validate env var types with clear error messages 2026-03-11 02:59:12 -07:00
teknium1
24479625a2 fix: Docker backend fails when docker is not in PATH (macOS gateway)
On macOS, Docker Desktop installs the CLI to /usr/local/bin/docker, but
when Hermes runs as a gateway service (launchd) or in other non-login
contexts, /usr/local/bin is often not in PATH. This causes the Docker
requirements check to fail with 'No such file or directory: docker' even
though docker works fine from the user's terminal.

Add find_docker() helper that uses shutil.which() first, then probes
common Docker Desktop install paths on macOS (/usr/local/bin,
/opt/homebrew/bin, Docker.app bundle). The resolved path is cached and
passed to mini-swe-agent via its 'executable' parameter.

- tools/environments/docker.py: add find_docker(), use it in
  _storage_opt_supported() and pass to _Docker(executable=...)
- tools/terminal_tool.py: use find_docker() in requirements check
- tests/tools/test_docker_find.py: 4 tests (PATH, fallback, not found, cache)

2877 tests pass.
2026-03-10 20:45:13 -07:00
teknium1
0fdeffe6c4 fix: replace silent exception swallowing with debug logging across tools
Add logger.debug() calls to 27 bare 'except: pass' blocks across 7 core
files, giving visibility into errors that were previously silently
swallowed. This makes it much easier to diagnose user-reported issues
from debug logs.

Files changed:
- tools/terminal_tool.py: 5 catches (stat, termios, fd close, cleanup)
- tools/delegate_tool.py: 7 catches + added logger (spinner, callbacks)
- tools/browser_tool.py: 5 catches (screenshot/recording cleanup, daemon kill)
- tools/code_execution_tool.py: 2 remaining catches (socket, server close)
- gateway/session.py: 2 catches (platform enum parse, temp file cleanup)
- agent/display.py: 2 catches + added logger (JSON parse in failure detect)
- agent/prompt_builder.py: 1 catch (skill description read)

Deliberately kept bare pass for:
- ImportError checks for optional dependencies (terminal_tool.py)
- SystemExit/KeyboardInterrupt handlers
- Spinner _write catch (would spam on every frame when stdout closed)
- process_registry PID-alive check (canonical os.kill(pid,0) pattern)

Extends the pattern from PR #686 (@aydnOktay).
2026-03-10 06:59:20 -07:00
johnh4098
e9742e202f fix(security): pipe sudo password via stdin instead of shell cmdline 2026-03-10 06:34:59 -07:00
teknium1
ff09cad879 Merge PR #621: fix: limit concurrent Modal sandbox creations to avoid deadlocks
Authored by voteblake.

- Semaphore limits concurrent Modal sandbox creations to 8 (configurable)
  to prevent thread pool deadlocks when 86+ tasks fire simultaneously
- Modal cleanup guard for failed init (prevents AttributeError)
- CWD override to /app for TB2 containers
- Add /home/ to host path validation for container backends
2026-03-10 05:57:54 -07:00
teknium1
36328a996f Merge PR #458: Add explicit UTF-8 encoding to config/data file I/O
Authored by shitcoinsherpa. Adds encoding='utf-8' to all text-mode
open() calls in gateway/run.py, gateway/config.py, hermes_cli/config.py,
hermes_cli/main.py, and hermes_cli/status.py. Prevents encoding errors
on Windows where the default locale is not UTF-8.

Also fixed 4 additional open() calls in gateway/run.py that were added
after the PR branch was created.
2026-03-09 21:19:20 -07:00
shitcoinsherpa
4bc32dc0f1 Fix password reader for Windows using msvcrt.getwch()
The existing password prompt uses /dev/tty and termios to read input
with echo disabled. Neither exists on Windows.

On Windows, msvcrt.getwch() reads a single character from the console
without echoing it. This adds a Windows code path that uses getwch()
in a loop, collecting characters until Enter is pressed.

The Unix path using termios and /dev/tty is unchanged.
2026-03-09 21:15:59 -07:00
Blake Johnson
c6df39955c fix: limit concurrent Modal sandbox creations to avoid deadlocks
- Add max_concurrent_tasks config (default 8) with semaphore in TB2 eval
- Pass cwd: /app via register_task_env_overrides for TB2 tasks
- Add /home/ to host path prefixes as safety net for container backends

When all 86 TerminalBench2 tasks fire simultaneously, each creates a Modal sandbox
via asyncio.run() inside a thread pool worker. Modal's blocking calls deadlock
when too many are created at once. The semaphore ensures max 8 concurrent creations.

Co-Authored-By: hermes-agent[bot] <hermes-agent[bot]@users.noreply.github.com>
2026-03-07 14:02:34 -08:00
rovle
c43451a50b feat(terminal): integrate Daytona backend into tool pipeline
Add Daytona to image selection, container_config guards, environment
factory, requirements check, and diagnostics in terminal_tool.py and
file_tools.py. Also add to sandboxed-backend approval bypass.

Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 10:02:21 -08:00
teknium1
8c48bb080f refactor: remove unnecessary single-element loop in disk usage calc
The 'for pattern in [f"hermes-*{task_id[:8]}*"]' was a loop over a
single-element list — just use a plain variable instead.
2026-03-02 04:40:13 -08:00
teknium1
6d2481ee5c Merge PR #231: fix: use task-specific glob pattern in disk usage calculation
Authored by Farukest. Fixes #230.
2026-03-02 04:38:58 -08:00
teknium1
c84d5ce738 refactor(terminal_tool): clarify foreground and background process usage
Updated documentation within terminal_tool.py to emphasize the appropriate use of foreground and background processes. Enhanced descriptions for the timeout setting and background execution to guide users towards optimal configurations for scripts, builds, and long-running tasks. Adjusted the default timeout value from 60 to 180 seconds for improved handling of longer operations.
2026-03-01 16:15:05 -08:00
teknium1
70dfec9638 test(redact): add sensitive text redaction
- Introduce a new test suite for the `redact_sensitive_text` function, covering various sensitive data formats including API keys, tokens, and environment variables.
- Ensure that sensitive information is properly masked in logs and outputs while non-sensitive data remains unchanged.
- Add tests for different scenarios including JSON fields, authorization headers, and environment variable assignments.
- Implement a redacting formatter for logging to enhance security during log output.
2026-02-28 21:56:27 -08:00
Farukest
f7300a858e
fix(tools): use task-specific glob pattern in disk usage calculation 2026-03-01 03:17:50 +03:00
Teknium
0d113fab1a
Merge pull request #158 from Indelwin/feature/docker-volumes
feat: add docker_volumes config for custom volume mounts
2026-02-27 23:06:06 -08:00
Gesina Sands
f7677ed275 feat: add docker_volumes config for custom volume mounts 2026-02-28 07:12:48 +10:00
teknium1
5007a122b2 fix(terminal): enhance error logging in cleanup functions with exception info 2026-02-27 03:53:58 -08:00
Teknium
547ba73b82
Merge pull request #65 from leonsgithub/fix/sudo-password-shell-injection
fix(security): prevent shell injection in sudo password piping
2026-02-27 01:50:07 -08:00
Teknium
2972f982e4
Merge pull request #55 from bierlingm/fix/atexit-signal-handler-race
Fix SystemExit traceback during atexit cleanup on Ctrl+C
2026-02-27 00:42:23 -08:00
Leon
25e260bb3a fix(security): prevent shell injection in sudo password piping
The sudo password was embedded in shell commands via single-quote
interpolation: echo '{password}' | sudo -S

If the password contained shell metacharacters (single quotes,
$(), backticks), they would be interpreted by the shell, enabling
arbitrary command execution.

Fix: use shlex.quote() which properly escapes all shell-special
characters, ensuring the password is always treated as a literal
string argument to echo.
2026-02-26 19:04:32 +07:00
Dean Kerr
fed9f06c4e fix: add SSH backend to terminal requirements check
The SSH backend was missing from check_terminal_requirements(), causing
it to fall through to `return False`. This silently disabled both the
terminal and file tools when TERMINAL_ENV=ssh was configured.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-26 20:41:59 +11:00
Moritz Bierling
254aafb265 Fix SystemExit traceback during atexit cleanup on Ctrl+C
The browser_tool signal handler calls sys.exit(130) which raises
SystemExit. When this fires during terminal_tool's atexit cleanup
(specifically during _cleanup_thread.join()), it produces an unhandled
traceback. Wrapping the join in a try/except suppresses the race
without changing shutdown behavior.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-02-26 10:13:31 +01:00
teknium1
a183827128 feat: enhance README and improve environment configuration
- Added a new section in the README for Inference Providers, detailing setup instructions for Nous Portal, OpenRouter, and Custom Endpoints, improving user guidance for LLM connections.
- Updated messaging platform setup instructions to include Slack and WhatsApp, providing clearer steps for configuration.
- Introduced a new environment variable, TERMINAL_SANDBOX_DIR, to allow users to customize the sandbox storage location for Docker and Singularity environments.
- Refactored the Docker and Singularity environment classes to utilize the new sandbox directory for persistent workspaces, enhancing organization and usability.
- Improved handling of working directories across various environments, ensuring compatibility and clarity in execution paths.
2026-02-23 21:15:35 -08:00
teknium1
90af34bc83 feat: enhance interrupt handling and container resource configuration
- Introduced a shared interrupt signaling mechanism to allow tools to check for user interrupts during long-running operations.
- Updated the AIAgent to handle interrupts more effectively, ensuring in-progress tool calls are canceled and multiple interrupt messages are combined into one prompt.
- Enhanced the CLI configuration to include container resource limits (CPU, memory, disk) and persistence options for Docker, Singularity, and Modal environments.
- Improved documentation to clarify interrupt behaviors and container resource settings, providing users with better guidance on configuration and usage.
2026-02-23 02:11:33 -08:00
teknium1
e1604b2b4a feat: enhance user authorization checks in GatewayRunner
- Updated the authorization logic to include a per-platform allow-all flag for improved flexibility.
- Revised the order of checks to prioritize platform-specific allow-all settings, followed by environment variable allowlists and DM pairing approvals.
- Added global allow-all configuration for broader access control.
- Improved handling of allowlists by stripping whitespace and ensuring valid entries are processed.
2026-02-22 16:32:08 -08:00
teknium1
ededaaa874 Hermes Agent UX Improvements 2026-02-22 02:16:11 -08:00
teknium1
9123cfb5dd Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
teknium1
08ff1c1aa8 More major refactor/tech debt removal! 2026-02-21 20:22:33 -08:00
teknium1
6134939882 refactor: deduplicate toolsets, unify async bridging, fix approval race condition, harden security
- Replace 4 copy-pasted messaging platform toolsets with shared _HERMES_CORE_TOOLS list
- Consolidate 5 ad-hoc async-bridging patterns into single _run_async() in model_tools.py
  - Removes deprecated get_event_loop()/set_event_loop() calls
  - Makes all tool handlers self-protecting regardless of caller's event loop state
  - RL handler refactored from if/elif chain to dispatch dict
- Fix exec approval race condition: replace module-level globals with thread-safe
  per-session tools/approval.py (submit_pending, pop_pending, approve_session, is_approved)
  - Session A approving "rm" no longer approves it for all other sessions
- Fix config deep merge: user overriding tts.elevenlabs.voice_id no longer clobbers
  tts.elevenlabs.model_id; migration detection now recurses to arbitrary depth
- Gateway default-deny: unauthenticated users denied unless GATEWAY_ALLOW_ALL_USERS=true
- Add 10 dangerous command patterns: rm --recursive, bash -c, python -e, curl|bash,
  xargs rm, find -delete
- Sanitize gateway error messages: users see generic message, full traceback goes to logs
2026-02-21 18:28:49 -08:00
teknium1
0729ef7353 fix: refine environment creation condition in terminal_tool
- Updated the environment creation condition to specifically check for "singularity" instead of allowing "local", ensuring more precise handling of environment types during task execution.
2026-02-21 12:43:56 -08:00
teknium1
8f6788474b feat: enhance logging in AIAgent for quiet mode
- Added functionality to suppress logging noise from specific modules when in quiet mode, improving user experience in CLI.
- Updated terminal_tool.py to change the log level for fallback directory usage from warning to debug, providing clearer context without cluttering logs.
2026-02-21 12:41:05 -08:00
teknium1
c98ee98525 feat: implement interactive prompts for sudo password and command approval in CLI
- Added methods for handling sudo password and dangerous command approval prompts using a callback mechanism in cli.py.
- Integrated these prompts with the prompt_toolkit UI for improved user experience.
- Updated terminal_tool.py to support callback registration for interactive prompts, enhancing the CLI's interactivity.
- Introduced a background thread for API calls in run_agent.py to allow for interrupt handling during long-running operations.
- Enhanced error handling for interrupted API calls, ensuring graceful degradation of user experience.
2026-02-21 12:15:40 -08:00
teknium1
a885d2f240 refactor: implement structured logging across multiple modules
- Introduced logging functionality in cli.py, run_agent.py, scheduler.py, and various tool modules to replace print statements with structured logging.
- Enhanced error handling and informational messages to improve debugging and monitoring capabilities.
- Ensured consistent logging practices across the codebase, facilitating better traceability and maintenance.
2026-02-21 03:11:11 -08:00
teknium1
b6247b71b5 refactor: update tool descriptions for clarity and conciseness
- Revised descriptions for various tools in model_tools.py, browser_tool.py, code_execution_tool.py, delegate_tool.py, and terminal_tool.py to enhance clarity and reduce verbosity.
- Improved consistency in terminology and formatting across tool descriptions, ensuring users have a clearer understanding of tool functionalities and usage.
2026-02-21 02:41:30 -08:00
teknium1
70dd3a16dc Cleanup time! 2026-02-20 23:23:32 -08:00
teknium1
630bd3d789 feat: improve password prompt handling in terminal tool
- Replaced getpass with direct reading from /dev/tty to enhance password input handling without echoing.
- Updated threading logic for password input to ensure proper cleanup and error handling.
- Improved visual feedback during password prompt, including clearer separation and timeout messaging.
- Enhanced user experience by providing immediate feedback on password input status.
2026-02-20 21:26:31 -08:00
teknium1
d49af633f0 feat: enhance command execution with stdin support
- Modified the `_exec` method in `ShellFileOperations` to accept `stdin_data`, allowing large content to be piped directly to commands, bypassing ARG_MAX limitations.
- Updated the `execute` method in various environment classes (`_LocalEnvironment`, `_SingularityEnvironment`, `_SSHEnvironment`, `_DockerEnvironment`) to support `stdin_data`, improving command execution flexibility.
- Removed the unique marker generation for heredoc in favor of direct stdin piping, simplifying file writing operations and enhancing performance for large files.
2026-02-19 14:50:51 -08:00
teknium1
4f57d7116d Improved stdout handling in the terminal tool to prevent deadlocks by implementing a background thread to continuously drain output, ensuring smooth command execution without blocking. 2026-02-19 09:26:31 -08:00
teknium1
061fa70907 Add background process management with process tool, wait, PTY, and stdin support
New process registry and tool for managing long-running background processes
across all terminal backends (local, Docker, Singularity, Modal, SSH).

Process Registry (tools/process_registry.py):
- ProcessSession tracking with rolling 200KB output buffer
- spawn_local() with optional PTY via ptyprocess for interactive CLIs
- spawn_via_env() for non-local backends (runs inside sandbox, never on host)
- Background reader threads per process (Popen stdout or PTY)
- wait() with timeout clamping, interrupt support, and transparent limit reporting
- JSON checkpoint to ~/.hermes/processes.json for gateway crash recovery
- Module-level singleton shared across agent loop, gateway, and RL

Process Tool (model_tools.py):
- 7 actions: list, poll, log, wait, kill, write, submit
- Paired with terminal in all toolsets (CLI, messaging, RL)
- Timeout clamping with transparent notes in response

Terminal Tool Updates (tools/terminal_tool.py):
- Replaced nohup background mode with registry spawn (returns session_id)
- Added workdir parameter for per-command working directory
- Added check_interval parameter for gateway auto-check watchers
- Added pty parameter for interactive CLI tools (Codex, Claude Code)
- Updated TERMINAL_TOOL_DESCRIPTION with full background workflow docs
- Cleanup thread now respects active background processes (won't reap sandbox)

Gateway Integration (gateway/run.py, session.py, config.py):
- Session reset protection: sessions with active processes exempt from reset
- Default idle timeout increased from 2 hours to 24 hours
- from_dict fallback aligned to match (was 120, now 1440)
- session_key env var propagated to process registry for session mapping
- Crash recovery on gateway startup via checkpoint probe
- check_interval watcher: asyncio task polls process, delivers updates to platform

RL Safety (environments/):
- tool_context.py cleanup() kills background processes on episode end
- hermes_base_env.py warns when enabled_toolsets is None (loads all tools)
- Process tool safe in RL via wait() blocking the agent loop

Also:
- Added ptyprocess as optional dependency (in pyproject.toml [pty] extra + [all])
- Fixed pre-existing bug: rl_test_inference missing from TOOL_TO_TOOLSET_MAP
- Updated AGENTS.md with process management docs and project structure
- Updated README.md terminal section with process management overview
2026-02-17 02:51:31 -08:00
teknium1
c33feb6dc9 Fix host CWD leaking into non-local terminal backends
When using Modal, Docker, SSH, or Singularity as the terminal backend
from the CLI, the agent resolved cwd: "." to the host machine's local
path (e.g. /Users/rewbs/code/hermes-agent) and passed it to the remote
sandbox, where it doesn't exist. All commands failed with "No such file
or directory".

Root cause: cli.py unconditionally resolved "." to os.getcwd() and wrote
it to TERMINAL_CWD regardless of backend type. Every tool then used that
host-local path as the working directory inside the remote environment.

Fixes:
- cli.py: only resolve "." to os.getcwd() for the local backend. For all
  remote backends (ssh, docker, modal, singularity), leave TERMINAL_CWD
  unset so the tool layer uses per-backend defaults (/root, /, ~, etc.)
- terminal_tool.py: added sanity check -- if TERMINAL_CWD contains a
  host-local prefix (/Users/, /home/, C:\) for a non-local backend, log
  a warning and fall back to the backend's default
- terminal_tool.py: SSH default CWD is now ~ instead of os.getcwd()
- file_operations.py: last-resort CWD fallback changed from os.getcwd()
  to "/" so host paths never leak into remote file operations
2026-02-16 22:30:04 -08:00
teknium1
8117d0adab Refactor file operations and environment management in file_tools and terminal_tool
- Improved the caching mechanism for ShellFileOperations to ensure stale entries are invalidated when environments are cleaned up.
- Enhanced thread safety by refining the use of locks during environment creation and cleanup processes.
- Streamlined the cleanup of inactive environments to prevent blocking other tool calls, ensuring efficient resource management.
- Added error handling and messaging improvements for better user feedback during environment cleanup.
2026-02-16 19:37:40 -08:00
teknium1
01a3a6ab0d Implement cleanup guard to prevent multiple executions on exit
- Introduced a new cleanup function that ensures terminal and browser sessions are cleaned up only once during application exit.
- Updated atexit registration to use the new cleanup function, enhancing resource management and preventing potential issues from multiple cleanup calls.
- Modified terminal cleanup messaging to only display when environments are cleaned, improving user feedback.
2026-02-16 02:43:45 -08:00
teknium1
f5be6177b2 Add Text-to-Speech (TTS) functionality with multiple providers
Add tool previews

Add AGENTS and SOUL.md support

Add Exec Approval
2026-02-12 10:05:08 -08:00
teknium
f23856df8e Add kill_modal script to manage Modal applications and better handling of file and terminal tools
- Introduced a new script, `kill_modal.sh`, to facilitate stopping running Modal apps, including the ability to stop all apps or specific swe-rex sandboxes.
- Enhanced user experience with clear usage instructions and feedback during the stopping process.
- Improved error handling to ensure smooth execution even if some apps fail to stop.
2026-02-12 05:37:14 +00:00
teknium1
cfe2f3fe15 Implement interrupt handling for long-running tool executions in AIAgent
- Added functionality to signal and terminate long-running terminal commands when a new user message is received, allowing for immediate agent response.
- Introduced a global interrupt event in the terminal tool to facilitate early termination of subprocesses.
- Updated the AIAgent class to handle interrupts gracefully, ensuring that remaining tool calls are skipped and appropriate messages are returned to maintain valid message sequences.
2026-02-10 16:34:27 -08:00
teknium
999a28062d Implement graceful exit cleanup for terminal tool
- Added a new `_atexit_cleanup` function to handle cleanup of active environments and stop the cleanup thread upon program exit.
- Enhanced logging to inform users about the number of remaining sandboxes being shut down during cleanup.
2026-02-10 22:53:44 +00:00
teknium
35ad3146a8 Add new environments and enhance tool context functionality
- Introduced new environments: Terminal Test Environment and SWE Environment, each with default configurations for testing and software engineering tasks.
- Added TerminalBench 2.0 evaluation environment with comprehensive setup for agentic LLMs, including task execution and verification.
- Enhanced ToolContext with methods for uploading and downloading files, ensuring binary-safe operations.
- Updated documentation across environments to reflect new features and usage instructions.
- Refactored existing environment configurations for consistency and clarity.
2026-02-10 19:39:05 +00:00
teknium
e8343f2d87 Refactor Singularity environment for persistent container management
- Updated the _SingularityEnvironment class to utilize a persistent Apptainer instance, allowing state (files, installs, environment changes) to persist across commands.
- Enhanced the initialization process to start a background instance with full isolation and writable filesystem.
- Modified the execute method to connect to the running instance, ensuring commands run within the same container context.
- Implemented cleanup functionality to stop the persistent instance on cleanup or destruction, improving resource management.
- Updated class documentation to reflect new features and usage of the persistent environment.
2026-02-10 06:49:58 +00:00
teknium
7a11be9f3f Enhance browser tool functionality and cleanup process
- Added checks for local installation of the agent-browser CLI in the `_find_agent_browser` function, improving installation guidance.
- Implemented per-task socket directory management in `_run_browser_command` to prevent concurrency issues.
- Updated `cleanup_browser` to remove per-task socket directories, ensuring proper resource cleanup after task completion.
- Refactored comments for clarity and improved documentation throughout the browser tool code.
2026-02-09 04:36:37 +00:00
teknium1
c441681dc2 Update default model to 'anthropic/claude-opus-4.6' and refine terminal working directory settings
- Changed the default LLM model in the setup wizard and example environment file to 'anthropic/claude-opus-4.6'.
- Updated terminal working directory settings in CLI and related files to use the current directory ('.') instead of '/tmp'.
- Enhanced documentation comments for clarity on terminal configuration and working directory behavior.
2026-02-08 12:56:40 -08:00
teknium
d999d9876d Enhance async tool execution and error handling in Hermes agent for Atropos integration
- Updated `.gitignore` to exclude `testlogs` directory.
- Refactored `handle_web_function_call` in `model_tools.py` to support running async functions in existing event loops, improving compatibility with Atropos.
- Introduced a thread pool executor in `agent_loop.py` for running synchronous tool calls that internally use `asyncio.run()`, preventing deadlocks.
- Added `ToolError` class to track tool execution errors, enhancing error reporting during agent loops.
- Updated `wandb_log` method in `hermes_base_env.py` to log tool error statistics for better monitoring.
- Implemented patches in `patches.py` to ensure async-safe operation of tools within Atropos's event loop.
- Enhanced `ToolContext` and `terminal_tool.py` to utilize the new async handling, improving overall tool execution reliability.
2026-02-08 05:00:47 +00:00
teknium1
5d3398aa8a Refactor terminal tool command approval process and enhance CLI feedback
- Updated the terminal tool's command approval flow to improve user interaction when executing potentially dangerous commands, replacing the previous confirmation method with a clear explanation and instructions for adding commands to the allowlist.
- Removed the internal `force` parameter from the model API, ensuring that dangerous command approvals are handled solely through user prompts.
- Enhanced the CLI to provide better feedback regarding tool availability, including improved messaging for enabled and disabled toolsets.
- Updated AGENTS.md to reflect changes in the command approval process and configuration instructions.
2026-02-02 23:46:41 -08:00
teknium1
76d929e177 Implement dangerous command approval system for terminal tool
- Added a safety mechanism to detect and approve potentially dangerous commands (e.g., `rm -rf`, `DROP TABLE`).
- Introduced an approval flow for local/SSH backends, prompting users for confirmation with options to allow once, for the session, or permanently.
- Updated configuration to include a `command_allowlist` for storing approved patterns.
- Enhanced messaging for sudo failures in messaging contexts.
- Updated relevant documentation in AGENTS.md and TODO.md to reflect these changes.
2026-02-02 23:35:18 -08:00
teknium1
3488576bd8 Update terminal configuration and enhance CLI model management
- Changed default Docker, Singularity, and Modal images in configuration files to use "nikolaik/python-nodejs:python3.11-nodejs20" for improved compatibility.
- Updated the default model in the configuration to "anthropic/claude-sonnet-4.5" and adjusted related setup prompts for API provider configuration.
- Introduced a new CLI option for selecting a custom OpenAI-compatible endpoint, enhancing flexibility in model provider setup.
- Enhanced the prompt choice functionality to support arrow key navigation for better user experience in CLI interactions.
- Updated documentation in relevant files to reflect these changes and improve user guidance.
2026-02-02 19:13:41 -08:00
teknium1
bbeed5b5d1 Enhance session logging and interactive sudo support
- Implemented automatic session logging, saving conversation trajectories to the `logs/` directory in JSON format, with each session having a unique identifier.
- Updated the CLI to display the session ID in the welcome banner for easy reference.
- Introduced an interactive sudo password prompt in CLI mode, allowing users to enter their password with a 45-second timeout, enhancing user experience during command execution.
- Documented session logging and interactive sudo features in `README.md`, `cli.md`, and `cli-config.yaml.example` for better user guidance.
2026-02-01 15:36:26 -08:00
teknium1
971ed2bbdf Implement sudo support across terminal environments
- Added support for sudo commands in local, Docker, Singularity, and SSH environments by introducing the `SUDO_PASSWORD` environment variable.
- Updated terminal tool configurations in `.env.example` and `cli-config.yaml.example` to document the new sudo functionality.
- Enhanced the command execution process to handle sudo commands gracefully, preventing hangs on interactive prompts and providing clear error messages when no password is configured.
- Updated `README.md` to include instructions for using sudo support and SSH backend configuration.
- Revised `TODO.md` to reflect the completion of the sudo feature and outline future enhancements.
2026-02-01 10:02:34 -08:00
teknium
bc76a032ba Add a claude code-like CLI
- Introduced `cli-config.yaml.example` to provide a template for configuring the CLI behavior, including model settings, terminal tool configurations, agent behavior, and toolsets.
- Created `cli.py` for an interactive terminal interface, allowing users to start the Hermes Agent with various options and toolsets.
- Added `hermes` launcher script for convenient CLI access.
- Updated `model_tools.py` to support quiet mode for suppressing output during tool initialization and execution.
- Enhanced logging in various tools to respect quiet mode, improving user experience by reducing unnecessary output.
- Added `prompt_toolkit` to `requirements.txt` for improved CLI interaction capabilities.
- Created `TODO.md` for future improvements and enhancements to the Hermes Agent framework.
2026-01-31 06:30:48 +00:00
teknium
771cf41fea Update environment configuration and enhance terminal tool integration
- Modified `.env.example` to set the default terminal environment to 'singularity' and updated Docker and Singularity image references for better compatibility.
- Enhanced `run_mixed_tasks.sh` and `run_terminal_tasks.sh` scripts to utilize the new Singularity setup, including improved logging and cache directory management.
- Introduced functionality in `terminal_tool.py` to automatically build and cache SIF images from Docker URLs, streamlining the execution environment setup.
- Updated logging messages for clarity on image usage and cache directory paths.
2026-01-29 22:47:11 +00:00
teknium
248acf715e Add browser automation tools and enhance environment configuration
- Introduced new browser automation tools in `browser_tool.py` for navigating, interacting with, and extracting content from web pages using the agent-browser CLI and Browserbase cloud execution.
- Updated `.env.example` to include new configuration options for Browserbase API keys and session settings.
- Enhanced `model_tools.py` and `toolsets.py` to integrate browser tools into the existing tool framework, ensuring consistent access across toolsets.
- Updated `README.md` with setup instructions for browser tools and their usage examples.
- Added new test script `test_modal_terminal.py` to validate Modal terminal backend functionality.
- Improved `run_agent.py` to support browser tool integration and logging enhancements for better tracking of API responses.
2026-01-29 06:10:24 +00:00
teknium
ba19d530ad Update environment configuration and enhance terminal tool integration
- Updated `.env.example` to include new API keys and configuration options for the mini-swe-agent backend, including support for local, Docker, and Modal environments.
- Added `.gitmodules` to include mini-swe-agent as a submodule for easier integration.
- Refactored `mini_swe_runner.py` to use the updated model format and default to OpenRouter for API calls.
- Enhanced `model_tools.py` to support the new terminal tool definitions and ensure compatibility with the mini-swe-agent backend.
- Updated `README.md` to reflect changes in setup instructions and environment variable configurations.
- Improved `terminal_tool.py` to manage execution environments and lifecycle, ensuring proper cleanup and error handling.
- Introduced `terminal_hecate.py` for executing commands on MorphCloud VMs, providing an alternative backend for terminal operations.
2026-01-23 12:26:53 +00:00
teknium
4071ba29da Enhance batch processing and tool validation
- Added support for tracking partial results and tool error counts in batch processing.
- Implemented filtering of corrupted entries during batch file combination based on valid tool names.
- Updated terminal tool to improve command execution and error handling, including retry logic for transient failures.
- Refactored model tools to use a simple terminal tool with no session persistence.
- Improved logging and error messages for invalid API responses and tool calls.
- Introduced chunked processing for large content in web tools to manage size limitations effectively.
2026-01-10 05:56:26 +00:00
hjc-puro
0fbc0475f3 update snapshot id for ipython 2025-11-05 02:11:25 -05:00
Teknium
4135cf4682
Merge branch 'main' into test 2025-11-04 19:54:40 -08:00
teknium
c82741c3d8 some cleanups 2025-11-05 03:47:17 +00:00
hjc-puro
fbd3a2fdb8 prevent leakage of morph instances between tasks 2025-11-04 03:32:43 -05:00
hjc-puro
a4db3fdee5 fix leakage 2025-11-03 17:42:23 -05:00
hjc-puro
0ca3e0aaa9 update snapshot 2025-11-02 23:13:49 -05:00
hjc-puro
a6ec79730c terminal tool 2025-11-02 08:57:04 +08:00
hjc-puro
faecbddd9b fix terminal interactivity 2025-11-02 08:52:05 +08:00
teknium
6fac6fecde Enhance import handling for Hecate in terminal_tool.py to manage local folder shadowing and improve error reporting for import failures. 2025-10-03 09:46:44 +00:00
teknium
a7ff4d49e9 A bit of restructuring for simplicity and organization 2025-10-01 23:29:25 +00:00
teknium
0411ca1880 Add environment configuration file, restructure tool imports, and enhance README setup instructions 2025-10-01 09:54:17 +00:00