hermes-agent/tests/tools
Teknium 97acd66b4c
fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731)
* fix(curator): authoritative absorbed_into declarations on skill delete

Closes #18671. The classification pipeline that feeds cron-ref rewriting
used to infer consolidation vs pruning from two brittle signals: the
curator model's post-hoc YAML summary block, and a substring heuristic
scanning other tool calls for the removed skill's name. Both miss in
real consolidations — the model forgets the YAML under reasoning
pressure, and the heuristic misses when the umbrella's patch content
describes the absorbed behavior abstractly instead of naming the old
slug. When both miss, the skill falls through to 'no-evidence fallback'
pruned, and #18253's cron rewriter drops the cron ref entirely instead
of mapping it to the umbrella. Same observable symptom as pre-#18253:
'Skill(s) not found and skipped' at the next cron run.

The fix makes the model declare intent at the moment of deletion.
skill_manage(action='delete') now accepts absorbed_into:
  - absorbed_into='<umbrella>'  -> consolidated, target must exist on disk
  - absorbed_into=''            -> explicit prune, no forwarding target
  - missing                     -> legacy path, falls through to heuristic/YAML

The curator reconciler reads these declarations off llm_meta.tool_calls
BEFORE either the YAML block or the substring heuristic. Declaration
wins. Fallback logic stays intact for backward compat with any caller
(human or older curator conversation) that doesn't populate the arg.

Changes
- tools/skill_manager_tool.py: add absorbed_into param to skill_manage
  + _delete_skill. Validate target exists when non-empty. Reject
  absorbed_into=<self>. Wire through dispatcher + registry + schema.
- agent/curator.py: new _extract_absorbed_into_declarations() walks
  tool calls for skill_manage(delete) with the arg. _reconcile_classification
  accepts absorbed_declarations= and treats them as authoritative. Curator
  prompt updated to require the arg on every delete.
- Tests: 7 new skill_manager tests covering the tool contract (valid
  target, empty string, nonexistent target, self-reference, whitespace,
  backward compat, dispatcher plumbing). 11 new curator tests covering
  the extractor + authoritative reconciler path + mixed-legacy-and-
  declared runs.

Validation
- 307/307 targeted tests pass (curator + cron + skill_manager suites).
- E2E #18671 repro: 3 narrow skills, 1 umbrella, cron job referencing
  all 3. Model emits NO YAML block. Heuristic misses (patch prose
  doesn't name old slugs). Delete calls carry absorbed_into. Result:
  both PR skills correctly classified 'consolidated' + cron rewritten
  ['pr-review-format', 'pr-review-checklist', 'stale-junk'] ->
  ['hermes-agent-dev']; stale-junk pruned via absorbed_into=''.
- E2E backward-compat: delete without absorbed_into, model emits YAML
  -> routed via existing 'model' source, cron still rewritten correctly.

* feat(curator): capture + restore cron skill links across snapshot/rollback

Before this, rolling back a curator run restored the skills tree but cron
jobs still pointed at the umbrella skills the curator had rewritten them
to. The user would see their old narrow skills back on disk but their
cron jobs still configured with the merged umbrella — not actually 'back
to how it was'.

Snapshot side: snapshot_skills() now captures ~/.hermes/cron/jobs.json
alongside the skills tarball, as cron-jobs.json. The manifest gets a new
'cron_jobs' block with {backed_up, jobs_count} so rollback (and the CLI
confirm dialog) can surface what's in the snapshot. If jobs.json is
missing/unreadable/malformed, snapshot proceeds without cron data — the
skills backup is the core guarantee; cron is additive.

Rollback side: after the skills extract succeeds, the new
_restore_cron_skill_links() reconciles the backed-up jobs into the live
jobs.json SURGICALLY. Only 'skills' and 'skill' fields are restored, and
only on jobs matched by id. Everything else about a cron job — schedule,
last_run_at, next_run_at, enabled, prompt, workdir, hooks — is live
state the user or scheduler has modified since the snapshot; overwriting
it would regress unrelated activity.

Reconciliation rules:
- Job in backup AND live, skills differ  → skills restored.
- Job in backup AND live, skills match   → no-op.
- Job in backup, NOT in live             → skipped (user deleted it
                                              after snapshot; their choice
                                              is later than the snapshot).
- Job in live, NOT in backup             → untouched (user created it
                                              after snapshot).
- Snapshot missing cron-jobs.json at all → rollback still succeeds,
                                              reports 'not captured'
                                              (older pre-feature snapshots
                                              keep working).

Writes go through cron.jobs.save_jobs under the same _jobs_file_lock the
scheduler uses, so rollback doesn't race tick().

Also:
- hermes_cli/curator.py: rollback confirm dialog now shows
  'cron jobs: N (will be restored for skill-link fields only)' when the
  snapshot has cron data, or 'not in snapshot (<reason>)' otherwise.
- rollback()'s message string includes a 'cron links: ...' clause
  summarizing the reconciliation outcome.

Tests
- 9 new cases: snapshot-with-cron, snapshot-without-cron, malformed-json
  captured-as-raw, full rollback-restores-skills-and-cron, rollback
  touches only skill fields, rollback skips user-deleted jobs, rollback
  leaves user-created jobs untouched, rollback still works with
  pre-feature snapshot that has no cron-jobs.json, standalone unit test
  on _restore_cron_skill_links exercising the full report shape.

Validation
- 484/484 targeted tests pass (curator + cron + skill_manager suites).
- E2E: real snapshot_skills, real cron rewrite, real rollback. Before:
  ['pr-review-format', 'pr-review-checklist', 'pr-triage-salvage'].
  After curator: ['hermes-agent-dev']. After rollback: ['pr-review-format',
  'pr-review-checklist', 'pr-triage-salvage']. Non-skill fields (id,
  name, prompt) preserved across the round trip.
2026-05-02 01:29:57 -07:00
..
__init__.py
test_accretion_caps.py fix(ci): recover 38 failing tests on main (#17642) 2026-04-29 20:05:32 -07:00
test_ansi_strip.py
test_approval_heartbeat.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_approval_plugin_hooks.py feat(plugins): add pre_approval_request / post_approval_response hooks (#16776) 2026-04-27 20:08:33 -07:00
test_approval.py fix(approval): close remaining prompt_toolkit deadlock vectors (#15216) 2026-04-27 06:42:32 -07:00
test_base_environment.py fix(env): safely quote ~/ subpaths in wrapped cd commands 2026-04-24 15:25:12 -07:00
test_browser_camofox_persistence.py docs: remove nonexistent CAMOFOX_PROFILE_DIR env var references (#10976) 2026-04-16 04:07:11 -07:00
test_browser_camofox_state.py test: stop testing mutable data — convert change-detectors to invariants (#13363) 2026-04-20 23:20:33 -07:00
test_browser_camofox.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_browser_cdp_override.py Support browser CDP URL from config 2026-04-17 16:05:04 -07:00
test_browser_cdp_tool.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_browser_chromium_check.py fix(browser): detect missing Chromium and fail fast with actionable error (#17039) 2026-04-28 07:03:44 -07:00
test_browser_cleanup.py
test_browser_cloud_fallback.py fix(browser): runtime fallback to local Chromium when cloud provider fails 2026-04-16 04:19:34 -07:00
test_browser_console.py fix(browser): honor auxiliary.vision.temperature for screenshot analysis\n\n- mirror the vision tool's config bridge in browser_vision 2026-04-20 00:32:09 -07:00
test_browser_content_none_guard.py
test_browser_hardening.py
test_browser_homebrew_paths.py fix(browser): detect missing Chromium and fail fast with actionable error (#17039) 2026-04-28 07:03:44 -07:00
test_browser_hybrid_routing.py feat(browser): auto-spawn local Chromium for LAN/localhost URLs in cloud mode (#16136) 2026-04-26 09:57:58 -07:00
test_browser_orphan_reaper.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_browser_secret_exfil.py
test_browser_ssrf_local.py fix(security): treat quoted false as false in browser SSRF guards 2026-04-26 18:27:13 -07:00
test_browser_supervisor_healthcheck.py test(browser_supervisor): cover cache-hit healthcheck on dead thread/loop 2026-04-30 20:33:33 -07:00
test_browser_supervisor.py feat(browser): CDP supervisor — dialog detection + response + cross-origin iframe eval (#14540) 2026-04-23 22:23:37 -07:00
test_budget_config.py
test_checkpoint_manager.py feat(checkpoints): auto-prune orphan and stale shadow repos at startup (#16303) 2026-04-26 19:05:52 -07:00
test_clarify_tool.py
test_clipboard.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_code_execution_modes.py feat(execute_code): add project/strict execution modes, default to project (#11971) 2026-04-18 01:46:25 -07:00
test_code_execution.py fix(tools): serialize concurrent hermes_tools RPC calls from execute_code 2026-04-30 03:31:16 -07:00
test_command_guards.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_config_null_guard.py
test_credential_files.py
test_credential_pool_env_fallback.py fix(auth): hoist get_env_value import + strengthen .env fallback tests 2026-04-26 08:32:09 -07:00
test_cron_approval_mode.py feat(approval): hardline blocklist for unrecoverable commands (#15878) 2026-04-25 22:07:12 -07:00
test_cron_prompt_injection.py
test_cronjob_tools.py fix(cron): accept list-form deliver values so deliver=['telegram'] works (#17456) 2026-04-29 06:35:34 -07:00
test_daytona_environment.py
test_debug_helpers.py
test_delegate_subagent_timeout_diagnostic.py feat(delegate): diagnostic dump when a subagent times out with 0 API calls (#15105) 2026-04-24 04:58:32 -07:00
test_delegate_toolset_scope.py
test_delegate.py fix(delegate): honor runtime default model during provider resolution 2026-04-30 19:58:55 -07:00
test_discord_tool.py test(discord_tool): add regression test for per-token capability cache 2026-04-30 20:37:12 -07:00
test_docker_environment.py feat(docker): run container as host user to avoid root-owned bind mounts 2026-04-29 16:16:43 +10:00
test_docker_find.py
test_dockerfile_pid1_reaping.py chore(salvage): strip duplicated/merge-corrupted blocks from PR #17664 2026-04-29 21:56:51 -07:00
test_env_passthrough.py fix(env_passthrough): reject Hermes provider credentials from skill passthrough (#13523) 2026-04-21 06:14:25 -07:00
test_feishu_tools.py feat: add Feishu document comment intelligent reply with 3-tier access control 2026-04-17 19:04:11 -07:00
test_file_operations_edge_cases.py tools: normalize file tool pagination bounds 2026-04-22 06:11:41 -07:00
test_file_operations.py tools: normalize file tool pagination bounds 2026-04-22 06:11:41 -07:00
test_file_ops_cwd_tracking.py fix(file-ops): follow terminal env's live cwd in _exec instead of init-time cached cwd (#11912) 2026-04-17 19:26:40 -07:00
test_file_read_guards.py fix(file-tools): escalate to BLOCKED on repeated read_file dedup stubs (#16382) 2026-04-27 00:17:26 -07:00
test_file_staleness.py fix(file_tools): resolve bookkeeping paths against live terminal cwd 2026-04-23 15:11:52 -07:00
test_file_state_registry.py feat(delegate): cross-agent file state coordination for concurrent subagents (#13718) 2026-04-21 16:41:26 -07:00
test_file_sync_back.py fix(ci): stabilize current main test regressions 2026-04-30 06:36:50 -07:00
test_file_sync_perf.py
test_file_sync.py
test_file_tools_container_config.py fix(docker): pass docker_mount_cwd_to_workspace and docker_forward_env to container_config in file_tools 2026-04-20 00:58:16 -07:00
test_file_tools_live.py
test_file_tools.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_file_write_safety.py
test_force_dangerous_override.py
test_fuzzy_match.py fix(patch): gate 'did you mean?' to no-match + extend to v4a/skill_manage 2026-04-21 02:03:46 -07:00
test_hardline_blocklist.py fix: address self-review findings for Vercel Sandbox salvage 2026-04-29 07:22:33 -07:00
test_hidden_dir_filter.py
test_homeassistant_tool.py
test_image_generation_env.py Normalize FAL_KEY env handling (ignore whitespace-only values) 2026-04-21 02:04:21 -07:00
test_image_generation_plugin_dispatch.py fix(image-gen): force-refresh plugin providers in long-lived sessions 2026-04-23 03:01:18 -07:00
test_image_generation.py feat(image-gen): add GPT Image 2 to FAL catalog (#13677) 2026-04-21 13:35:31 -07:00
test_init_session_cwd_respect.py fix(cli): respect terminal.cwd config in local terminal backend 2026-04-28 22:16:08 -07:00
test_interrupt.py
test_kanban_tools.py feat(kanban): durable multi-profile collaboration board (#17805) 2026-04-30 13:36:47 -07:00
test_llm_content_none_guard.py
test_local_background_child_hang.py fix(environments): use incremental UTF-8 decoder in select-based drain 2026-04-19 11:27:50 -07:00
test_local_env_blocklist.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_local_interrupt_cleanup.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_local_shell_init.py fix(terminal): auto-source ~/.profile and ~/.bash_profile so n/nvm PATH survives (#14534) 2026-04-23 05:15:37 -07:00
test_local_tempdir.py
test_managed_browserbase_and_modal.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_managed_media_gateways.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_managed_modal_environment.py
test_managed_server_tool_support.py
test_managed_tool_gateway.py feat: ungate Tool Gateway — subscription-based access with per-tool opt-in 2026-04-16 12:36:49 -07:00
test_mcp_circuit_breaker.py test(mcp): add failing tests for circuit-breaker recovery 2026-04-21 05:19:03 -07:00
test_mcp_dynamic_discovery.py fix(ci): recover 38 failing tests on main (#17642) 2026-04-29 20:05:32 -07:00
test_mcp_oauth_bidirectional.py fix(mcp-oauth): bidirectional auth_flow bridge + absolute expires_at (salvage #12025) (#12717) 2026-04-19 16:31:07 -07:00
test_mcp_oauth_cold_load_expiry.py fix(mcp-oauth): bidirectional auth_flow bridge + absolute expires_at (salvage #12025) (#12717) 2026-04-19 16:31:07 -07:00
test_mcp_oauth_integration.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_oauth_manager.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_oauth.py fix(mcp-oauth): preserve server_url path for protected-resource validation (#16031) 2026-04-26 05:43:54 -07:00
test_mcp_probe.py
test_mcp_reconnect_signal.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_stability.py fix(cron): reap orphaned MCP stdio subprocesses after each tick 2026-04-26 18:21:20 -07:00
test_mcp_structured_content.py fix(ci): recover 38 failing tests on main (#17642) 2026-04-29 20:05:32 -07:00
test_mcp_tool_401_handling.py fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383) 2026-04-16 21:57:10 -07:00
test_mcp_tool_issue_948.py
test_mcp_tool_session_expired.py fix(mcp): auto-reconnect + retry once when the transport session expires (#13383) 2026-04-24 05:28:45 -07:00
test_mcp_tool.py fix(mcp): preserve nullable schema coercion 2026-04-28 04:58:03 -07:00
test_memory_tool_import_fallback.py
test_memory_tool.py
test_mixture_of_agents_tool.py chore(release): map devorun author + convert MoA defaults test to invariant 2026-04-23 15:14:11 -07:00
test_modal_bulk_upload.py
test_modal_sandbox_fixes.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_modal_snapshot_isolation.py
test_notify_on_complete.py
test_osv_check.py
test_parse_env_var.py guard terminal_tool import-time env parsing 2026-04-22 14:45:50 -07:00
test_patch_parser.py
test_process_registry.py fix(process): reconcile session.exited against real child exit in poll/wait (#17430) 2026-04-29 04:59:21 -07:00
test_read_loop_detection.py
test_registry.py test(toolsets): include kanban in expected post-#17805 toolset assertions 2026-04-30 19:43:03 -07:00
test_resolve_path.py fix(file_tools): resolve bookkeeping paths against live terminal cwd 2026-04-23 15:11:52 -07:00
test_rl_training_tool.py
test_schema_sanitizer.py fix: sanitize tool schemas for llama.cpp backends; restore MCP in TUI (#15032) 2026-04-24 02:44:46 -07:00
test_search_hidden_dirs.py
test_send_message_missing_platforms.py fix(send_message): deliver Matrix media via adapter 2026-04-15 17:37:43 -07:00
test_send_message_tool.py feat(gateway/signal): add support for multiple images sending 2026-04-30 04:28:08 -07:00
test_session_search.py fix(session_search): order recent mode by last activity instead of start time 2026-04-30 20:17:15 -07:00
test_shared_container_task_id.py feat(terminal): collapse subagent task_ids to shared container (#16177) 2026-04-26 11:55:02 -07:00
test_signal_media.py feat(send_message): add media delivery support for Signal 2026-04-20 13:24:15 -07:00
test_singularity_preflight.py
test_skill_env_passthrough.py
test_skill_improvements.py
test_skill_manager_tool.py fix(curator): authoritative absorbed_into on delete + restore cron skill links on rollback (#18671) (#18731) 2026-05-02 01:29:57 -07:00
test_skill_size_limits.py
test_skill_usage.py fix: use skill activity in curator status 2026-04-30 10:31:47 -07:00
test_skill_view_path_check.py
test_skill_view_traversal.py
test_skills_guard.py feat(skills-guard): gate agent-created scanner on config.skills.guard_agent_created (default off) 2026-04-23 06:20:47 -07:00
test_skills_hub_clawhub.py
test_skills_hub.py feat(skills): install skills from a direct HTTP(S) URL (#16323) 2026-04-26 20:57:10 -07:00
test_skills_sync.py feat(skills_sync): surface collision with reset-hint 2026-04-23 05:09:08 -07:00
test_skills_tool.py fix: address self-review findings for Vercel Sandbox salvage 2026-04-29 07:22:33 -07:00
test_slash_confirm.py feat(gateway,cli): confirm /reload-mcp to warn about prompt cache invalidation 2026-04-29 21:56:47 -07:00
test_spotify_client.py refactor(spotify): convert to built-in bundled plugin under plugins/spotify (#15174) 2026-04-24 07:06:11 -07:00
test_ssh_bulk_upload.py test(ssh): update tar pipe assertion for --no-overwrite-dir 2026-04-30 04:32:28 -07:00
test_ssh_environment.py fix(tools): keep SSH ControlMaster socket path under macOS 104-byte limit 2026-04-20 03:07:32 -07:00
test_symlink_prefix_confusion.py
test_sync_back_backends.py fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards 2026-04-16 19:39:21 -07:00
test_terminal_compound_background.py fix(terminal): rewrite A && B & to A && { B & } to prevent subshell leak 2026-04-19 16:53:11 -07:00
test_terminal_config_env_sync.py feat(docker): run container as host user to avoid root-owned bind mounts 2026-04-29 16:16:43 +10:00
test_terminal_exit_semantics.py
test_terminal_foreground_timeout_cap.py terminal: steer long-lived server commands to background mode 2026-04-19 16:47:20 -07:00
test_terminal_none_command_guard.py
test_terminal_output_transform_hook.py test: stop testing mutable data — convert change-detectors to invariants (#13363) 2026-04-20 23:20:33 -07:00
test_terminal_requirements.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_terminal_timeout_output.py
test_terminal_tool_pty_fallback.py
test_terminal_tool_requirements.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_terminal_tool.py fix(terminal): skip sudo prompt when local NOPASSWD sudo works 2026-04-30 20:38:09 -07:00
test_threaded_process_handle.py
test_tirith_security.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_todo_tool.py
test_tool_backend_helpers.py fix(cli): coerce use_gateway config flags in tool routing 2026-04-26 19:02:55 -07:00
test_tool_call_parsers.py
test_tool_output_limits.py feat(skills): add design-md skill for Google's DESIGN.md spec (#14876) 2026-04-23 21:51:19 -07:00
test_tool_result_storage.py
test_transcription_dotenv_fallback.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_transcription_tools.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_transcription.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_tts_command_providers.py feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
test_tts_dotenv_fallback.py fix(ci): stabilize main test suite regressions (#17660) 2026-04-29 23:18:55 -07:00
test_tts_gemini.py feat(tts): add Google Gemini TTS provider (#11229) 2026-04-16 14:23:16 -07:00
test_tts_kittentts.py feat(tts): complete KittenTTS integration (tools/setup/docs/tests) 2026-04-21 01:28:32 -07:00
test_tts_max_text_length.py fix(tts): use per-provider input-character caps instead of global 4000 (#13743) 2026-04-21 17:49:39 -07:00
test_tts_mistral.py feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
test_tts_piper.py feat(tts): add Piper as a native local TTS provider (closes #8508) (#17885) 2026-04-30 02:53:20 -07:00
test_tts_speed.py
test_url_safety.py fix(security): treat quoted false as false in browser SSRF guards 2026-04-26 18:27:13 -07:00
test_vercel_sandbox_environment.py feat: add Vercel Sandbox backend 2026-04-29 07:22:33 -07:00
test_vision_tools.py test: cover vision config temperature wiring\n\n- add regression tests for auxiliary.vision.temperature and timeout\n- add bugkill3r to AUTHOR_MAP for the salvaged commit 2026-04-20 00:32:09 -07:00
test_voice_cli_integration.py feat(voice): add cli beep toggle 2026-04-21 00:29:29 -07:00
test_voice_mode.py
test_watch_patterns.py fix(terminal): three-layer defense against watch_patterns notification spam (#15642) 2026-04-25 06:41:58 -07:00
test_web_tools_config.py feat(web): expose search result limit 2026-04-28 02:09:30 -07:00
test_web_tools_tavily.py
test_website_policy.py
test_windows_compat.py
test_write_deny.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
test_yolo_mode.py fix(approval): harden YOLO mode env parsing against quoted-bool strings 2026-04-30 20:37:37 -07:00
test_zombie_process_cleanup.py fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00