hermes-agent/agent
Teknium 9231a335d4
fix(compression): replace dead summary_target_tokens with ratio-based scaling (#2554)
The summary_target_tokens parameter was accepted in the constructor,
stored on the instance, and never used — the summary budget was always
computed from hardcoded module constants (_SUMMARY_RATIO=0.20,
_MAX_SUMMARY_TOKENS=8000). This caused two compounding problems:

1. The config value was silently ignored, giving users no control
   over post-compression size.
2. Fixed budgets (20K tail, 8K summary cap) didn't scale with
   context window size. Switching from a 1M-context model to a
   200K model would trigger compression that nuked 350K tokens
   of conversation history down to ~30K.

Changes:
- Replace summary_target_tokens with summary_target_ratio (default 0.40)
  which sets the post-compression target as a fraction of context_length.
  Tail token budget and summary cap now scale proportionally:
    MiniMax 200K → ~80K post-compression
    GPT-5   1M  → ~400K post-compression
- Change threshold_percent default: 0.50 → 0.80 (don't fire until
  80% of context is consumed)
- Change protect_last_n default: 4 → 20 (preserve ~10 full turns)
- Summary token cap scales to 5% of context (was fixed 8K), capped
  at 32K ceiling
- Read target_ratio and protect_last_n from config.yaml compression
  section (both are now configurable)
- Remove hardcoded summary_target_tokens=500 from run_agent.py
- Add 5 new tests for ratio scaling, clamping, and new defaults
2026-03-24 17:45:49 -07:00
..
__init__.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
anthropic_adapter.py fix: Alibaba/DashScope: preserve model dots (qwen3.5-plus) and fix 401 auth 2026-03-21 09:38:04 -07:00
auxiliary_client.py fix(cli): prevent 'Press ENTER to continue...' on exit 2026-03-22 15:31:54 -07:00
context_compressor.py fix(compression): replace dead summary_target_tokens with ratio-based scaling (#2554) 2026-03-24 17:45:49 -07:00
context_references.py fix(context): restrict @ references to safe workspace paths (#2601) 2026-03-23 06:40:05 -07:00
copilot_acp_client.py fix(acp): preserve leading whitespace in streaming chunks 2026-03-20 09:38:13 -07:00
display.py fix: reorder setup wizard providers — OpenRouter first 2026-03-24 12:50:24 -07:00
insights.py fix(security): eliminate SQL string formatting in execute() calls 2026-03-19 15:16:35 +01:00
model_metadata.py fix(model_metadata): skip endpoint probe for known providers (Copilot context bug) (#2507) 2026-03-22 08:15:06 -07:00
models_dev.py fix: 6 bugs in model metadata, reasoning detection, and delegate tool 2026-03-20 08:52:37 -07:00
prompt_builder.py feat: priority-based context file selection + CLAUDE.md support (#2301) 2026-03-21 06:26:20 -07:00
prompt_caching.py fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter 2026-03-21 16:54:43 -07:00
redact.py fix(redact): safely handle non-string inputs 2026-03-21 16:55:02 -07:00
skill_commands.py fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897) 2026-03-18 03:17:37 -07:00
smart_model_routing.py feat: integrate GitHub Copilot providers across Hermes 2026-03-17 23:40:22 -07:00
title_generator.py feat: auto-generate session titles after first exchange 2026-03-17 04:14:40 -07:00
trajectory.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
usage_pricing.py feat: use endpoint metadata for custom model context and pricing (#1906) 2026-03-18 03:04:07 -07:00