hermes-agent/agent
Peppi Littera ec5fdb8b92 feat: query local servers for actual context window size
Custom endpoints (LM Studio, Ollama, vLLM, llama.cpp) silently fall
back to 2M tokens when /v1/models doesn't include context_length.

Adds _query_local_context_length() which queries server-specific APIs:
- LM Studio: /api/v1/models (max_context_length + loaded instances)
- Ollama: /api/show (model_info + num_ctx parameters)
- llama.cpp: /props (n_ctx from default_generation_settings)
- vLLM: /v1/models/{model} (max_model_len)
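
The field lookups listed above could be sketched roughly as follows. This is an illustration only, not the code in model_metadata.py: extract_context_length() and _as_positive_int() are hypothetical names, the payload shapes are assumptions based on the fields named in this commit message, and letting an explicit num_ctx win over model_info for Ollama is a guess. The function works on already-decoded JSON, so no HTTP client is involved.

```python
import re
from typing import Any, Optional


def _as_positive_int(value: Any) -> Optional[int]:
    """Return value as a positive int, or None if it is missing or invalid."""
    if isinstance(value, bool):
        return None
    try:
        number = int(value)
    except (TypeError, ValueError, OverflowError):
        return None
    return number if number > 0 else None


def extract_context_length(server_type: str, payload: dict) -> Optional[int]:
    """Pull a context length out of an already-decoded JSON payload.

    Hypothetical helper: the payload shapes below are assumptions.
    """
    if server_type == "ollama":
        # /api/show: assume "parameters" is a text block with lines such as
        # "num_ctx 8192"; an explicit num_ctx is preferred when present.
        params = payload.get("parameters")
        if isinstance(params, str):
            match = re.search(r"^\s*num_ctx\s+(\d+)", params, re.MULTILINE)
            if match:
                return _as_positive_int(match.group(1))
        # Otherwise look for an architecture-prefixed key in model_info,
        # e.g. "llama.context_length".
        for key, value in (payload.get("model_info") or {}).items():
            if key.endswith(".context_length"):
                return _as_positive_int(value)
        return None
    if server_type == "llamacpp":
        # /props: n_ctx nested under default_generation_settings.
        settings = payload.get("default_generation_settings") or {}
        return _as_positive_int(settings.get("n_ctx"))
    if server_type == "vllm":
        # /v1/models/{model}: max_model_len on the model object.
        return _as_positive_int(payload.get("max_model_len"))
    if server_type == "lmstudio":
        # /api/v1/models entry: only the model maximum is read here.
        return _as_positive_int(payload.get("max_context_length"))
    return None
```

A caller would fetch the server-specific endpoint, decode the JSON, and pass the result in; a None return means the caller should fall through to its next detection strategy.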

Prefers the loaded instance's context length over the model maximum (e.g., 122K loaded vs 1M max).
Results are cached via save_context_length() to avoid repeated queries.

Also fixes detect_local_server_type() misidentifying LM Studio as
Ollama (LM Studio returns 200 for /api/tags with an error body).
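
The misidentification suggests that a 200 status alone is not enough to recognise Ollama. One way to guard against it, sketched here with a hypothetical is_ollama_tags_response() helper, is to check the shape of the decoded body as well. The assumptions are that a genuine Ollama /api/tags reply is a JSON object with a "models" list and that LM Studio's error body carries an "error" key; neither is taken from the actual patch.

```python
from typing import Any


def is_ollama_tags_response(status_code: int, body: Any) -> bool:
    """Decide whether a /api/tags reply really looks like Ollama's."""
    # A non-200 status or a non-object body cannot be a valid tags listing.
    if status_code != 200 or not isinstance(body, dict):
        return False
    # Assumed LM Studio behaviour: HTTP 200 with an error object in the body.
    if "error" in body:
        return False
    # Assumed Ollama behaviour: a "models" list, possibly empty.
    return isinstance(body.get("models"), list)
```

Checking for the expected key rather than only the absence of "error" also rejects other servers that answer unknown paths with a 200 and an unrelated body.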
2026-03-19 21:32:04 +01:00
__init__.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
anthropic_adapter.py fix(anthropic): tool_choice 'none' still allowed tool calls 2026-03-17 04:02:49 -07:00
auxiliary_client.py fix: respect config.yaml model.base_url for Anthropic provider (#1948) (#1998) 2026-03-18 16:51:24 -07:00
context_compressor.py fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051) 2026-03-19 06:01:16 -07:00
copilot_acp_client.py feat: integrate GitHub Copilot providers across Hermes 2026-03-17 23:40:22 -07:00
display.py feat(tools): centralize tool emoji metadata in registry + skin integration 2026-03-15 20:21:21 -07:00
insights.py fix(security): eliminate SQL string formatting in execute() calls 2026-03-19 15:16:35 +01:00
model_metadata.py feat: query local servers for actual context window size 2026-03-19 21:32:04 +01:00
prompt_builder.py feat: use SOUL.md as primary agent identity instead of hardcoded default (#1922) 2026-03-18 04:11:20 -07:00
prompt_caching.py fix(cache_control) treat empty text like None to avoid anthropic api cache_control error 2026-03-13 18:08:46 -07:00
redact.py feat: secure skill env setup on load (core #688) 2026-03-13 03:14:04 -07:00
skill_commands.py fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897) 2026-03-18 03:17:37 -07:00
smart_model_routing.py feat: integrate GitHub Copilot providers across Hermes 2026-03-17 23:40:22 -07:00
title_generator.py feat: auto-generate session titles after first exchange 2026-03-17 04:14:40 -07:00
trajectory.py Refactor Terminal and AIAgent cleanup 2026-02-21 22:31:43 -08:00
usage_pricing.py feat: use endpoint metadata for custom model context and pricing (#1906) 2026-03-18 03:04:07 -07:00