hermes-agent/tests
Teknium 346601ca8d
fix(context): invalidate stale Codex OAuth cache entries >= 400k (#15078)
PR #14935 added a Codex-aware context resolver but only new lookups
hit the live /models probe. Users who had run Hermes on gpt-5.5 / 5.4
BEFORE that PR already had the wrong value (e.g. 1,050,000 from
models.dev) persisted in ~/.hermes/context_length_cache.yaml, and the
cache-first lookup in get_model_context_length() returns it forever.

Symptom (reported in the wild by Ludwig, min heo, Gaoge on current
main at 6051fba9d, which is AFTER #14935):
  * Startup banner shows context usage against 1M
  * Compression fires late and then OpenAI hard-rejects with
    'context length will be reduced from 1,050,000 to 128,000'
    around the real 272k boundary.

Fix: when the step-1 cache returns a value for an openai-codex lookup,
check whether it's >= 400k. Codex OAuth caps every slug at 272k (live
probe values), so anything at or above 400k is by definition a
pre-#14935 leftover. Drop that entry from the on-disk cache and fall
through to step 5, which runs the live /models probe and re-persists
the correct value (or 272k from the hardcoded fallback if the probe
fails). Non-Codex providers and legitimately cached Codex entries at
272k are untouched.
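
A minimal sketch of the gating logic described in the Fix paragraph, not
the actual code in agent/model_metadata.py. The function name
resolve_context_length, the dict-backed cache keyed by model slug, and
the probe callable are all hypothetical stand-ins; only the
openai-codex provider name, the 400k threshold, and the 272k fallback
come from the description above.

```python
from typing import Callable, Dict, Optional

CODEX_PROVIDER = "openai-codex"
STALE_THRESHOLD = 400_000   # cached Codex values at or above this are treated as stale
CODEX_FALLBACK = 272_000    # hardcoded fallback used when the live probe fails


def resolve_context_length(
    model: str,
    provider: str,
    cache: Dict[str, int],                      # hypothetical in-memory stand-in for the YAML cache
    probe: Callable[[str], Optional[int]],      # hypothetical live /models probe; None on failure
) -> Optional[int]:
    """Cache-first lookup that refuses to trust stale Codex entries."""
    cached = cache.get(model)
    if cached is not None:
        if provider == CODEX_PROVIDER and cached >= STALE_THRESHOLD:
            # Stale pre-fix leftover: drop it and fall through to the probe.
            del cache[model]
        else:
            # Non-Codex providers and sane Codex values are returned as-is.
            return cached

    if provider == CODEX_PROVIDER:
        # Live probe, falling back to the hardcoded cap, then persist again.
        value = probe(model) or CODEX_FALLBACK
        cache[model] = value
        return value

    # Other providers' resolution steps are out of scope for this sketch.
    return None
```

With a cache of {"gpt-5.5": 1_050_000} and a probe that returns 272_000,
the stale entry is replaced and 272_000 is returned; a non-Codex entry
at 1M is returned unchanged.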

Changes:
- agent/model_metadata.py:
  * _invalidate_cached_context_length() — drop a single entry from
    context_length_cache.yaml and rewrite the file.
  * Step-1 cache check in get_model_context_length() now gates
    provider=='openai-codex' entries >= 400k through invalidation
    instead of returning them.
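
For illustration, dropping a single entry and rewriting the cache file
could look roughly like the sketch below. This is an assumption-laden
stand-in, not the real _invalidate_cached_context_length(): the helper
name invalidate_cached_entry is hypothetical, and JSON from the standard
library is used in place of the YAML format of
context_length_cache.yaml so the example stays self-contained.

```python
import json
from pathlib import Path


def invalidate_cached_entry(cache_path: Path, key: str) -> bool:
    """Remove one key from an on-disk cache file and rewrite the file.

    Returns True if an entry was removed, False otherwise.
    NOTE: JSON is a stand-in here; the real cache is a YAML file.
    """
    if not cache_path.exists():
        return False
    try:
        data = json.loads(cache_path.read_text(encoding="utf-8"))
    except (OSError, ValueError):
        # Unreadable or corrupt cache: nothing we can safely invalidate.
        return False
    if not isinstance(data, dict) or key not in data:
        return False

    del data[key]  # drop only the requested entry; others survive
    cache_path.write_text(json.dumps(data, indent=2), encoding="utf-8")
    return True
```

Returning a bool lets the caller tell whether anything changed on disk,
which is convenient when asserting that unrelated entries were left
alone.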

Tests (3 new in TestCodexOAuthContextLength):
- stale 1.05M Codex entry is dropped from disk AND re-resolved
  through the live probe to 272k; unrelated cache entries survive.
- fresh 272k Codex entry is respected (no probe call, no invalidation).
- non-Codex 1M entries (e.g. anthropic/claude-opus-4.6 on OpenRouter)
  are unaffected — the guard is strictly scoped to openai-codex.

Full tests/agent/test_model_metadata.py: 88 passed.
2026-04-24 04:46:07 -07:00
acp fix(acp): include MCP toolsets in ACP sessions 2026-04-24 03:04:42 -07:00
agent fix(context): invalidate stale Codex OAuth cache entries >= 400k (#15078) 2026-04-24 04:46:07 -07:00
cli fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
cron feat(cron): honor hermes tools config for the cron platform (#14798) 2026-04-23 15:48:50 -07:00
e2e refactor(commands): drop /provider, /plan handler, and clean up slash registry (#15047) 2026-04-24 03:10:52 -07:00
environments/benchmarks
fakes
gateway fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
hermes_cli fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
hermes_state fix(resume): redirect --resume to the descendant that actually holds the messages 2026-04-24 03:04:42 -07:00
honcho_plugin
integration
plugins feat(hindsight): optional bank_id_template for per-agent / per-user banks 2026-04-24 03:38:17 -07:00
run_agent fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
skills
tools fix(tests): resolve 17 persistent CI test failures (#15084) 2026-04-24 03:46:46 -07:00
tui_gateway Merge branch 'main' into fix/tui-provider-resolution 2026-04-22 11:47:49 -07:00
__init__.py
conftest.py
run_interrupt_test.py
test_account_usage.py
test_base_url_hostname.py
test_batch_runner_checkpoint.py
test_cli_file_drop.py
test_cli_skin_integration.py fix: align status bar skin tests with upstream main 2026-04-22 13:20:02 -07:00
test_ctx_halving_fix.py
test_empty_model_fallback.py
test_evidence_store.py
test_hermes_constants.py
test_hermes_logging.py
test_hermes_state.py
test_honcho_client_config.py
test_ipv4_preference.py
test_mcp_serve.py
test_mini_swe_runner.py
test_minimax_model_validation.py
test_minisweagent_path.py
test_model_picker_scroll.py
test_model_tools_async_bridge.py fix(core): ensure non-blocking executor shutdown on async timeout 2026-04-22 14:42:32 -07:00
test_model_tools.py
test_ollama_num_ctx.py
test_packaging_metadata.py
test_plugin_skills.py
test_project_metadata.py
test_retry_utils.py
test_sql_injection.py
test_subprocess_home_isolation.py
test_timezone.py
test_toolset_distributions.py
test_toolsets.py
test_trajectory_compressor_async.py
test_trajectory_compressor.py
test_transform_tool_result_hook.py
test_tui_gateway_server.py feat(tui): per-section visibility for the details accordion 2026-04-24 02:34:32 -05:00
test_utils_truthy_values.py