test(agent_cache): refactor test_concurrent_inserts_settle_at_cap to use stub agents #5

New Issue

claude-ceo-assistant · 2026-05-08T04:04:01Z

2026-05-08 04:04:01 +00:00

Severity: medium

Symptom

tests/gateway/test_agent_cache.py::TestAgentCacheSpilloverLive::test_concurrent_inserts_settle_at_cap passes when run in isolation (~30s) but fails under parallel xdist load when scheduled alongside other heavy tests, blocking the per-PR CI gate.

Root cause

The test creates 160 real AIAgent instances (8 threads × 20 inserts) inside a 30s pytest-timeout budget. Each AIAgent.__init__ does tool registry scan, MCP discovery, config loading, and file I/O. On a runner that's already saturated (the operator host runs at load avg 16-37 on 8 CPUs), the budget is too tight.

The test's INTENT is to verify the cache eviction logic handles concurrent inserts correctly. It does not need real AIAgent instances to do that — a lightweight stub that exposes the small subset of attributes the cache code reads (session_id, etc.) suffices and runs in microseconds.

Proposed fix

Replace _real_agent() with a stub class. The cache logic operates on (agent, signature) tuples and walks _running_agents by id(); the stub needs no real init work to satisfy that contract.

The sister tests test_fill_to_cap_then_spillover and test_spillover_all_active_keeps_cache_over_cap use the same pattern and would benefit from the same refactor; consider doing all three in one PR.

Verification

After the refactor, the test should run in well under 1s even under parallel xdist load. Mutation-test mentally: deleting _enforce_agent_cache_cap() should still fail the test (it asserts cache size == CAP after concurrent inserts).

Surfaced by

Fix attempt while addressing the broader hermes-agent CI red set (PRs #1, #3, #4 as of 2026-05-08). After those land this is the last deterministic remaining failure on main.

## Severity: medium ## Symptom `tests/gateway/test_agent_cache.py::TestAgentCacheSpilloverLive::test_concurrent_inserts_settle_at_cap` passes when run in isolation (~30s) but fails under parallel xdist load when scheduled alongside other heavy tests, blocking the per-PR CI gate. ## Root cause The test creates **160 real `AIAgent` instances** (8 threads × 20 inserts) inside a 30s pytest-timeout budget. Each `AIAgent.__init__` does tool registry scan, MCP discovery, config loading, and file I/O. On a runner that's already saturated (the operator host runs at load avg 16-37 on 8 CPUs), the budget is too tight. The test's INTENT is to verify the cache eviction logic handles concurrent inserts correctly. It does not need real `AIAgent` instances to do that — a lightweight stub that exposes the small subset of attributes the cache code reads (`session_id`, etc.) suffices and runs in microseconds. ## Proposed fix Replace `_real_agent()` with a stub class. The cache logic operates on `(agent, signature)` tuples and walks `_running_agents` by `id()`; the stub needs no real init work to satisfy that contract. The sister tests `test_fill_to_cap_then_spillover` and `test_spillover_all_active_keeps_cache_over_cap` use the same pattern and would benefit from the same refactor; consider doing all three in one PR. ## Verification After the refactor, the test should run in well under 1s even under parallel xdist load. Mutation-test mentally: deleting `_enforce_agent_cache_cap()` should still fail the test (it asserts cache size == CAP after concurrent inserts). ## Surfaced by Fix attempt while addressing the broader hermes-agent CI red set (PRs #1, #3, #4 as of 2026-05-08). After those land this is the last deterministic remaining failure on `main`.

Sign in to join this conversation.

No Label

No Milestone

No project

No Assignees

1 Participants

Notifications

Due Date

The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: molecule-ai/hermes-agent#5