test(approval): isolate test_blocking_approval_* from parallel-load contamination #6

Open
opened 2026-05-08 04:04:01 +00:00 by claude-ceo-assistant · 0 comments

Severity: low (depends on the agent_cache refactor in the sibling issue)

Symptom

tests/gateway/test_approve_deny_commands.py::TestBlockingApprovalE2E::test_blocking_approval_approve_once (and test_blocking_approval_deny) pass when run alone but fail intermittently when xdist schedules them alongside test_concurrent_inserts_settle_at_cap on the same runner.

Failure shape: the agent thread's notify_cb callback never fires within the 2.5s wait window (assert len(notified) == 1 failing with 0).

Root cause hypothesis

The test holds the full test run hostage to host-wide CPU pressure. When test_concurrent_inserts_settle_at_cap is creating 160 real AIAgent instances on adjacent xdist workers, the agent thread in this test is starved enough that it doesn't reach the gateway-approval path before the main thread's polling deadline expires.

If so, fixing the sibling issue (stub-agent refactor for test_concurrent_inserts_settle_at_cap) should also fix this one, because the load source goes away.

Plan

  1. Wait until the sibling stub-agent refactor lands.
  2. Re-run this test under parallel xdist load to confirm pass.
  3. If still red, instrument with TRACE prints at each early-return path of tools.approval.check_all_command_guards to identify which branch swallows the call before notify fires.

Out of scope

Deeper investigation now would be guesswork; the load hypothesis is concrete and easily falsifiable once the sibling fix lands.

## Severity: low (depends on the agent_cache refactor in the sibling issue) ## Symptom `tests/gateway/test_approve_deny_commands.py::TestBlockingApprovalE2E::test_blocking_approval_approve_once` (and `test_blocking_approval_deny`) pass when run alone but fail intermittently when xdist schedules them alongside `test_concurrent_inserts_settle_at_cap` on the same runner. Failure shape: the agent thread's `notify_cb` callback never fires within the 2.5s wait window (`assert len(notified) == 1` failing with `0`). ## Root cause hypothesis The test holds the full test run hostage to host-wide CPU pressure. When `test_concurrent_inserts_settle_at_cap` is creating 160 real `AIAgent` instances on adjacent xdist workers, the agent thread in this test is starved enough that it doesn't reach the gateway-approval path before the main thread's polling deadline expires. If so, fixing the sibling issue (stub-agent refactor for `test_concurrent_inserts_settle_at_cap`) should also fix this one, because the load source goes away. ## Plan 1. Wait until the sibling stub-agent refactor lands. 2. Re-run this test under parallel xdist load to confirm pass. 3. If still red, instrument with TRACE prints at each early-return path of `tools.approval.check_all_command_guards` to identify which branch swallows the call before notify fires. ## Out of scope Deeper investigation now would be guesswork; the load hypothesis is concrete and easily falsifiable once the sibling fix lands.
Sign in to join this conversation.
No Label
No Milestone
No project
No Assignees
1 Participants
Notifications
Due Date
The due date is invalid or out of range. Please use the format 'yyyy-mm-dd'.

No due date set.

Dependencies

No dependencies set.

Reference: molecule-ai/hermes-agent#6
No description provided.