fix(a2a): avoid false failure on busy queue fallback #1751
Summary
Live evidence
timeout awaiting response headers at 180002ms, then a2a_queue item 79496d0a-b00b-4e8c-aa55-39fc10a5b18b completed on heartbeat drain
Verification
go test ./internal/handlers -run 'TestHandleA2ADispatchError_BusyEnqueueLogsQueuedNotFailure|TestHandleA2ADispatchError_ContextDeadline|TestA2AClientResponseHeaderTimeout|TestHandleA2ADispatchError_NativeSession_NowEnqueues|TestHandleA2ADispatchError_NoNativeSession_StillEnqueues'
go test ./internal/handlers
5-axis review on 691d341:
Correctness: APPROVED. Busy/timeout dispatches that successfully enqueue now return the existing 202 queued response and record the activity as queued/ok rather than logging a false failure, while real container-dead/enqueue-failed/other errors still log failures. The response-header budget now matches the scheduler's longer first-response window.
Robustness: The new regression test covers the live shape: timeout/busy error, durable enqueue, 202 response, and status=ok activity logging. Existing enqueue failure fallback remains intact.
Security: No auth or secret surface changed.
Performance: Timeout budget increase can hold a connection longer, but aligns with scheduled turn expectations; no hot-loop or N+1 concern.
Readability: The new helper names and comments make the queued-vs-failed distinction clear. Functional CI is green; red statuses are review gates.