Compare commits

1 Commits

Author SHA1 Message Date
fullstack-engineer f105cbfa6a fix(handlers): replace time.Sleep with explicit async drain in 4 tests
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Chat / E2E Chat (pull_request) Blocked by required conditions
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 28s
CI / Detect changes (pull_request) Successful in 1m34s
Harness Replays / detect-changes (pull_request) Successful in 36s
E2E API Smoke Test / detect-changes (pull_request) Successful in 2m4s
E2E Chat / detect-changes (pull_request) Successful in 1m48s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 25s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 2m0s
gate-check-v3 / gate-check (pull_request) Successful in 39s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m41s
qa-review / approved (pull_request) Successful in 55s
security-review / approved (pull_request) Successful in 40s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m31s
sop-tier-check / tier-check (pull_request) Successful in 21s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 9s
CI / Canvas (Next.js) (pull_request) Successful in 19m17s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 22m10s
CI / all-required (pull_request) Successful in 6s
sop-checklist / all-items-acked (pull_request) acked: 7/7 — body-unfilled: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
Issue #1264 — CI/Platform Go tests flake under parallel load.

The 4 tests that were failing used time.Sleep(N) to wait for
goroutines launched by goAsync() to complete before assertions
ran. Under CI parallelism, goroutines from prior tests could
still be writing to the next test's sqlmock, and the fixed
sleep durations (50–200ms) were insufficient under load.

Fix: replace each time.Sleep with handler.waitAsyncForTest(), which
calls h.asyncWG.Wait() and returns only when all goroutines started
by this handler have terminated. This is deterministic regardless
of parallelism or machine speed.
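
The pattern described above can be sketched as follows. This is a minimal illustration, not the repository's code: the names goAsync, waitAsyncForTest and asyncWG come from this commit message, but their bodies are not shown in the diff, so the implementations below are assumptions. The Handler type, the done counter and runOnce are hypothetical and exist only to make the example runnable.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Handler is a hypothetical stand-in for the real handler type.
type Handler struct {
	asyncWG sync.WaitGroup // tracks goroutines started via goAsync
	done    atomic.Int32   // illustration only: counts finished tasks
}

// goAsync runs fn in a goroutine that is registered with asyncWG.
// Assumed shape; the real goAsync is not shown in this diff.
func (h *Handler) goAsync(fn func()) {
	h.asyncWG.Add(1) // register before starting, so Wait cannot miss it
	go func() {
		defer h.asyncWG.Done()
		fn()
	}()
}

// waitAsyncForTest blocks until every goroutine started via goAsync
// on this handler has returned.
func (h *Handler) waitAsyncForTest() {
	h.asyncWG.Wait()
}

// runOnce starts n slow background tasks, drains them, and reports
// how many had finished by the time the drain returned.
func runOnce(n int) int32 {
	h := &Handler{}
	for i := 0; i < n; i++ {
		h.goAsync(func() {
			time.Sleep(20 * time.Millisecond) // simulated slow async work
			h.done.Add(1)
		})
	}
	h.waitAsyncForTest() // no fixed sleep: returns when all tasks are done
	return h.done.Load()
}

func main() {
	fmt.Println(runOnce(3))
}
```

Because the drain waits on the WaitGroup rather than on a wall-clock guess, the count read after waitAsyncForTest always equals the number of tasks started, however slow the machine is.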

Tests changed:
- TestProxyA2A_Upstream502_TriggersContainerDeadCheck
- TestProxyA2A_Upstream502_AliveAgent_PropagatesAsIs
- TestGracefulPreRestart_URLResolutionError
- TestRestartWorkspaceAuto_RoutesToDockerWhenOnlyDocker

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-16 02:36:15 +00:00
3 changed files with 4 additions and 5 deletions
@@ -294,8 +294,7 @@ func TestProxyA2A_Upstream502_TriggersContainerDeadCheck(t *testing.T) {
c.Request.Header.Set("Content-Type", "application/json")
handler.ProxyA2A(c)
-time.Sleep(80 * time.Millisecond)
+handler.waitAsyncForTest()
// Caller sees a structured 503 (NOT the upstream 502 which CF would mask).
if w.Code != http.StatusServiceUnavailable {
@@ -350,7 +349,7 @@ func TestProxyA2A_Upstream502_AliveAgent_PropagatesAsIs(t *testing.T) {
c.Request.Header.Set("Content-Type", "application/json")
handler.ProxyA2A(c)
-time.Sleep(50 * time.Millisecond)
+handler.waitAsyncForTest()
if w.Code != http.StatusBadGateway {
t.Fatalf("alive agent 502 should propagate as 502; got %d: %s", w.Code, w.Body.String())
@@ -273,7 +273,7 @@ func TestGracefulPreRestart_URLResolutionError(t *testing.T) {
}
hWrapper.gracefulPreRestart(context.Background(), "ws-url-err-111")
-time.Sleep(200 * time.Millisecond)
+hWrapper.waitAsyncForTest()
// No panic or error expected — proceeds with stop as documented
}
@@ -686,7 +686,7 @@ func TestRestartWorkspaceAuto_RoutesToDockerWhenOnlyDocker(t *testing.T) {
// recovered by logProvisionPanic. Without this wait, the goroutine
// outlives the test and writes to a sqlmock that the NEXT test
// owns, causing a `was not expected` race.
-time.Sleep(200 * time.Millisecond)
+h.waitAsyncForTest()
// Stop call is synchronous on the Docker leg.
if len(stub.stopped) == 0 || stub.stopped[0] != wsID {