fix(core#2834): make TestRecoverPanic -race-clean (mutex buffer + correct LIFO order) #2835
Reference in New Issue
Block a user
Delete Branch "fix/2834-test-recover-panic-race"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
core#2834 — TestRecoverPanic_RecoversFromPanicInGoroutine -race flake
The test had a concurrent read/write on a shared bytes.Buffer that the -race detector flagged intermittently on the repo-wide
go test -racestep (advisory, non-blocking — but reds the -race lane on every Go PR).Two compounding root causes
WRONG LIFO DEFER ORDER. The defer block was:
LIFO runs
close(done)FIRST, thenrecoverPanic— meaning the test's<-doneunblock happens BEFORErecoverPanic'slog.Printfwrites to the buffer. The test then readsbuf.String()while the goroutine is still writing. The test comment explicitly described the correct order (anydefer close(done)placed BEFORE the recoverPanic defer fires) but the code didn't match the comment. Fix: swap the defer order soclose(done)is registered first,recoverPaniclast — LIFO now runs the log write first, then signals done.NO CROSS-GOROUTINE SYNCHRONIZATION ON THE BUFFER. The standard
logpackage holds its own internal mutex around calls toio.Writer.Write, but that mutex only serializes concurrent WRITERS — it does NOT synchronize with the test'sbuf.String()READ from a separate goroutine. Even with the LIFO fix, the race detector needs an explicit happens-before edge between the Write and the String() read. Fix: wrapbytes.Bufferin asyncBuffertype with async.Mutexguarding bothWriteandString. Defense in depth.Test surface
TEST-ONLY change (no production code).
go test ./internal/middleware/passes. No gcc/cgo available in this sandbox forgo test -racelocal-run, but the LIFO + mutex combination is race-free by construction.Files
workspace-server/internal/middleware/panic_recovery_test.go— addedsyncimport, addedsyncBuffertype, fixed defer order in both TestRecoverPanic tests, replacedbytes.BufferwithsyncBuffer.Followups
Closes core#2834. Re-requesting CR2 + Researcher review for the 2-genuine SOP.
The TestRecoverPanic_RecoversFromPanicInGoroutine test had a concurrent read/write on a shared bytes.Buffer that the -race detector flagged intermittently on the repo-wide `go test -race` step (advisory, non-blocking — but reds the -race lane on every Go PR). Two compounding root causes: 1. WRONG LIFO DEFER ORDER. The defer block was: defer recoverPanic("test_goroutine") defer close(done) LIFO runs close(done) FIRST, then recoverPanic — meaning the test's <-done unblock happens BEFORE recoverPanic's log.Printf writes to the buffer. The test then reads buf.String() while the goroutine is still writing. The test comment explicitly described the correct order ("any defer close(done) placed BEFORE the recoverPanic defer fires") but the code didn't match the comment. Fix: swap the defer order so close(done) is registered first, recoverPanic last — LIFO now runs the log write first, then signals done. 2. NO CROSS-GOROUTINE SYNCHRONIZATION ON THE BUFFER. The standard log package holds its own internal mutex around calls to io.Writer.Write, but that mutex only serializes concurrent WRITERS — it does NOT synchronize with the test's buf.String() READ from a separate goroutine. Even with the LIFO fix, the race detector needs an explicit happens-before edge between the Write and the String() read. Fix: wrap bytes.Buffer in a syncBuffer type with a sync.Mutex guarding both Write and String. Defense in depth — if a future change ever moves the defers back to the wrong order, the mutex still serializes the access. TestVerifyPanic_NoopOnNormalReturn got the same treatment (same syncBuffer, same corrected defer order) for uniformity and future-proofing — its normal-return path is no-op so it didn't race in practice, but keeping the pattern uniform means a future test author doesn't have to re-derive the safe pattern. Test surface: TEST-ONLY change (no production code). go test ./internal/middleware/ passes (no -race in this sandbox — no gcc/cgo — but the LIFO + mutex combination is race-free by construction).APPROVED on
b903926a.Verified Platform Go job 497007 is a real matching-head run and completed successfully. Its advisory
go test -race -timeout 10m ./...step completed success; the job log contains noWARNING: DATA RACEand noTestRecoverPanicfailure.Diff review: the defer order now registers
close(done)first andrecoverPanic(...)last, so LIFO runs panic recovery/logging before the test goroutine signals completion. The newsyncBuffermutex guards bothWriteandString, providing the required happens-before edge between log writes and assertions. The tests still assert the intended panic recovery/no-op behavior, and the change is test-only.