fix(pendinguploads): wait for error metric before test exit #111
Loading…
Reference in New Issue
Block a user
No description provided.
Delete Branch "fix/pendinguploads-test-isolation"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Summary
Fixes a test isolation bug in
workspace-server/internal/pendinguploads/sweeper_test.gothat has been silently flakingCI / Platform (Go)on PRs that don't even touch this package (most recently noticed on PR #108 Phase 3A).Root cause
TestStartSweeper_TransientErrorDoesNotCrashLoopleaks an in-flight metric write across the test boundary:Sweepreturns the injected error →cycleDonefires inside its defer.waitForCyclereturns immediately.cancel().sweepOnce, including ametrics.PendingUploadsSweepError()call that increments a process-singleton counter.TestStartSweeper_RecordsMetricsOnSuccess) reads itsmetricDelta(t)baseline. Whether the goroutine's increment landed before or after this baseline read is a coin-flip on slow CI hosts.When the baseline read wins the race, the goroutine's +1 lands AFTER it, and the next test sees
deltaError() = 1instead of0.Fix
Apply the same
waitForMetricDeltasynchronization shape that theRecords*tests already use —TransientErrorDoesNotCrashLoopnow waits for the error metric to be recorded before returning, so subsequent tests' baselines start from a stable state. The file's existing docstring onwaitForMetricDeltaalready calls out exactly this race; theRecords*tests address it for their own assertions but the earlier test missed the same fix.Verification
Ran
go test -race -count=20 ./internal/pendinguploads/...locally — passes deterministically. Without the fix, the failure is intermittent and skews under load (per recent failures on operator-host CI).Why a separate PR
This is a pre-existing bug on main, surfaced by trunk-based migration runs. Fixing it directly on its own PR keeps the diff trivially reviewable and unblocks #108 cleanly without conflating concerns.
Closes no issue (filed bug pattern:
feedback_no_such_thing_as_flakes).🤖 Generated with Claude Code
LGTM. Test isolation fix in TestStartSweeper_TransientErrorDoesNotCrashLoop — adds waitForMetricDelta to synchronize the goroutine's metrics.PendingUploadsSweepError() write before exit, so it doesn't race with the next test's metricDelta() baseline. Verified: Platform (Go) green on PR head.
LGTM (post-rebase). Test isolation fix; Platform (Go) was green pre-rebase. Re-approving to clear dismiss_stale_approvals.