molecule-core/workspace-server/internal
Hongming Wang c79ba05ed5 test(pendinguploads): close cycleDone-vs-metric-record race in sweeper tests
TestStartSweeper_RecordsMetricsOnError flaked on every CI rerun under
race detection: `error counter delta = 0, want 1`. Root cause is a
race between two goroutines, not a bug in the production sweeper.

The fake `fakeSweepStorage.Sweep` signals `cycleDone` from inside its
deferred return — that happens BEFORE Sweep's return value is
received by `sweepOnce`, which is what triggers the metric increment.
On slow CI hosts the test goroutine wins the read after `waitForCycle`
unblocks and BEFORE StartSweeper's goroutine has called
`metrics.PendingUploadsSweepError`, so the asserted delta is 0 even
though the metric WILL be 1 a few ms later.

Adds a polling assert helper, `waitForMetricDelta`, that closes the
race deterministically without timing-based sleeps:

- TestStartSweeper_RecordsMetricsOnError uses waitForMetricDelta to
  wait for the error counter to settle at 1.
- TestStartSweeper_RecordsMetricsOnSuccess uses it on the success
  counters (acked, expired) so the error-stayed-zero assertion
  reads after StartSweeper has fully processed the cycle.
- waitForCycle keeps its current shape but documents the caveat in
  its comment so future tests don't repeat the assumption.

Verified: `go test ./internal/pendinguploads/ -race -count 5` passes
all 9 tests across 5 iterations cleanly.

Per memory feedback_question_test_when_unexpected.md: the
"delta=0, want=1" failure looked like a real production bug at first
glance, but instrumented inspection showed the metric DOES increment,
just AFTER the test's read. The fix is the test's wait shape, not
the sweeper.

Unblocks every PR currently broken by this flake (#2898 hit it on
two consecutive CI runs; staging-merged PRs from earlier today
(#2877/#2881/#2885/#2886) introduced the test).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 10:46:17 -07:00
..
artifacts chore: sync staging to main — 1188 commits, 5 conflicts resolved (#1743) 2026-04-23 18:30:18 +00:00
buildinfo feat(deploy): verify each tenant /buildinfo matches published SHA after redeploy 2026-04-30 10:55:08 -07:00
bundle fix(bundle): markFailed sets last_sample_error + AST gate 2026-05-04 21:08:08 -07:00
channels feat(channels): first-class Lark/Feishu support via schema-driven config 2026-04-24 11:51:15 -07:00
crypto chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
db fix(bundle): markFailed sets last_sample_error + AST gate 2026-05-04 21:08:08 -07:00
envx chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
events test(handlers): introduce events.EventEmitter interface (#1814 partial) 2026-04-26 09:05:52 -07:00
handlers feat(saas): default new workspaces to T4 on SaaS, T3 self-hosted 2026-05-05 10:30:22 -07:00
imagewatch feat(workspace-server): GHCR digest watcher closes runtime CD chain (#2114) 2026-04-26 13:36:26 -07:00
memory fix(memory v2): warn at boot when cutover env half-configured 2026-05-04 17:24:11 -07:00
metrics feat(rfc): poll-mode chat upload — phase 3 GC sweep + observability 2026-05-05 05:00:13 -07:00
middleware fix(tenant-guard): allowlist /buildinfo so redeploy verifier can reach it 2026-04-30 12:54:51 -07:00
models refactor(models): consolidate per-runtime model defaults to SSOT (RFC #2873 iter 1) 2026-05-05 04:12:37 -07:00
orgtoken fix: F1085 rm scope concat + GH#756 ValidateToken terminal guard + CI test fixes 2026-04-24 07:16:54 +00:00
pendinguploads test(pendinguploads): close cycleDone-vs-metric-record race in sweeper tests 2026-05-05 10:46:17 -07:00
plugins chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
provisioner feat(provisioner): digest-pin workspace images via runtime_image_pins (#2272 layer 1) 2026-05-03 02:30:00 -07:00
registry fix(orphan-sweeper): exclude runtime='external' from stale-token revoke 2026-05-03 00:49:37 -07:00
router feat(rfc): poll-mode chat upload — phase 1 platform staging layer 2026-05-05 04:22:24 -07:00
scheduler feat(metrics): add molecule_phantom_busy_resets_total counter (#2865) 2026-05-05 04:45:24 -07:00
supervised chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
ws chore: open-source restructure — rename dirs, remove internal files, scrub secrets 2026-04-18 00:24:44 -07:00
wsauth perf(wsauth): in-process cache for platform_inbound_secret reads 2026-05-03 00:04:38 -07:00