fix(test): de-flake schedules cron UPDATE assertion (Handlers Postgres Integration BP-required red) #3157
Reference in New Issue
Block a user
Delete Branch "fix/schedules-cron-next-run-wallclock-race"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Red being fixed
Handlers Postgres Integration / Handlers Postgres Integration(BP-required) = FAILURE on main HEADadc2e9ff(04:46Z) — a genuine current failure, not stale (siblingCI / all-requiredwent green at 04:49 but this required context stayed red).Named mechanism (no flaky-dismiss)
Single failing test:
TestIntegration_Schedules_CRUDRunHistoryHealth_RoundTrip, line 211:The assertion
newNextRun.After(origNextRun)assumed the recomputed 5am next-run is always strictly later than the 3am next-run. But0 3 * * *and0 5 * * *are daily crons whose next occurrence wraps every 24h. When the test runs in the 03:00-05:00 UTC window, the 3am schedule has already rolled to tomorrow while the 5am is still today — so the 5am next-run is earlier and the ordering inverts.Proven deterministically against
scheduler.ComputeNextRun: the old assertion fails at exactly UTC hours 3 and 4 (the 04:46Z run window) and passes the other 22 hours. This is a time-of-day race — not testcontainers/Postgres infra, not migration-048'slocalconstraint.Fix
Replace the brittle ordering check with the wall-clock-independent invariants that always hold after a cron UPDATE: next_run_at (a) actually changed, and (b) lands at exactly 05:00:00 UTC, proving the new cron was applied. Preserves the test's intent without the daily-wrap bug.
Verification
go vet ./internal/handlers/clean.INTEGRATION_DB_URL(real Postgres, booted in CI) — will run green in the Handlers Postgres Integration job.Generated with Claude Code
APPROVE — Code-review axes (correctness / clarity / scope).
Verified the named mechanism:
0 3 * * *vs0 5 * * *are daily crons; their next-occurrence wraps every 24h, so in the 03:00-05:00 UTC window the 3am next-run is tomorrow while 5am is today, invertingnew.After(orig). The 04:46Z failure falls squarely in that window.The replacement is correct and altitude-appropriate:
newNextRun.Equal(origNextRun)proves recompute happened (3am != 5am always).Hour==5 && Minute==0 && Second==0 (UTC)proves the new cron was applied — robfig/cron minute-parser yields exact minute boundaries, and the schedule's timezone is UTC (CREATE set it; this PATCH only changes cron_expr), so 05:00:00 UTC is exact.Deterministic — no residual wall-clock dependence. Intent ("UPDATE recomputes next_run_at from the new cron") preserved.
APPROVE — Security review.
No security surface: change is confined to a single assertion in an integration test (
schedules_integration_test.go). No production code, no auth/IDOR/input-handling paths, no secrets, no migrations, no dependency changes. The IDOR (Case 6) and invalid-timezone-rejection (Case 5) coverage in the same test is untouched and still gates the handler. No new attack surface or test-coverage regression. Fix correctly removes a wall-clock-dependent false-red without weakening what the test asserts.5-axis current-head review clean. This is test-only in schedules_integration_test.go. The old assertion compared daily cron next-run ordering and failed during the 03:00-05:00 UTC wrap window. The replacement checks wall-clock-independent invariants: next_run_at is recomputed and the new value lands at exactly 05:00:00 UTC for cron '0 5 * * *'. No production logic, auth, or performance surface changed.