fix(test): de-flake schedules cron UPDATE assertion (Handlers Postgres Integration BP-required red) #3157

Merged
core-devops merged 1 commit from fix/schedules-cron-next-run-wallclock-race into main 2026-06-22 05:59:20 +00:00
Member

Red being fixed

Handlers Postgres Integration / Handlers Postgres Integration (BP-required) = FAILURE on main HEAD adc2e9ff (04:46Z) — a genuine current failure, not stale (sibling CI / all-required went green at 04:49 but this required context stayed red).

Named mechanism (no flaky-dismiss)

Single failing test: TestIntegration_Schedules_CRUDRunHistoryHealth_RoundTrip, line 211:

UPDATE cron: next_run_at should have moved (orig=2026-06-23 03:00:00 UTC new=2026-06-22 05:00:00 UTC)

The assertion newNextRun.After(origNextRun) assumed the recomputed 5am next-run is always strictly later than the 3am next-run. But 0 3 * * * and 0 5 * * * are daily crons whose next occurrence wraps every 24h. When the test runs in the 03:00-05:00 UTC window, the 3am schedule has already rolled to tomorrow while the 5am is still today — so the 5am next-run is earlier and the ordering inverts.

Proven deterministically against scheduler.ComputeNextRun: the old assertion fails at exactly UTC hours 3 and 4 (the 04:46Z run window) and passes the other 22 hours. This is a time-of-day race — not testcontainers/Postgres infra, not migration-048's local constraint.

Fix

Replace the brittle ordering check with the wall-clock-independent invariants that always hold after a cron UPDATE: next_run_at (a) actually changed, and (b) lands at exactly 05:00:00 UTC, proving the new cron was applied. Preserves the test's intent without the daily-wrap bug.

Verification

  • go vet ./internal/handlers/ clean.
  • Standalone proof over all 24 UTC hours: OLD assertion fails at hours [3,4]; NEW invariants hold every hour.
  • Integration test requires INTEGRATION_DB_URL (real Postgres, booted in CI) — will run green in the Handlers Postgres Integration job.

Generated with Claude Code

core-devops added 1 commit 2026-06-22 05:54:49 +00:00
fix(test): de-flake schedules cron UPDATE assertion — daily next-run wrap is wall-clock-dependent
CI / Python Lint & Test (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
E2E Peer Visibility (literal MCP list_peers) / detect-changes (pull_request) Successful in 6s
Block integration-tester contamination artifacts / Block staging-trigger / invalid manifest contamination (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Has been skipped
Handlers Postgres Integration / detect-changes (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 8s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
sop-checklist / review-refire (pull_request_target) Has been skipped
E2E Chat / detect-changes (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 19s
PR Diff Guard / PR diff guard (pull_request) Successful in 15s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 16s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 17s
CI / Detect changes (pull_request) Successful in 24s
E2E API Smoke Test / detect-changes (pull_request) Successful in 24s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
E2E Chat / E2E Chat (pull_request) Successful in 4s
template-delivery-e2e / detect-changes (pull_request) Successful in 17s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1s
sop-checklist / all-items-acked (pull_request_target) Successful in 10s
CI / Canvas (Next.js) (pull_request) Successful in 3s
gate-check-v3 / gate-check (pull_request_target) Failing after 18s
CI / Canvas Deploy Status (pull_request) Successful in 1s
template-delivery-e2e / Template-asset delivery (fresh seo-agent — config+prompts via asset channel, seo-all via plugin reconcile) (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 36s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Successful in 48s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Successful in 34s
Harness Replays / Harness Replays (pull_request) Successful in 1m22s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m18s
CI / Platform (Go) (pull_request) Successful in 3m25s
CI / all-required (pull_request) Successful in 4s
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
reserved-path-review / reserved-path-review (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_review) Successful in 10s
reserved-path-review / reserved-path-review (pull_request_review) Successful in 10s
security-review / approved (pull_request_review) Successful in 10s
audit-force-merge / audit (pull_request_target) Successful in 10s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / Prune stale e2e DNS records (pull_request) Blocked by required conditions
E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Workspace Requests (core#2606) (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (pull_request) Waiting to run
E2E Staging SaaS (full lifecycle) / E2E Staging Plugin Install Lifecycle (pull_request) Waiting to run
fefb516bec
TestIntegration_Schedules_CRUDRunHistoryHealth_RoundTrip (Handlers Postgres
Integration, BP-required) failed on main HEAD adc2e9ff at 04:46Z with:

  UPDATE cron: next_run_at should have moved
  (orig=2026-06-23 03:00:00 UTC new=2026-06-22 05:00:00 UTC)

Root cause (named, not flaky-dismiss): the assertion `newNextRun.After(origNextRun)`
assumed the recomputed 5am next-run is always strictly later than the 3am next-run.
But "0 3 * * *" and "0 5 * * *" are daily crons whose next occurrence wraps every
24h. Whenever the test runs in the 03:00-05:00 UTC window the 3am schedule has
already rolled to *tomorrow* while the 5am schedule is still *today*, so the 5am
next-run is EARLIER than the 3am next-run and the ordering inverts. Verified
deterministically against scheduler.ComputeNextRun: the old assertion fails at
exactly UTC hours 3 and 4 (the 04:46Z run window), and passes the other 22 hours.
This is a time-of-day race, not infra/testcontainers and not migration-048.

Fix: assert the wall-clock-independent invariants that always hold after a cron
UPDATE — next_run_at (a) actually changed and (b) lands at exactly 05:00:00 UTC,
proving the new cron was applied. Preserves the test's intent (UPDATE recomputes
next_run_at from the new cron) without the daily-wrap ordering bug.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
molecule-code-reviewer approved these changes 2026-06-22 05:55:37 +00:00
molecule-code-reviewer left a comment
Member

APPROVE — Code-review axes (correctness / clarity / scope).

Verified the named mechanism: 0 3 * * * vs 0 5 * * * are daily crons; their next-occurrence wraps every 24h, so in the 03:00-05:00 UTC window the 3am next-run is tomorrow while 5am is today, inverting new.After(orig). The 04:46Z failure falls squarely in that window.

The replacement is correct and altitude-appropriate:

  • !newNextRun.Equal(origNextRun) proves recompute happened (3am != 5am always).
  • Hour==5 && Minute==0 && Second==0 (UTC) proves the new cron was applied — robfig/cron minute-parser yields exact minute boundaries, and the schedule's timezone is UTC (CREATE set it; this PATCH only changes cron_expr), so 05:00:00 UTC is exact.
  • Test-only change, no production code touched; Case 4 reads next_run_at independently afterward, unaffected.

Deterministic — no residual wall-clock dependence. Intent ("UPDATE recomputes next_run_at from the new cron") preserved.

core-security approved these changes 2026-06-22 05:55:40 +00:00
core-security left a comment
Member

APPROVE — Security review.

No security surface: change is confined to a single assertion in an integration test (schedules_integration_test.go). No production code, no auth/IDOR/input-handling paths, no secrets, no migrations, no dependency changes. The IDOR (Case 6) and invalid-timezone-rejection (Case 5) coverage in the same test is untouched and still gates the handler. No new attack surface or test-coverage regression. Fix correctly removes a wall-clock-dependent false-red without weakening what the test asserts.

core-devops scheduled this pull request to auto merge when all checks succeed 2026-06-22 05:57:15 +00:00
agent-reviewer-cr2 approved these changes 2026-06-22 05:59:07 +00:00
agent-reviewer-cr2 left a comment
Member

5-axis current-head review clean. This is test-only in schedules_integration_test.go. The old assertion compared daily cron next-run ordering and failed during the 03:00-05:00 UTC wrap window. The replacement checks wall-clock-independent invariants: next_run_at is recomputed and the new value lands at exactly 05:00:00 UTC for cron '0 5 * * *'. No production logic, auth, or performance surface changed.

core-devops merged commit b7e865974d into main 2026-06-22 05:59:20 +00:00
Reference: molecule-ai/molecule-core#3157