[main-red] E2E API Smoke health wait times out while platform migrations are still running #2205

Closed
opened 2026-06-04 04:30:07 +00:00 by molecule-code-reviewer · 0 comments

MECHANISM: molecule-core main `0ad52852fd38` fails `E2E API Smoke Test / E2E API Smoke Test` because `.gitea/workflows/e2e-api.yml:330-342` gives platform-server only 30 one-second health probes after it is started in the background. On this run, platform startup is still applying the full migration chain during that window; the log reaches migrations through `20260523000000_schedule_consecutive_sdk_errors.up.sql` and only then prints `Platform starting on :45089`. The health loop expires before `/health` becomes reachable, so downstream E2E assertions never run.

EVIDENCE: job `274520` (`run 206111`, head `0ad52852fd387a107eefc1d25348a3fb707cd8f2`) failed at `2026-06-04T04:27:31Z`. Log excerpt: `Platform did not become healthy in 30s`. The dumped `workspace-server/platform.log` shows migrations still applying at `04:26:15`-`04:26:26`, then `Platform starting on :45089` immediately after the health failure block. Workflow line `.gitea/workflows/e2e-api.yml:333` hard-codes `seq 1 30`.

RECOMMENDED FIX SHAPE: responsible file is `molecule-core/.gitea/workflows/e2e-api.yml`. Increase the `/health` wait, or make it adaptive, so that it covers the real migration replay path. Ideally the step polls until platform startup or migration completion within the job timeout, and surfaces the last migration/log tail on failure. Do not weaken the E2E API gate; make the readiness window match the current migration-chain startup cost.
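One possible shape for that wait is sketched below as a deadline-based loop that dumps the log tail on failure. This is a minimal sketch, not the contents of the real workflow step: the function name `wait_for_health`, the 180-second default, the 40-line tail, and the `HEALTH_PROBE` override hook are all assumptions introduced for illustration. Only the log path and the failure message are taken from the evidence above.

```shell
#!/usr/bin/env bash
# Hypothetical helper; names and defaults are assumptions, not the real workflow.

# wait_for_health URL [TIMEOUT_SECONDS] [LOG_FILE]
# Polls URL about once per second until it answers or the deadline passes.
wait_for_health() {
  local url="$1"
  local timeout="${2:-180}"                              # assumed default
  local log_file="${3:-workspace-server/platform.log}"   # path from the issue
  local start deadline
  start=$(date +%s)
  deadline=$(( start + timeout ))

  while [ "$(date +%s)" -lt "$deadline" ]; do
    # HEALTH_PROBE is a hypothetical hook so the loop can be exercised
    # without a running server; by default the probe is a short curl call.
    if ${HEALTH_PROBE:-curl -fsS -o /dev/null --max-time 2} "$url"; then
      echo "Platform healthy after $(( $(date +%s) - start ))s"
      return 0
    fi
    sleep 1
  done

  # On failure, report the timeout and surface the end of the platform log
  # so the last applied migration is visible in the job output.
  echo "Platform did not become healthy in ${timeout}s" >&2
  if [ -f "$log_file" ]; then
    echo "--- last 40 lines of ${log_file} ---" >&2
    tail -n 40 "$log_file" >&2
  fi
  return 1
}
```

In the workflow step this might be called as `wait_for_health "http://localhost:${PORT}/health" 180 workspace-server/platform.log`, where `PORT` is a placeholder for however the step already knows the platform port. A wall-clock deadline rather than a fixed probe count keeps the window predictable even when individual probes are slow, and the gate itself stays strict: the step still fails if `/health` never responds.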

Reference: molecule-ai/molecule-core#2205