molecule-core/platform
rabbitblood b00f478b6e fix(scheduler): independent heartbeat pulse so liveness doesn't false-stale during long fires (#140)
The #95 scheduler heartbeat scheme relied on heartbeats at two points:
1. Top of tick() (once per poll interval)
2. Per-fire goroutine entry + exit

That leaves a gap: tick() ends with wg.Wait(), so if a single fire takes
longer than pollInterval (UIUX audits routinely take 60-120s; max fireTimeout
is 5min), the next tick cannot start and no top-of-tick heartbeat fires.
Per-fire heartbeats only bracket the fire: between goroutine entry and the
HTTP response returning, nothing heartbeats either.

Observed today: /admin/liveness reports seconds_ago=251 while docker logs
show the scheduler actively firing 'Hourly ecosystem watch'. Scheduler is
fine; liveness is lying.

Adds an independent 10s heartbeat pulse goroutine inside Start(), decoupled
from tick completion. The existing heartbeats at the top of tick() and per
fire are kept as redundant signals, but this pulse is the one that guarantees
liveness freshness regardless of what tick() is doing.

Ships the exact fix proposed in the body of #140.

Closes #140.
2026-04-15 03:13:41 -07:00
cmd fix(platform): panic-recovering supervisor for every background goroutine (#92) 2026-04-14 20:34:18 -07:00
internal fix(scheduler): independent heartbeat pulse so liveness doesn't false-stale during long fires (#140) 2026-04-15 03:13:41 -07:00
migrations fix(schedules): backfill legacy rows to 'template' + extract import SQL const 2026-04-14 14:30:22 -07:00
Dockerfile initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00
go.mod initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00
go.sum initial commit — Molecule AI platform 2026-04-13 11:55:37 -07:00