molecule-core

Hongming Wang c778b62202 feat(metrics): add molecule_phantom_busy_resets_total counter (#2865)		2026-05-05 04:45:24 -07:00

Closes #2865 (split-B of the #2669 root-cause stack).

The phantom-busy sweep in workspace-server/internal/scheduler/scheduler.go already logs each row reset, but no aggregate metric surfaces "how often is this firing." A regression that causes high reset rates (e.g. controlplane#481's missing env vars, or future drift in the workspace runtime's task-lifecycle accounting) only surfaces when users complain.

Fix: counter exposed at /metrics as molecule_phantom_busy_resets_total, incremented from sweepPhantomBusy after each row whose active_tasks was reset. Same shape as existing molecule_websocket_connections_active.

Operator-side dashboard: alert when daily phantom-busy reset count > 0.5% of active workspaces. Today's steady-state is near-zero; any increase is a regression signal.

Tests:
- TestTrackPhantomBusyReset_IncrementsCounter
- TestTrackPhantomBusyReset_RaceFreeUnderConcurrentWrites (50×200 concurrent writes; tests atomic invariant)
- TestHandler_ExposesPhantomBusyResetsCounter (asserts HELP + TYPE + value lines in Prometheus text format)
- TestHandler_PhantomBusyResetsZeroByDefault (fresh-process 0 contract; prevents a future refactor from accidentally dropping the metric from /metrics)

Race-detector clean. Vet clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
native_scheduler_test.go	feat(runtime): native_scheduler skip — primitive #3 of 6	2026-04-26 22:47:00 -07:00
scheduler_test.go	fix(scheduler): prevent wedge on invalid UTF-8 + unbounded DB ops (#2026)	2026-04-24 11:00:47 -07:00
scheduler.go	feat(metrics): add molecule_phantom_busy_resets_total counter (#2865)	2026-05-05 04:45:24 -07:00
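
The counter described in the latest commit message could look roughly like the sketch below. This is not the code in scheduler.go: it is a minimal illustration assuming an atomic in-process counter rendered by hand in Prometheus text format. The helper names (trackPhantomBusyReset, renderPhantomBusyResets, metricsHandler) and the HELP wording are hypothetical; only the metric name molecule_phantom_busy_resets_total comes from the commit.

```go
package main

import (
	"fmt"
	"net/http"
	"sync/atomic"
)

// phantomBusyResets counts rows whose active_tasks was reset by the sweep.
// atomic.Int64 (Go 1.19+) keeps concurrent increments race-free without a mutex.
var phantomBusyResets atomic.Int64

// trackPhantomBusyReset is a hypothetical helper: the sweep would call it
// once for each row it resets.
func trackPhantomBusyReset() {
	phantomBusyResets.Add(1)
}

// renderPhantomBusyResets returns the HELP, TYPE, and value lines for the
// counter in Prometheus text exposition format. The HELP text is assumed.
func renderPhantomBusyResets() string {
	return fmt.Sprintf(
		"# HELP molecule_phantom_busy_resets_total Rows whose active_tasks was reset by the phantom-busy sweep.\n"+
			"# TYPE molecule_phantom_busy_resets_total counter\n"+
			"molecule_phantom_busy_resets_total %d\n",
		phantomBusyResets.Load(),
	)
}

// metricsHandler writes the counter to the response. A fresh process
// reports 0, so the metric is always present in the output.
func metricsHandler(w http.ResponseWriter, _ *http.Request) {
	w.Header().Set("Content-Type", "text/plain; version=0.0.4; charset=utf-8")
	fmt.Fprint(w, renderPhantomBusyResets())
}
```

To try it, register the handler with http.HandleFunc("/metrics", metricsHandler) and start a server with http.ListenAndServe. Emitting the value line even at zero lets a dashboard tell "no resets" apart from "metric missing".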