molecule-core/workspace-server/cmd/server
Hongming Wang 235aca9908 fix(boot): always start health-sweep goroutine — SaaS tenants need it for external-runtime liveness
Pre-fix, cmd/server/main.go gated the entire health-sweep goroutine on
`prov != nil`. On SaaS tenants (`MOLECULE_ORG_ID` set) the local Docker
provisioner is never initialized — only `cpProv`. So the goroutine
never started, and `sweepStaleRemoteWorkspaces` (which transitions
runtime='external' workspaces from 'online' to 'awaiting_agent' when
their last_heartbeat_at goes stale) never ran.

Net effect on production: every external-runtime workspace on SaaS
that lost its agent stayed 'online' indefinitely instead of falling
back to 'awaiting_agent' (re-registrable). The drift gate (#2388)
caught the migration side and #2382 fixed the SQL writes, but this
orchestration-side gate slipped through both because there was no
SaaS-mode E2E coverage on the heartbeat-loss → awaiting_agent
transition.

Caught by #2392 (live staging external-runtime regression E2E)
failing at step 6 — 180s with no heartbeat, expected
status=awaiting_agent, got online.

Fix: drop the `if prov != nil` gate. `StartHealthSweep` already
handles nil checker correctly (healthsweep.go:50-71): the Docker
sweep is gated inside the loop, the remote sweep always runs. Test
coverage already exists at TestStartHealthSweep_NilCheckerRunsRemoteSweep.

After this lands and tenants redeploy, #2392 step 6 passes and the
regression coverage closes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 12:05:40 -07:00
..
cp_config_test.go feat(ws-server): pull env from CP on startup 2026-04-19 02:41:15 -07:00
cp_config.go fix(go): replace $1 literal with resp.Body.Close() in 7 files (#1247) 2026-04-21 03:18:21 +00:00
dotenv_test.go fix(dotenv): empty value with inline comment was returning the comment 2026-04-24 21:17:21 -07:00
dotenv.go fix(canvas,dotenv): review-driven hardening of fit gate + parser parity 2026-04-24 22:23:51 -07:00
main.go fix(boot): always start health-sweep goroutine — SaaS tenants need it for external-runtime liveness 2026-04-30 12:05:40 -07:00