molecule-core


Hongming Wang 4915d1d59e	2026-04-26 14:33:41 -07:00

fix(orphan-sweeper): reap labeled containers with no DB row (wiped-DB)

The existing sweeper only reaps ws-* containers whose workspace row has status='removed'. That misses the entire wiped-DB case: an operator does `docker compose down -v` (kills the postgres volume), the previous platform's ws-* containers keep running, and the new platform boots into an empty workspaces table. The first pass finds zero candidates and those containers leak forever.

Symptom users hit today: 7 ws-* containers from 11h ago, no rows in DB, no visibility in Canvas, eating CPU + memory.

Fix shape:

1. Provisioner stamps every ws-* container + volume with `molecule.platform.managed=true`. Without a label, the sweeper would have to assume any unlabeled ws-* container might belong to a sibling platform stack on a shared Docker daemon.
2. Provisioner exposes ListManagedContainerIDPrefixes, a label-filter counterpart to the existing name-filter.
3. Sweeper splits sweepOnce into two independent passes:
   - sweepRemovedRows (unchanged behavior; status='removed' only)
   - sweepLabeledOrphansWithoutRows (new; labeled containers whose workspace_id has no row in the table at all)

   Each pass has its own short-circuit so an empty result or transient error in one doesn't block the other. This is load-bearing because the wiped-DB pass exists precisely for cases where the removed-row pass finds nothing.

Safe under multi-platform-on-shared-daemon: only containers carrying our label get reaped; sibling stacks' containers are invisible to this pass. (For now the label is a constant string; a future per-instance UUID layer can refine "ours" further if a real shared-daemon scenario emerges.)

Migration: existing platforms running pre-PR builds have UNLABELED ws-* containers. After this lands they continue to NOT be reaped by the new path (no label = invisible). They'll only be cleaned via manual intervention or once the operator recreates them, same as today. No regression.

Tests cover all five branches of the new pass: happy-path reap, no-reap when row exists, mixed reap-some-keep-some, Docker error short-circuits cleanly, non-UUID prefixes get filtered before the SQL query.

Pairs with PR #2122 (script-level fix). Together they close the orphan-leak path for both `bash scripts/nuke-and-rebuild.sh` users (handled by the script) AND `docker compose down -v` users (handled by the runtime).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
access_test.go	chore: open-source restructure — rename dirs, remove internal files, scrub secrets	2026-04-18 00:24:44 -07:00
access.go	chore: open-source restructure — rename dirs, remove internal files, scrub secrets	2026-04-18 00:24:44 -07:00
healthsweep_test.go	chore: open-source restructure — rename dirs, remove internal files, scrub secrets	2026-04-18 00:24:44 -07:00
healthsweep.go	chore: open-source restructure — rename dirs, remove internal files, scrub secrets	2026-04-18 00:24:44 -07:00
hibernation_test.go	chore: open-source restructure — rename dirs, remove internal files, scrub secrets	2026-04-18 00:24:44 -07:00
hibernation.go	chore: open-source restructure — rename dirs, remove internal files, scrub secrets	2026-04-18 00:24:44 -07:00
liveness_test.go	chore: open-source restructure — rename dirs, remove internal files, scrub secrets	2026-04-18 00:24:44 -07:00
liveness.go	chore: open-source restructure — rename dirs, remove internal files, scrub secrets	2026-04-18 00:24:44 -07:00
orphan_sweeper_test.go	fix(orphan-sweeper): reap labeled containers with no DB row (wiped-DB)	2026-04-26 14:33:41 -07:00
orphan_sweeper.go	fix(orphan-sweeper): reap labeled containers with no DB row (wiped-DB)	2026-04-26 14:33:41 -07:00
provisiontimeout_test.go	fix(registry): runtime-aware provision-timeout sweep — give hermes 30 min	2026-04-26 01:44:09 -07:00
provisiontimeout.go	fix(registry): runtime-aware provision-timeout sweep — give hermes 30 min	2026-04-26 01:44:09 -07:00