molecule-core

Hongming Wang 92d99d96fe fix(provisioner): treat "removal already in progress" as no-op success	2026-04-27 13:25:32 -07:00

Cascade-deleting a 7-workspace org returned 500 with "workspace marked removed, but 2 stop call(s) failed — please retry: stop eeb99b5d-...: force-remove ws-eeb99b5d-607: Error response from daemon: removal of container ws-eeb99b5d-607 is already in progress" even though the DB-side post-condition succeeded (removed_count=7) and the containers WERE removed shortly after.

The fanout fired Stop() on every workspace concurrently and the orphan sweeper happened to reap two of them at the same instant, so Docker rejected the second ContainerRemove with "removal already in progress": a race-condition ack, not a real failure. Retrying just races the same in-flight removal.

The post-condition we care about (the container WILL be gone) is identical to a successful removal, so Stop() should treat it the same way it already treats "No such container": a no-op return nil that lets the caller proceed with volume cleanup. Real daemon failures (timeout, EOF, ctx cancel) still surface as errors.

Two pieces:
- New isRemovalInProgress() predicate using the same string-match approach as isContainerNotFound (docker/docker has no typed errdef for this; the CLI itself relies on the message).
- Stop() now treats the predicate as success, with a log line distinct from the not-found path so debugging can tell which race fired.

Both substrings ("removal of container" + "already in progress") must match; "already in progress" alone would false-positive on unrelated operations like image pulls. Truth table pinned in 7 new test cases.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
backend_contract_test.go	fix(provisioner): nil guards on Stop/IsRunning, unblock contract tests (closes #1813)	2026-04-26 02:17:51 -07:00
cp_provisioner_instance_id_test.go	test: regression guard for #1738 — cp-provisioner uses real instance_id	2026-04-23 17:45:13 -07:00
cp_provisioner_test.go	fix(cp-provisioner): look up real EC2 instance_id for Stop + IsRunning (#1738)	2026-04-23 18:25:29 +00:00
cp_provisioner.go	test(provisioner): unblock TestProvisionWorkspaceCP_NoInternalErrorsInBroadcast (#1814)	2026-04-27 03:28:25 -07:00
isrunning_test.go	fix(provisioner): treat "removal already in progress" as no-op success	2026-04-27 13:25:32 -07:00
platform_test.go	fix(provisioner): force linux/amd64 pull + create on Apple Silicon hosts (#1875)	2026-04-23 14:55:34 -07:00
provisioner_test.go	feat(provisioner): pull workspace-template images from GHCR	2026-04-22 12:39:56 -07:00
provisioner.go	fix(provisioner): treat "removal already in progress" as no-op success	2026-04-27 13:25:32 -07:00