test(e2e): poll for running container after workspace online in local-provision lifecycle #2659
Replace the single `container_running` sample after the workspace reaches `online` in `tests/e2e/test_local_provision_lifecycle_e2e.sh` with a 10-second bounded poll.

`RegistryHandler.Register` marks the workspace 'online' as soon as the agent registers, but the `ws-<id>` container may not be visible/stable on the shared docker-host for a short moment after that. The existing single sample intermittently fails with "no running ws- container" right after "workspace reached online".

The poll preserves the hard failure if no container appears within the window, so a genuine missing-container still fails the test.
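For readers unfamiliar with the pattern, here is a minimal sketch of a bounded poll that keeps the hard failure. It is not the code from this PR: the helper name `poll_until`, its arguments, and the body of `container_running` shown here are assumptions made for illustration, and the actual test script may structure this differently.

```shell
# Hypothetical helper (not from the PR): run a check command up to
# <attempts> times, sleeping <interval> seconds between failed attempts.
# Returns 0 as soon as the check succeeds, 1 if every attempt fails.
poll_until() {
  _poll_attempts="$1"
  _poll_interval="$2"
  shift 2
  _poll_i=0
  while [ "$_poll_i" -lt "$_poll_attempts" ]; do
    if "$@"; then
      return 0
    fi
    _poll_i=$((_poll_i + 1))
    # Skip the sleep once the last attempt has already failed.
    if [ "$_poll_i" -lt "$_poll_attempts" ]; then
      sleep "$_poll_interval"
    fi
  done
  return 1
}

# Assumed shape of the check: succeed if a running container whose name
# matches ws-<id> is visible to the docker CLI.
container_running() {
  [ -n "$(docker ps --quiet --filter "name=ws-$1" --filter "status=running" 2>/dev/null)" ]
}

# Usage sketch (WS_ID is a placeholder): about 10 attempts, 1 second apart,
# then fail hard if the container never appeared.
#   if ! poll_until 10 1 container_running "$WS_ID"; then
#     echo "FAIL: no running ws- container" >&2
#     exit 1
#   fi
```

Counting attempts instead of comparing wall-clock timestamps keeps the sketch portable across POSIX shells. The cost is that the window is only approximately 10 seconds, since each check takes some time of its own.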
Test-only change; no product code modified.
Relates-to: molecule-core#2615
REQUEST_CHANGES: reviewed head 852302723e. The intended local-provision lifecycle change is reasonable: it replaces the single post-online container sample with a bounded 10s poll while preserving hard failure if no container appears. However this PR also changes two unrelated workflow tracker comments in e2e-chat.yml and e2e-staging-external.yml from mc#1982 to mc#2654, overlapping with #2657 and not mentioned in this PR body/title. Please rebase/scope the branch so #2659 contains only the lifecycle poll change, or explicitly retitle/body it if this PR is intended to subsume #2657. I did not find a bug in the poll itself.

852302723e to e43e3b700b

REQUEST_CHANGES: re-reviewed head e43e3b700b. The prior mixed-scope blocker is resolved: the diff is now only tests/e2e/test_local_provision_lifecycle_e2e.sh, and the bounded post-online container poll itself still looks sound. However I cannot approve on the requested basis because the Local Provision Lifecycle E2E (stub) job is currently red on this head (run 353677/job 478450). The new poll passes, but the test later fails after restart with workspace status=failed and container logs showing invalid workspace auth token during register/heartbeat. Please get the stub job green or identify that failure as a separately accepted blocker before re-requesting approval.

e43e3b700b to fd52345509

APPROVED: reviewed head fd52345509 with the 5-axis lens. CI / all-required is green and the Local Provision Lifecycle stub is green. The diff is scoped to one test file and replaces the single post-online container sample with a bounded 10s poll while preserving a hard failure if the ws container never appears. This addresses the observed registration/container-visibility race without weakening the lifecycle assertion, touching production code, or adding security/performance risk. No blockers found.