test(staginge2e): data-volume survives recreate e2e (core#2332 P0.5) #2336
What
Closes the data-persistence coverage gap flagged in core#2332 (P0.5): "data-volume survives recreate" and "snapshot-before-container-swap (/home/agent not wiped)" had no e2e. Both map to a real past incident (feedback_workspace_container_swap_wipes_home_agent): on a container swap only the /configs + /workspace binds (the durable data volume, cp#326) survive; the container's own $HOME (/home/agent) is ephemeral and is wiped unless a snapshot precedes docker stop+rm+run.

Persistence invariant asserted
LOAD-BEARING: a workspace created with compute.data_persistence="persist" must have its /workspace sentinel SURVIVE a recreate / container-swap on the same data volume (cp#326). A wipe fails loud as DATA-VOLUME REGRESSION.

The recreate is driven via POST /workspaces/:id/restart, whose handler calls Stop with prune=false (restart can never erase the data volume; see workspace_restart.go, cpStopWithRetryErr) and then re-provisions on the same volume.

/home/agent (ephemeral side)
The container-$HOME browse/write surface is ?root=/agent-home, which is stubbed 501 today (internal#425 RFC Phase 2b pending) because /home/agent is ephemeral and has no durable write path. The test pins that 501 contract and fails loud if it flips to 200 without durable backing + a snapshot-before-swap hook, rather than asserting a wipe (which would fail-open: a no-op write would also "pass"). The snapshot-before-stop+rm+run rule itself is a CP-side provisioner concern, not a tenant ws-server file-API surface.

Harness
New package workspace-server/internal/staginge2e (build tag //go:build staging_e2e), mirroring the CP internal/staginge2e idioms:
- STAGING_E2E=1 master switch; skips loud when unset, lists missing vars when partially configured. Never fails-open; excluded from default go test ./... by the build tag.
- TENANT_HOST / TENANT_ADMIN_TOKEN / MOLECULE_ORG_ID with the SaaS auth-chain headers (Authorization + X-Molecule-Org-Id + Origin).
- Cleanup (DELETE /workspaces/:id).

Validation
- go vet -tags staging_e2e ./internal/staginge2e/... → clean
- go test ./... → [no test files] (tag excludes it)

Notes
🤖 Generated with Claude Code
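
For readers who want the load-bearing invariant in code form, here is a minimal sketch of the write / readback / recreate / re-read flow described above. It is not the code in data_persistence_test.go: the tenant interface, the fakeTenant type, the sentinel file name, and checkSentinelSurvives are all hypothetical names introduced for illustration, and an in-memory fake stands in for the staging tenant's HTTP API.

```go
package main

import (
	"errors"
	"fmt"
)

// tenant is a hypothetical abstraction over the ws-server file API and the
// restart endpoint; the real suite drives a staging tenant over HTTP.
type tenant interface {
	WriteFile(root, name, content string) error
	ReadFile(root, name string) (string, error)
	Restart() error // stands in for POST /workspaces/:id/restart
}

// checkSentinelSurvives writes a unique sentinel under /workspace, verifies
// the pre-recreate readback, triggers a recreate, and requires an exact match.
func checkSentinelSurvives(t tenant, unique string) error {
	const root, name = "/workspace", "sentinel.txt" // file name is illustrative
	if err := t.WriteFile(root, name, unique); err != nil {
		return fmt.Errorf("write sentinel: %w", err)
	}
	got, err := t.ReadFile(root, name)
	if err != nil || got != unique {
		return fmt.Errorf("pre-recreate readback failed: got %q, err %v", got, err)
	}
	if err := t.Restart(); err != nil {
		return fmt.Errorf("restart: %w", err)
	}
	got, err = t.ReadFile(root, name)
	if err != nil || got != unique {
		return fmt.Errorf("DATA-VOLUME REGRESSION: sentinel lost after recreate (got %q, err %v)", got, err)
	}
	return nil
}

// fakeTenant is an in-memory stand-in; wipeOnRestart simulates the regression.
type fakeTenant struct {
	files         map[string]string
	wipeOnRestart bool
}

func (f *fakeTenant) WriteFile(root, name, content string) error {
	f.files[root+"/"+name] = content
	return nil
}

func (f *fakeTenant) ReadFile(root, name string) (string, error) {
	c, ok := f.files[root+"/"+name]
	if !ok {
		return "", errors.New("not found")
	}
	return c, nil
}

func (f *fakeTenant) Restart() error {
	if f.wipeOnRestart {
		f.files = map[string]string{}
	}
	return nil
}

func main() {
	durable := &fakeTenant{files: map[string]string{}}
	fmt.Println(checkSentinelSurvives(durable, "sentinel-123")) // prints <nil>

	wiping := &fakeTenant{files: map[string]string{}, wipeOnRestart: true}
	fmt.Println(checkSentinelSurvives(wiping, "sentinel-123") != nil) // prints true
}
```

Passing unique content per run and comparing it exactly is what keeps stale state from a previous run from producing a false green.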
APPROVED on current head 6e90d589fe.
Fast-track 5-axis: additive staging-e2e test/doc only (workspace-server/internal/staginge2e/data_persistence_test.go, doc.go), no product-code behavior change. Fail-closed confirmed: the suite is build-tag gated (staging_e2e) and runtime-gated (STAGING_E2E=1); when enabled, missing required env skips loudly with explicit missing vars, while the load-bearing path creates a persist workspace, writes a unique /workspace sentinel, verifies pre-recreate readback, triggers POST /restart, waits online, and calls t.Fatalf if the sentinel does not survive. The /agent-home contract probe fails loud if the ephemeral surface becomes writable without extending durability/snapshot coverage; it does not assert wipe-as-pass. Required CI/all-required is green. Note: live Gitea currently reports mergeable=false, so this approval is not a merge-ready signal until the branch is rebased/mergeability is refreshed.

Fast-track 5-axis-lite on current head 6e90d589fe.
Scope is additive test/docs only: new workspace-server/internal/staginge2e data-persistence test plus package doc. No product-code or runtime behavior changes.
Fail-closed confirmed: the test is gated by the staging_e2e build tag and STAGING_E2E + tenant credentials; absent prerequisites skip loudly. Once enabled, it fails hard on workspace create/online/restart/read errors, sentinel mismatch after recreate, unexpected /agent-home 5xx, or /agent-home becoming writable without a durability/snapshot assertion. The durable /workspace sentinel uses unique content and exact readback, so stale state cannot produce a false green.
5-axis: correctness targets the data-volume-survives-recreate invariant; robustness includes cleanup and bounded polling; security uses tenant auth headers only from env and does not expose secrets; performance is infra-bound and dark by default; readability is clear with explicit suite contract in doc.go.
Required contexts are green: CI/all-required, E2E API Smoke Test, Handlers Postgres Integration.
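
To make the /agent-home contract probe concrete, the status handling described above could look roughly like the sketch below. This is an illustration only, not the PR's code: checkAgentHomeStatus is a hypothetical helper, and treating every status other than 501 as a failure is an assumption (the source only names 200 and unexpected 5xx as failing cases).

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// checkAgentHomeStatus classifies the HTTP status returned by a write probe
// against ?root=/agent-home. Hypothetical helper for illustration.
func checkAgentHomeStatus(status int) error {
	switch {
	case status == http.StatusNotImplemented:
		// 501 is the pinned contract: the surface is ephemeral and stubbed.
		return nil
	case status == http.StatusOK:
		// Writable without durability coverage: fail loud, never assert a wipe.
		return errors.New("/agent-home became writable (200): extend durable backing + snapshot-before-swap coverage")
	case status >= 500:
		return fmt.Errorf("unexpected /agent-home 5xx: %d", status)
	default:
		// Assumption: any other status is treated as a contract break here.
		return fmt.Errorf("unexpected /agent-home status: %d", status)
	}
}

func main() {
	for _, s := range []int{501, 200, 503} {
		fmt.Println(s, checkAgentHomeStatus(s) == nil)
	}
	// prints:
	// 501 true
	// 200 false
	// 503 false
}
```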
6e90d589fe to 3180a1109c
New commits pushed, approval review dismissed automatically according to repository settings
3180a1109c to 2f369e6362

Close the data-persistence coverage gap: "data-volume survives recreate" and "snapshot-before-container-swap (/home/agent not wiped)" had NO e2e, and both map to a real past incident: on a container swap only the /configs + /workspace binds (the durable data volume, cp#326) survive; the container's own $HOME (/home/agent) is ephemeral and is wiped unless snapshotted before docker stop+rm+run.

Adds internal/staginge2e (new package, build tag //go:build staging_e2e) to the workspace-server module with a real-infra e2e that drives the tenant ws-server HTTP API against a staging tenant:

1. create a workspace with compute.data_persistence="persist"; online
2. write a unique sentinel into /workspace (?root=/workspace, the data volume per cp#326) and read it back
3. encode the /home/agent contract: ?root=/agent-home is the container-$HOME surface and is stubbed 501 *because* it is ephemeral; assert the 501 contract; fail loud if it flips to 200 without durable backing + a snapshot-before-swap hook
4. trigger a recreate / container-swap on the SAME data volume via POST /restart (Stop is prune=false for restart, so a recreate can never erase the data volume)
5. LOAD-BEARING: assert the /workspace sentinel SURVIVES; a wipe here fails loud as a DATA-VOLUME REGRESSION

Env-gated/skip-loud exactly like the CP staginge2e siblings: STAGING_E2E=1 master switch + TENANT_HOST / TENANT_ADMIN_TOKEN / MOLECULE_ORG_ID. Never fails-open; excluded from the default `go test ./...` by the build tag. Promote-to-required is a CTO call (infra-bound suite; see doc.go).

Validated: go vet -tags staging_e2e ./internal/staginge2e/... clean; default `go test ./...` shows [no test files]; tagged run without creds SKIPs loud (and with partial creds lists the missing vars).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

2f369e6362 to 37942699d3
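
The env-gating behavior described in the commit message (master switch, then a list of missing vars on partial configuration) can be sketched as follows. This is an assumption-laden illustration rather than the suite's actual helper: gate and requiredVars are hypothetical names, and the skip-message wording is invented. A real test would likely pass os.Getenv and call t.Skip with the returned reason.

```go
package main

import (
	"fmt"
	"strings"
)

// requiredVars lists the tenant credentials named in the PR description.
var requiredVars = []string{"TENANT_HOST", "TENANT_ADMIN_TOKEN", "MOLECULE_ORG_ID"}

// gate reports whether the suite should run. When it should not, reason is a
// non-empty message meant to be surfaced loudly (for example via t.Skip).
// Hypothetical helper; getenv is injected so the logic is testable.
func gate(getenv func(string) string) (run bool, reason string) {
	if getenv("STAGING_E2E") != "1" {
		return false, "STAGING_E2E != 1: skipping staging e2e suite"
	}
	var missing []string
	for _, k := range requiredVars {
		if getenv(k) == "" {
			missing = append(missing, k)
		}
	}
	if len(missing) > 0 {
		return false, "STAGING_E2E=1 but missing: " + strings.Join(missing, ", ")
	}
	return true, ""
}

func main() {
	// Partially configured environment: master switch on, two vars absent.
	env := map[string]string{"STAGING_E2E": "1", "TENANT_HOST": "tenant.example"}
	run, reason := gate(func(k string) string { return env[k] })
	fmt.Println(run, reason)
	// prints: false STAGING_E2E=1 but missing: TENANT_ADMIN_TOKEN, MOLECULE_ORG_ID
}
```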