fix(e2e): fail teardown on leaked EC2 #1660
Phase 1 evidence
Brief claim: E2E teardown can leak EC2 instances after the CP reports a clean teardown.
Evidence confirmed: tests/e2e/test_staging_full_saas.sh deleted the CP tenant, then polled only /cp/admin/orgs before printing clean. The observed leak class was an EC2 instance whose Name tag still contained the E2E slug after the org and its secrets were gone, so CP-org-based sweepers could not find it.
Affected surfaces:
Phase 2 design
Add a focused AWS EC2 verifier that runs after CP org teardown. In CI, the verifier is required and looks up EC2 instances by the per-run slug in their Name tag. If matching instances remain after the poll budget is exhausted, it terminates them when E2E_AWS_TERMINATE_LEAKS=1 is set and exits with the existing leak rc=4. Local runs stay usable via the auto and off modes.
Rollback: revert this PR; it only changes E2E cleanup verification and workflow env wiring.
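For reviewers who want a concrete picture of the Phase 2 design, here is a minimal sketch of such a verifier. It is not the code in this PR. Only E2E_AWS_TERMINATE_LEAKS, rc=4, and the auto/off/required modes come from the description above. The function names, E2E_AWS_VERIFY_MODE, E2E_AWS_POLL_ATTEMPTS, E2E_AWS_POLL_INTERVAL, the choice of instance states, and the return codes other than 4 are assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Illustrative sketch only; names below are hypothetical unless noted.

LEAK_RC=4  # from the PR description: existing leak exit code

# Print IDs of non-terminated instances whose Name tag contains the slug.
list_slug_instances() {
  aws ec2 describe-instances \
    --filters "Name=tag:Name,Values=*$1*" \
              "Name=instance-state-name,Values=pending,running,stopping,stopped" \
    --query 'Reservations[].Instances[].InstanceId' \
    --output text
}

# verify_no_leaked_ec2 SLUG
# Returns 0 when clean or skipped, 4 on a persistent leak,
# 1 or 2 on setup errors (assumed codes, not taken from the PR).
verify_no_leaked_ec2() {
  local slug="$1"
  local mode="${E2E_AWS_VERIFY_MODE:-auto}"       # hypothetical variable
  local attempts="${E2E_AWS_POLL_ATTEMPTS:-10}"   # hypothetical variable
  local interval="${E2E_AWS_POLL_INTERVAL:-15}"   # hypothetical variable
  local ids=""
  local i=1

  case "$mode" in
    off)
      return 0 ;;
    auto)
      # Local runs: skip quietly when the AWS CLI is unavailable.
      command -v aws >/dev/null 2>&1 || { echo "aws CLI not found; skipping EC2 check"; return 0; } ;;
    required)
      # CI: a missing AWS CLI is an error, not a skip.
      command -v aws >/dev/null 2>&1 || { echo "aws CLI required but not found" >&2; return 1; } ;;
    *)
      echo "unknown verify mode: $mode" >&2; return 2 ;;
  esac

  # Poll until no matching instance remains or the budget runs out.
  while [ "$i" -le "$attempts" ]; do
    ids="$(list_slug_instances "$slug")" || { echo "describe-instances failed" >&2; return 1; }
    ids="$(echo $ids)"  # unquoted on purpose: collapse tabs/newlines to spaces
    if [ -z "$ids" ]; then
      echo "EC2 clean for slug $slug"
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$interval"
    fi
    i=$((i + 1))
  done

  echo "LEAK: EC2 still tagged with slug $slug: $ids" >&2
  if [ "${E2E_AWS_TERMINATE_LEAKS:-0}" = "1" ]; then
    # Word splitting is intended: one argument per instance ID.
    aws ec2 terminate-instances --instance-ids $ids >/dev/null \
      || echo "terminate-instances failed" >&2
  fi
  return "$LEAK_RC"
}
```

A teardown script could call `verify_no_leaked_ec2 "$slug" || exit $?` after the CP org poll, so a leak still fails the run even when termination succeeds.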
Changes
Verification
QA review: approved. Covered focused shell regression tests for auto/required/missing-AWS, clean EC2, persistent leak, and terminate-on-leak paths. Also verified bash syntax, cleanup-trap lint, workflow YAML lint, and existing workflow YAML pytest locally before PR.
Security review: approved. The AWS query is scoped to the per-run E2E slug in EC2 Name tags, and termination is gated behind E2E_AWS_TERMINATE_LEAKS=1. No credential values are logged; workflow changes only reference existing secret names.
core-security 5273 Security review: approved. AWS lookup is slug-scoped, termination is opt-in via CI env, and secret values are not logged.
Security review: APPROVED.
QA review: APPROVED.
/qa-recheck
/security-recheck
/sop-ack 1 local tests listed in PR body verified
/sop-ack 2 N/A rationale accepted: no DB or migration path touched
/sop-ack 3 staging smoke pending post-merge with live EC2 pre-scan clean
/sop-ack 4 root cause is CP-org-only teardown verification blind spot
/sop-ack 5 five-axis review evidence present
/sop-ack 6 no shim/dead-code added
/sop-ack 7 memory/SOP context consulted