fix(e2e): safety-net teardown only sweeps this run's orgs
Previously matched every e2e-YYYYMMDD-* slug, which stomped parallel
CI runs AND manual dev probes against staging. Incident 2026-04-21
15:02Z: this workflow's safety net deleted an unrelated manual tenant
1s after it hit 'running', timing out the dev run at 15min.
Scope to f'e2e-{today}-{GITHUB_RUN_ID}-' so each run only cleans its
own leftovers. Empty run_id (local invocation) keeps the old broader
behaviour so dev safety-nets still sweep.
Also fix: the previous filter used o.get('status') which doesn't exist
on the admin API response. Now reads instance_status (the real field).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
parent
e9d111dbc6
commit
81c4c02547
10
.github/workflows/e2e-staging-saas.yml
vendored
10
.github/workflows/e2e-staging-saas.yml
vendored
@ -128,9 +128,15 @@ jobs:
|
||||
run_id = os.environ.get('GITHUB_RUN_ID', '')
|
||||
d = json.load(sys.stdin)
|
||||
today = __import__('datetime').date.today().strftime('%Y%m%d')
|
||||
# ONLY sweep slugs from *this* CI run. Previously the filter was
|
||||
# f'e2e-{today}-' which stomped on parallel CI runs AND any manual
|
||||
# E2E probes a dev was running against staging (incident 2026-04-21
|
||||
# 15:02Z: this workflow's safety net deleted an unrelated manual
|
||||
# run's tenant 1s after it hit 'running').
|
||||
prefix = f'e2e-{today}-{run_id}-' if run_id else f'e2e-{today}-'
|
||||
candidates = [o['slug'] for o in d.get('orgs', [])
|
||||
if o.get('slug','').startswith(f'e2e-{today}-')
|
||||
and o.get('status') not in ('purged',)]
|
||||
if o.get('slug','').startswith(prefix)
|
||||
and o.get('instance_status') not in ('purged',)]
|
||||
print('\n'.join(candidates))
|
||||
" 2>/dev/null)
|
||||
for slug in $orgs; do
|
||||
|
||||
Loading…
Reference in New Issue
Block a user