forked from molecule-ai/molecule-core
Two changes that close one of the leak classes from the
molecule-controlplane#420 vCPU audit:
1. sweep-stale-e2e-orgs.yml: cron */15 (was hourly), MAX_AGE_MINUTES
30 (was 120). E2E runs are 8-25 min wall clock; 30 min is safely
above the longest run while shrinking the worst-case leak window
from ~2h to ~45 min (15-min sweep cadence + 30-min threshold).
2. canary-staging.yml teardown: the per-slug DELETE used `>/dev/null
|| true`, which swallowed every failure. A 5xx or timeout from CP
looked identical to "successfully deleted" and the canary tenant
kept eating ~2 vCPU until the sweeper caught it. Now we capture
the response code and surface non-2xx as a workflow warning that
names the leaked slug.
The exit semantics stay unchanged — a single-canary cleanup miss
shouldn't fail-flag the canary itself when the actual smoke check
passed. The sweeper is the safety net for whatever slips past.
Caught during the molecule-controlplane#420 audit on 2026-05-03 —
3 e2e canary tenant orphans were running for 24-95 min, all under
the previous 120-min sweep threshold so they went unnoticed until
manual cleanup. Same `|| true` pattern exists in
e2e-staging-{canvas,external,saas,sanity}.yml; out of scope for
this PR (mechanical port; tracking separately) but the sweeper
tightening covers all of them by reducing the safety-net latency.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
||
|---|---|---|
| .. | ||
| auto-promote-on-e2e.yml | ||
| auto-promote-staging.yml | ||
| auto-sync-main-to-staging.yml | ||
| auto-tag-runtime.yml | ||
| block-internal-paths.yml | ||
| canary-staging.yml | ||
| canary-verify.yml | ||
| cascade-list-drift-gate.yml | ||
| check-merge-group-trigger.yml | ||
| check-migration-collisions.yml | ||
| ci.yml | ||
| codeql.yml | ||
| continuous-synth-e2e.yml | ||
| e2e-api.yml | ||
| e2e-staging-canvas.yml | ||
| e2e-staging-external.yml | ||
| e2e-staging-saas.yml | ||
| e2e-staging-sanity.yml | ||
| harness-replays.yml | ||
| pr-guards.yml | ||
| promote-latest.yml | ||
| publish-canvas-image.yml | ||
| publish-runtime.yml | ||
| publish-workspace-server-image.yml | ||
| railway-pin-audit.yml | ||
| redeploy-tenants-on-main.yml | ||
| redeploy-tenants-on-staging.yml | ||
| retarget-main-to-staging.yml | ||
| runtime-pin-compat.yml | ||
| runtime-prbuild-compat.yml | ||
| secret-pattern-drift.yml | ||
| secret-scan.yml | ||
| sweep-aws-secrets.yml | ||
| sweep-cf-orphans.yml | ||
| sweep-cf-tunnels.yml | ||
| sweep-stale-e2e-orgs.yml | ||
| test-ops-scripts.yml | ||