molecule-core/.github/workflows
Hongming Wang 6f8f7932d2 feat(ops): add sweep-aws-secrets janitor — orphan tenant bootstrap secrets
CP's deprovision flow calls Secrets.DeleteSecret() (provisioner/ec2.go:806)
but only when the deprovision runs to completion. Crashed provisions and
incomplete teardowns leak the per-tenant `molecule/tenant/<org_id>/bootstrap`
secret. At ~$0.40/secret/month, ~45 leaked secrets surfaced as ~$19/month
on the AWS cost dashboard.

The tenant_resources audit table (mig 024) tracks four kinds today —
CloudflareTunnel, CloudflareDNS, EC2Instance, SecurityGroup — and the
existing reconciler doesn't catch Secrets Manager orphans. The proper fix
(KindSecretsManagerSecret + recorder hook + reconciler enumerator) is filed
as a follow-up controlplane issue. This sweeper is the immediate stopgap.

Parallel-shape to sweep-cf-tunnels.sh:
  - Hourly schedule offset (:30, between sweep-cf-orphans :15 and
    sweep-cf-tunnels :45) so the three janitors don't burst CP admin
    at the same minute.
  - 24h grace window — never deletes a secret younger than the
    provisioning roundtrip, so an in-flight provision can't be racemurdered.
  - MAX_DELETE_PCT=50 default (mirrors sweep-cf-orphans for durable
    resources; tenant secrets should track 1:1 with live tenants).
  - Same schedule-vs-dispatch hardening as the other janitors:
    schedule → hard-fail on missing secrets, dispatch → soft-skip.
  - 8-way xargs parallelism, dry-run by default, --execute to delete.

Requires a dedicated AWS_JANITOR_* IAM principal — the prod molecule-cp
principal lacks secretsmanager:ListSecrets (it only has scoped
Get/Create/Update/Delete). The workflow's verify-secrets step will hard-fail
on the first scheduled run until those secrets are configured, surfacing
the missing setup loudly rather than silently no-op'ing.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-03 02:38:08 -07:00
..
auto-promote-on-e2e.yml chore(deps)(deps): bump imjasonh/setup-crane from 0.4 to 0.5 2026-05-02 19:23:13 +00:00
auto-promote-staging.yml ci(auto-sync): App-token dispatch + ubuntu-latest + workflow_dispatch 2026-05-01 22:28:35 -07:00
auto-sync-main-to-staging.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
auto-tag-runtime.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
block-internal-paths.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
canary-staging.yml Merge pull request #2523 from Molecule-AI/dependabot/github_actions/actions/github-script-9.0.0 2026-05-03 01:37:00 +00:00
canary-verify.yml Merge pull request #2521 from Molecule-AI/dependabot/github_actions/actions/checkout-6 2026-05-03 01:36:57 +00:00
check-merge-group-trigger.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
check-migration-collisions.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
ci.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
codeql.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
continuous-synth-e2e.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
e2e-api.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
e2e-staging-canvas.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
e2e-staging-external.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
e2e-staging-saas.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
e2e-staging-sanity.yml Merge pull request #2523 from Molecule-AI/dependabot/github_actions/actions/github-script-9.0.0 2026-05-03 01:37:00 +00:00
harness-replays.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
pr-guards.yml ci: add pr-guards caller that disables auto-merge on push 2026-04-27 06:39:31 -07:00
promote-latest.yml chore(deps)(deps): bump imjasonh/setup-crane from 0.4 to 0.5 2026-05-02 19:23:13 +00:00
publish-canvas-image.yml Merge pull request #2521 from Molecule-AI/dependabot/github_actions/actions/checkout-6 2026-05-03 01:36:57 +00:00
publish-runtime.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
publish-workspace-server-image.yml Merge pull request #2521 from Molecule-AI/dependabot/github_actions/actions/checkout-6 2026-05-03 01:36:57 +00:00
railway-pin-audit.yml Merge pull request #2523 from Molecule-AI/dependabot/github_actions/actions/github-script-9.0.0 2026-05-03 01:37:00 +00:00
redeploy-tenants-on-main.yml fix(redeploy-main): pull staging-<head_sha> instead of stale :latest 2026-05-01 23:17:59 -07:00
redeploy-tenants-on-staging.yml fix(redeploy-staging): tolerate e2e-* teardown race in fleet HTTP 500 2026-05-02 02:17:36 -07:00
retarget-main-to-staging.yml ci(retarget): handle 422 'duplicate PR' by closing redundant main-PR (closes #1884) 2026-04-26 00:53:55 -07:00
runtime-pin-compat.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
runtime-prbuild-compat.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
secret-pattern-drift.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
secret-scan.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
sweep-aws-secrets.yml feat(ops): add sweep-aws-secrets janitor — orphan tenant bootstrap secrets 2026-05-03 02:38:08 -07:00
sweep-cf-orphans.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
sweep-cf-tunnels.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
sweep-stale-e2e-orgs.yml ci: hourly sweep of stale e2e-* orgs on staging 2026-04-24 23:07:57 -07:00
test-ops-scripts.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00