feat(admin-schedules): orphan monitor + cleaner endpoints (internal#2006 backstops) #2008

Merged
hongming merged 1 commit from feat/schedule-orphan-monitor-cleaner into main 2026-05-29 09:38:34 +00:00
Owner

Backstops for the recreate-orphan class (internal#2006). GET /admin/schedules/orphans is the monitor: it lists schedules bound to removed or missing workspaces. POST /admin/schedules/reap-orphans is the cleaner: it re-points runtime schedules to the live successor and disables the remaining dead-bound ones. Health() is unchanged. +2 tests, +2 routes. Part 2/3 (the migration is #2007).

hongming added 1 commit 2026-05-29 09:08:41 +00:00
feat(admin-schedules): orphan monitor + cleaner endpoints (backstops)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 6s
CI / Python Lint & Test (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 3s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 3s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 44s
gate-check-v3 / gate-check (pull_request) Successful in 6s
qa-review / approved (pull_request) Failing after 4s
security-review / approved (pull_request) Failing after 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-tier-check / tier-check (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 25s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3m14s
Harness Replays / Harness Replays (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m53s
CI / Platform (Go) (pull_request) Successful in 8m18s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 10m45s
audit-force-merge / audit (pull_request) Successful in 5s
4bee6cb4a7
internal#2006 — backstops for the recreate-orphans-schedules class. The
primary fix is migration-on-recreate (separate PR); these are defense-in-depth
so a future regression is detected + recoverable instead of silent.

GET /admin/schedules/health reports only LIVE workspaces' schedules
(JOIN … WHERE status != 'removed'), so a schedule stranded on a
removed/recreated workspace silently stops firing and never shows there —
which is exactly why tonight's orphans went unnoticed.
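
To illustrate the blind spot described above, here is a minimal in-memory sketch in Go. It is not the handler code from this PR: the `Workspace` and `Schedule` types, the `healthVisible` and `orphans` helpers, and the sample IDs are all hypothetical, and it assumes the Health() query uses an inner join, so schedules whose workspace row is missing drop out as well as those on removed workspaces.

```go
package main

import "fmt"

// Workspace and Schedule are hypothetical, simplified stand-ins for the
// real tables; only the fields needed for the illustration are kept.
type Workspace struct {
	ID     string
	Status string // e.g. "running" or "removed"
}

type Schedule struct {
	ID          string
	Name        string
	WorkspaceID string
}

// healthVisible mimics the Health() filter as described in the commit
// message (assumed inner JOIN on workspace, status != 'removed').
func healthVisible(ws map[string]Workspace, scheds []Schedule) []Schedule {
	var out []Schedule
	for _, s := range scheds {
		w, ok := ws[s.WorkspaceID]
		if ok && w.Status != "removed" {
			out = append(out, s)
		}
	}
	return out
}

// orphans returns the complement: schedules bound to a removed workspace
// or to a workspace that no longer exists at all.
func orphans(ws map[string]Workspace, scheds []Schedule) []Schedule {
	var out []Schedule
	for _, s := range scheds {
		w, ok := ws[s.WorkspaceID]
		if !ok || w.Status == "removed" {
			out = append(out, s)
		}
	}
	return out
}

func main() {
	ws := map[string]Workspace{
		"ws-live": {ID: "ws-live", Status: "running"},
		"ws-old":  {ID: "ws-old", Status: "removed"},
	}
	scheds := []Schedule{
		{ID: "s1", Name: "nightly", WorkspaceID: "ws-live"},
		{ID: "s2", Name: "digest", WorkspaceID: "ws-old"},  // removed workspace
		{ID: "s3", Name: "sync", WorkspaceID: "ws-gone"},   // missing workspace
	}
	// Health sees only s1; s2 and s3 are invisible there but show up as orphans.
	fmt.Println(len(healthVisible(ws, scheds)), len(orphans(ws, scheds))) // prints: 1 2
}
```

Under these assumptions the two views partition the schedule set, which is why a monitor on the orphan listing catches what the health report cannot.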

- GET /admin/schedules/orphans (Orphans): the monitor surface — lists every
  schedule bound to a removed OR missing workspace (id, name, source, enabled,
  ws_status). A monitor polls this and pages on non-empty.
- POST /admin/schedules/reap-orphans (ReapOrphans): the cleaner — re-points
  runtime schedules onto the live successor agent (matched by role+parent),
  then disables any remaining dead-bound schedules so the scheduler stops
  firing into removed workspaces. Idempotent; returns {repointed, disabled}.
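
The cleaner's two steps can be sketched over in-memory data as follows. This is an illustration under stated assumptions, not the ReapOrphans handler itself: the types, the `liveSuccessor` and `reapOrphans` helpers, and the sample rows are hypothetical. It also assumes that a schedule whose workspace row is missing entirely cannot be re-pointed (there is no role or parent to match on) and is therefore only disabled.

```go
package main

import "fmt"

// Hypothetical, simplified stand-ins for the real rows.
type Workspace struct {
	ID, Role, ParentID, Status string
}

type Schedule struct {
	ID, WorkspaceID, Source string
	Enabled                 bool
}

// ReapResult mirrors the {repointed, disabled} counts the endpoint returns.
type ReapResult struct {
	Repointed int
	Disabled  int
}

// liveSuccessor returns the first non-removed workspace that shares the
// dead workspace's role and parent.
func liveSuccessor(ws []Workspace, dead Workspace) (Workspace, bool) {
	for _, w := range ws {
		if w.Status != "removed" && w.Role == dead.Role && w.ParentID == dead.ParentID {
			return w, true
		}
	}
	return Workspace{}, false
}

// reapOrphans mutates scheds in place and reports what it changed.
func reapOrphans(ws []Workspace, scheds []Schedule) ReapResult {
	byID := make(map[string]Workspace, len(ws))
	for _, w := range ws {
		byID[w.ID] = w
	}
	var res ReapResult
	for i := range scheds {
		s := &scheds[i]
		old, known := byID[s.WorkspaceID]
		if known && old.Status != "removed" {
			continue // bound to a live workspace: not an orphan
		}
		// Step 1: re-point runtime schedules onto a live successor.
		if known && s.Source == "runtime" {
			if succ, ok := liveSuccessor(ws, old); ok {
				s.WorkspaceID = succ.ID
				res.Repointed++
				continue
			}
		}
		// Step 2: disable whatever is still bound to a dead workspace.
		if s.Enabled {
			s.Enabled = false
			res.Disabled++
		}
	}
	return res
}

func main() {
	ws := []Workspace{
		{ID: "ws-old", Role: "researcher", ParentID: "p1", Status: "removed"},
		{ID: "ws-new", Role: "researcher", ParentID: "p1", Status: "running"},
		{ID: "ws-dead", Role: "writer", ParentID: "p1", Status: "removed"},
	}
	scheds := []Schedule{
		{ID: "s1", WorkspaceID: "ws-old", Source: "runtime", Enabled: true},  // re-pointed to ws-new
		{ID: "s2", WorkspaceID: "ws-old", Source: "template", Enabled: true}, // not runtime: disabled
		{ID: "s3", WorkspaceID: "ws-dead", Source: "runtime", Enabled: true}, // no successor: disabled
		{ID: "s4", WorkspaceID: "ws-gone", Source: "runtime", Enabled: true}, // missing workspace: disabled
	}
	fmt.Printf("%+v\n", reapOrphans(ws, scheds)) // prints: {Repointed:1 Disabled:3}
	fmt.Printf("%+v\n", reapOrphans(ws, scheds)) // prints: {Repointed:0 Disabled:0}
}
```

The second call changes nothing because re-pointed schedules now sit on a live workspace and disabled ones are skipped, which is the idempotence property the commit message claims for the endpoint.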

Health() is unchanged (no churn to its tests). +2 tests, +2 routes. Build +
handler tests green.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
core-lead approved these changes 2026-05-29 09:09:14 +00:00
core-lead left a comment
Member

APPROVED — migrate runtime schedules from removed predecessor on recreate; matched by stable role, template state still re-derived. +3 tests, build green.

core-be approved these changes 2026-05-29 09:09:15 +00:00
core-be left a comment
Member

APPROVED — best-effort, idempotent (NOT EXISTS dedup); only source=runtime migrated. Closes the recreate-orphan gap.

hongming merged commit ffbd1a7ff0 into main 2026-05-29 09:38:34 +00:00

Reference: molecule-ai/molecule-core#2008