ci: reduce scheduled runner load and prep prebaked browsers #1628

Merged
hongming merged 1 commit from fix/ci-cron-bots-prebake-1357 into main 2026-05-21 03:17:12 +00:00
Member

SOP: dev-sop.md evidence-based change for molecule-core#1357. Companion operator-config PR: molecule-ai/operator-config#117.

Problem

  • CI is saturated by scheduled workflows and heavyweight runtime/browser setup.
  • An investigation of the last 24 hours showed continuous-synth-e2e using around 10 runner-hours/day, status-reaper using around 3.6 runner-hours/day, and browser/E2E PR lanes waiting a long time for runners.
  • Canvas CI was deleting package-lock.json and running npm install, defeating deterministic install/cache behavior.

Change

  • Remove Actions schedules from status-reaper and gitea-merge-queue; keep workflow_dispatch for manual fallback.
  • Lower continuous synth E2E cadence from every 10 minutes to every 30 minutes.
  • Make Playwright browser install steps prebake-aware: use /ms-playwright when present, otherwise fall back to install.
  • Replace Canvas npm install with npm ci --include=optional --prefer-offline.
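
A minimal sketch of the prebake-aware browser step described above, not the exact workflow code in this PR. The function names playwright_mode and ensure_playwright_browsers are hypothetical, and the choice of chromium in the fallback is an assumption. PLAYWRIGHT_BROWSERS_PATH and npx playwright install --with-deps are standard Playwright options.

```shell
# Prints "prebaked" when the given directory exists and is non-empty,
# otherwise prints "install". Defaults to /ms-playwright.
playwright_mode() {
  dir="${1:-/ms-playwright}"
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir" 2>/dev/null)" ]; then
    echo prebaked
  else
    echo install
  fi
}

# Hypothetical step body: reuse prebaked browsers when present,
# otherwise fall back to a normal install.
ensure_playwright_browsers() {
  dir="${1:-/ms-playwright}"
  if [ "$(playwright_mode "$dir")" = "prebaked" ]; then
    # Point Playwright at the prebaked browser directory.
    export PLAYWRIGHT_BROWSERS_PATH="$dir"
    echo "Using prebaked Playwright browsers in $dir"
  else
    # Assumption: the E2E lane only needs chromium.
    npx playwright install --with-deps chromium
  fi
}
```

A workflow step would call ensure_playwright_browsers before running the tests. Checking that the directory is non-empty, rather than merely present, keeps the fallback path active on runners where the directory exists but the browsers were never baked in.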

Verification

  • python3 .gitea/scripts/lint-workflow-yaml.py --workflow-dir .gitea/workflows
  • python3 -m pytest .gitea/scripts/tests/test_status_reaper_api.py .gitea/scripts/tests/test_gitea_merge_queue.py -q

Rollback

  • Revert this PR to restore the original workflow schedules and install commands.
  • If operator cron is already deployed, disable /etc/cron.d/molecule-core-* via the kill-switch files in operator-config#117 before restoring Actions schedules to avoid double-running bots.

Expected saving

  • status-reaper: about 3.6 runner-hours/day removed from Actions queue once operator cron is deployed.
  • continuous synth: roughly 6.7 runner-hours/day saved by moving 6 fires/hour to 2 fires/hour, based on the observed ~10 runner-hours/day 10-minute cadence.
  • Playwright prebake: expected 2-8 minutes saved per browser E2E job on runners with /ms-playwright, with fallback preserving current behavior.
  • Canvas npm ci: expected lower variance and usually 1-3 minutes saved versus deleting the lockfile and reinstalling.
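
The continuous synth estimate can be reproduced with a short calculation. This is a sketch that assumes runner-hours scale linearly with the number of fires per hour; the helper name cadence_saving is hypothetical.

```shell
# saved = observed * (1 - fires_after / fires_before)
# Arguments: observed runner-hours/day, fires/hour before, fires/hour after.
cadence_saving() {
  awk -v obs="$1" -v before="$2" -v after="$3" \
    'BEGIN { printf "%.1f\n", obs * (1 - after / before) }'
}

# Observed ~10 runner-hours/day at 6 fires/hour, moving to 2 fires/hour.
cadence_saving 10 6 2   # prints 6.7
```

Under the same linear assumption, the remaining load is about 3.3 runner-hours/day.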
core-devops added 1 commit 2026-05-21 00:42:18 +00:00
ci: reduce scheduled runner load and prep prebaked browsers
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 15s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
E2E Chat / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 4m30s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m37s
CI / Canvas (Next.js) (pull_request) Successful in 5m49s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m32s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 6s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m11s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m19s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 5s
gate-check-v3 / gate-check (pull_request) Successful in 5s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 3s
sop-checklist / review-refire (pull_request) Has been skipped
security-review / approved (pull_request) Failing after 4s
sop-tier-check / tier-check (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m20s
CI / Python Lint & Test (pull_request) Successful in 6m53s
CI / all-required (pull_request) Successful in 6m58s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2s
E2E Chat / E2E Chat (pull_request) Failing after 6m12s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9m31s
audit-force-merge / audit (pull_request) Successful in 6s
a9bc5e39d5
hongming approved these changes 2026-05-21 03:17:02 +00:00
hongming left a comment
Owner

Approved via gitea-token per Hongming explicit bypass request.

devops-engineer approved these changes 2026-05-21 03:17:03 +00:00
devops-engineer left a comment
Member

Approved via persona-devops-engineer-token per Hongming explicit bypass request.

hongming merged commit c58ffd2828 into main 2026-05-21 03:17:12 +00:00
Reference: molecule-ai/molecule-core#1628