molecule-core


Hongming Wang	c01f057e6b	2026-04-26 12:02:52 -07:00

ci: shift e2e-staging-saas to staging + threshold canary auto-issue at 3 reds

Two CICD-review quick wins consolidated into one PR:

# 1. e2e-staging-saas now fires on staging, not just main

The full-lifecycle SaaS E2E was main-only, so it caught regressions AFTER they shipped to staging (and into the auto-promote PR). Adding `staging` to the push + pull_request branch list catches them BEFORE the staging→main promotion opens, making canary's green into auto-promote-staging meaningfully more trustworthy. paths-filter is unchanged, so the blast radius stays the same: only provisioning-critical changes trigger the ~25-35 min run.

# 2. Canary auto-issue thresholded at 3 consecutive failures

The 30-min canary was opening "🔴 Canary failing" issues on every single failure and de-duping via title match. Transient flakes (CF DNS hiccup, AWS API blip) generated noise. Now: on first failure, look up the prior `THRESHOLD-1` runs of this same workflow. Only file an issue when ALL of those also failed (i.e. this is the 3rd consecutive red, ~90 min of sustained failure). If an issue is already open we still comment per-failure so the streak is visible.

Threshold rationale: canary fires every 30 min, so 3 reds = ~90 min of sustained failure, past any single-run flake but well inside the deploy window so a real outage still surfaces fast.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
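The thresholding rule in the commit message above can be sketched as a small decision function. This is an illustrative Python sketch only, not the code in canary-staging.yml. The names `should_open_issue` and `action_on_failure`, the `"failure"` conclusion string, and the choice to stay quiet when fewer than `THRESHOLD-1` prior runs exist are all assumptions made for the example.

```python
from typing import Sequence

# Consecutive red runs required before an issue is filed (3 per the commit message).
THRESHOLD = 3


def should_open_issue(
    current_failed: bool,
    prior_conclusions: Sequence[str],
    threshold: int = THRESHOLD,
) -> bool:
    """Return True only when the current run is the `threshold`-th consecutive failure.

    `prior_conclusions` lists the conclusions of earlier runs of the same
    workflow, most recent first (hypothetical input shape).
    """
    if not current_failed:
        return False
    needed = threshold - 1
    recent = list(prior_conclusions[:needed])
    # Assumption: with too little history, do not file an issue yet.
    if len(recent) < needed:
        return False
    return all(conclusion == "failure" for conclusion in recent)


def action_on_failure(
    issue_already_open: bool,
    prior_conclusions: Sequence[str],
    threshold: int = THRESHOLD,
) -> str:
    """Decide what a failed canary run does: 'comment', 'open_issue', or 'none'."""
    if issue_already_open:
        # An open issue still gets a per-failure comment so the streak stays visible.
        return "comment"
    if should_open_issue(True, prior_conclusions, threshold):
        return "open_issue"
    return "none"
```

With this shape, a single flake after a green run yields `"none"`, while a failure preceded by two failed runs yields `"open_issue"`.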
auto-promote-staging.yml	ci: canary-verify graceful-skip + draft auto-promote staging→main	2026-04-22 22:39:23 +00:00
auto-tag-runtime.yml	feat(platform/admin): /admin/workspace-images/refresh + Docker SDK + GHCR auth	2026-04-26 10:17:21 -07:00
block-internal-paths.yml	ci(block-paths): fetch PR base SHA to fix shallow-clone diff failure	2026-04-24 12:01:53 +00:00
canary-staging.yml	ci: shift e2e-staging-saas to staging + threshold canary auto-issue at 3 reds	2026-04-26 12:02:52 -07:00
canary-verify.yml	ci: canary-verify graceful-skip + draft auto-promote staging→main	2026-04-22 22:39:23 +00:00
check-merge-group-trigger.yml	ci: add linter that fails when required workflow lacks merge_group trigger	2026-04-24 00:33:05 -07:00
ci.yml	test(workspace): centralize pytest-cov config + 92% floor (closes #1817)	2026-04-26 06:21:22 -07:00
codeql.yml	ci: add merge_group trigger to ci + codeql	2026-04-23 21:24:53 -07:00
e2e-api.yml	feat(ci): run E2E API smoke test on staging branch	2026-04-23 17:47:47 -07:00
e2e-staging-canvas.yml	feat(ci): run E2E Staging Canvas on staging branch pushes	2026-04-23 17:47:51 -07:00
e2e-staging-saas.yml	ci: shift e2e-staging-saas to staging + threshold canary auto-issue at 3 reds	2026-04-26 12:02:52 -07:00
e2e-staging-sanity.yml	fix(e2e): CP DELETE /cp/admin/tenants body uses 'confirm', not 'confirm_token'	2026-04-21 04:50:28 -07:00
promote-latest.yml	perf(ci): move all public-repo workflows to ubuntu-latest	2026-04-22 12:56:49 -07:00
publish-canvas-image.yml	perf(ci): move all public-repo workflows to ubuntu-latest	2026-04-22 12:56:49 -07:00
publish-runtime.yml	feat(platform/admin): /admin/workspace-images/refresh + Docker SDK + GHCR auth	2026-04-26 10:17:21 -07:00
publish-workspace-server-image.yml	ci(publish-image): also tag :staging-latest so CP auto-picks up new builds	2026-04-24 00:29:55 -07:00
redeploy-tenants-on-main.yml	ci(redeploy): fire post-main tenant fleet redeploy via CP admin endpoint	2026-04-24 14:34:28 -07:00
retarget-main-to-staging.yml	ci(retarget): handle 422 'duplicate PR' by closing redundant main-PR (closes #1884)	2026-04-26 00:53:55 -07:00
runtime-pin-compat.yml	fix(ci): set WORKSPACE_ID for the runtime-pin smoke import	2026-04-26 01:59:56 -07:00
sweep-cf-orphans.yml	fix(ci): stop sweep-cf-orphans noise — drop merge_group + soft-skip when secrets unset	2026-04-26 08:05:53 -07:00
sweep-stale-e2e-orgs.yml	ci: hourly sweep of stale e2e-* orgs on staging	2026-04-24 23:07:57 -07:00
test-ops-scripts.yml	refactor(ops): apply simplify findings on #2027 PR	2026-04-26 00:28:15 -07:00