forked from molecule-ai/molecule-core
Two CICD-review quick wins consolidated into one PR:
# 1. e2e-staging-saas now fires on staging, not just main
The full-lifecycle SaaS E2E was main-only, so it caught regressions
AFTER they shipped to staging (and into the auto-promote PR). Adding
`staging` to the push + pull_request branch list catches them BEFORE
the staging→main promotion opens, so the green signal that canary feeds
into auto-promote-staging is meaningfully more trustworthy.
paths-filter is unchanged, so the blast radius stays the same — only
provisioning-critical changes trigger the ~25-35 min run.
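For illustration, the trigger change might look roughly like the fragment below. This is a sketch, not the actual contents of e2e-staging-saas.yml: the `paths` entries are hypothetical placeholders, since the real filter is unchanged by this PR and is not reproduced in this message.

```yaml
# Sketch of the trigger block after this change (assumed layout).
on:
  push:
    branches: [main, staging]        # staging added alongside main
    paths:
      - "path/to/provisioning/**"    # hypothetical placeholder for the existing filter
  pull_request:
    branches: [main, staging]        # staging added alongside main
    paths:
      - "path/to/provisioning/**"    # hypothetical placeholder for the existing filter
```

Because the `paths` filter still gates both events, adding `staging` widens which branches can trigger the run without widening which changes do.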
# 2. Canary auto-issue thresholded at 3 consecutive failures
The 30-min canary was opening "🔴 Canary failing" issues on every
single failure and de-duping via title match. Transient flakes (CF DNS
hiccup, AWS API blip) generated noise.
Now: on a failure, look up the prior `THRESHOLD-1` runs of this
same workflow. Only file an issue when ALL of those also failed (i.e.
this is the 3rd consecutive red, ~90 min of sustained failure). If an
issue is already open we still comment per-failure so the streak is
visible.
Threshold rationale: canary fires every 30 min, so 3 reds = ~90 min
of sustained failure — past any single-run flake but well inside the
deploy window so a real outage still surfaces fast.
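The thresholding decision described above can be sketched as a small pure function. This is a minimal illustration, not the workflow's actual script: the function name `canary_action`, the return labels, the `"failure"` conclusion string, and the choice to check for an open issue before looking at history are all assumptions made for the example.

```python
from typing import Sequence

THRESHOLD = 3  # consecutive failures required before an issue is filed


def canary_action(
    prior_conclusions: Sequence[str],
    issue_open: bool,
    threshold: int = THRESHOLD,
) -> str:
    """Decide what to do after the current canary run has failed.

    prior_conclusions: conclusions of earlier runs of the same workflow,
    newest first (hypothetical input; "failure" marks a red run).
    """
    # An already-open issue gets a comment on every failure so the
    # streak stays visible.
    if issue_open:
        return "comment"

    # The current run is one failure; the prior THRESHOLD-1 runs must
    # all have failed too before a new issue is opened.
    needed = threshold - 1
    recent = list(prior_conclusions[:needed])
    if len(recent) == needed and all(c == "failure" for c in recent):
        return "open_issue"

    # Not enough consecutive reds yet: treat it as a possible flake.
    return "skip"
```

With the default threshold, `canary_action(["failure", "failure"], issue_open=False)` opens an issue, while a single green run among the prior two resets the streak and the failure is skipped.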
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
| File |
|---|
| auto-promote-staging.yml |
| auto-tag-runtime.yml |
| block-internal-paths.yml |
| canary-staging.yml |
| canary-verify.yml |
| check-merge-group-trigger.yml |
| ci.yml |
| codeql.yml |
| e2e-api.yml |
| e2e-staging-canvas.yml |
| e2e-staging-saas.yml |
| e2e-staging-sanity.yml |
| promote-latest.yml |
| publish-canvas-image.yml |
| publish-runtime.yml |
| publish-workspace-server-image.yml |
| redeploy-tenants-on-main.yml |
| retarget-main-to-staging.yml |
| runtime-pin-compat.yml |
| sweep-cf-orphans.yml |
| sweep-stale-e2e-orgs.yml |
| test-ops-scripts.yml |