molecule-core/.github/workflows
Hongming Wang d45241cae7 fix(ci): distinguish unreachable from stale in /buildinfo verify step
The /buildinfo verify step (PR #2398) was treating "no /buildinfo response"
the same as "tenant returned wrong SHA" — both bumped MISMATCH_COUNT and
hard-failed the workflow. First post-merge run on staging caught a real
edge case: ephemeral E2E tenants (slug e2e-20260430-...) get torn down by
the E2E teardown trap between CP's healthz_ok snapshot and the verify step
running, so the verify step would dial into DNS that no longer resolves
and hard-fail on a benign condition.

The bug class we actually care about is STALE (tenant up + serving old
code, the #2395 root). UNREACHABLE post-redeploy is almost always a benign
teardown race; real "tenant up but unreachable" is caught by CP's own
healthz monitor + the alert pipeline, so double-counting it here was
making this workflow flaky on every staging push that overlapped E2E.

Wire:
  - Split MISMATCH_COUNT into STALE_COUNT + UNREACHABLE_COUNT.
  - STALE → hard-fail the workflow (the bug class we're guarding).
  - UNREACHABLE → :⚠️:, don't fail. Reachable-mismatch still hard-fails.
  - Job summary surfaces both lists separately so on-call can tell at a
    glance which class fired.

Mirror in redeploy-tenants-on-main.yml for shape parity (prod has fewer
ephemeral tenants but identical asymmetry would be a gratuitous fork).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 11:25:46 -07:00
..
auto-promote-on-e2e.yml fix(ci): handle empty E2E lookup in auto-promote-on-e2e gate 2026-04-30 10:07:52 -07:00
auto-promote-staging.yml ci(auto-promote): dispatch publish via molecule-ai App token to unblock workflow_run chain 2026-04-30 08:55:49 -07:00
auto-sync-main-to-staging.yml fix(ci): auto-sync opens a PR + uses merge queue, not direct push 2026-04-28 15:59:26 -07:00
auto-tag-runtime.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
block-internal-paths.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
canary-staging.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
canary-verify.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
check-merge-group-trigger.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
check-migration-collisions.yml fix(ci): drop --depth=1 from migration collision check fetch 2026-04-30 05:28:03 -07:00
ci.yml ci: collapse all 4 path-filtered required checks to single-job-with-conditional-steps 2026-04-29 16:09:22 -07:00
codeql.yml chore(deps): batch dep bumps — 6 safe upgrades (4 actions majors + 2 npm dev deps) 2026-04-28 17:44:55 -07:00
continuous-synth-e2e.yml ci: continuous synthetic E2E against staging (#2342) 2026-04-29 22:04:57 -07:00
e2e-api.yml test(e2e): poll-mode + since_id cursor round-trip (#2339 PR 4) 2026-04-29 23:07:10 -07:00
e2e-staging-canvas.yml fix(e2e-canvas): kill teardown race that poisons concurrent runs 2026-04-29 19:23:56 -07:00
e2e-staging-saas.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
e2e-staging-sanity.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
pr-guards.yml ci: add pr-guards caller that disables auto-merge on push 2026-04-27 06:39:31 -07:00
promote-latest.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
publish-canvas-image.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
publish-runtime.yml Merge pull request #2249 from Molecule-AI/fix/publish-runtime-cascade-hard-fail-on-push 2026-04-29 01:33:10 +00:00
publish-workspace-server-image.yml feat(deploy): verify each tenant /buildinfo matches published SHA after redeploy 2026-04-30 10:55:08 -07:00
railway-pin-audit.yml ci: daily Railway pin-audit cron + issue-on-failure (#2169) 2026-04-29 17:43:01 -07:00
redeploy-tenants-on-main.yml fix(ci): distinguish unreachable from stale in /buildinfo verify step 2026-04-30 11:25:46 -07:00
redeploy-tenants-on-staging.yml fix(ci): distinguish unreachable from stale in /buildinfo verify step 2026-04-30 11:25:46 -07:00
retarget-main-to-staging.yml ci(retarget): handle 422 'duplicate PR' by closing redundant main-PR (closes #1884) 2026-04-26 00:53:55 -07:00
runtime-pin-compat.yml chore(deps): batch dep bumps — 6 safe upgrades (4 actions majors + 2 npm dev deps) 2026-04-28 17:44:55 -07:00
runtime-prbuild-compat.yml chore(deps): batch dep bumps — 6 safe upgrades (4 actions majors + 2 npm dev deps) 2026-04-28 17:44:55 -07:00
secret-pattern-drift.yml chore(deps): batch dep bumps — 6 safe upgrades (4 actions majors + 2 npm dev deps) 2026-04-28 17:44:55 -07:00
secret-scan.yml chore(security): pin Actions to SHAs + enable Dependabot auto-bumps 2026-04-28 15:37:06 -07:00
sweep-cf-orphans.yml Merge pull request #2248 from Molecule-AI/fix/sweep-cf-orphans-hard-fail-on-schedule 2026-04-29 01:16:22 +00:00
sweep-cf-tunnels.yml feat(ops): add sweep-cf-tunnels janitor — orphan Cloudflare Tunnels accumulate 2026-04-29 19:42:47 -07:00
sweep-stale-e2e-orgs.yml ci: hourly sweep of stale e2e-* orgs on staging 2026-04-24 23:07:57 -07:00
test-ops-scripts.yml chore(deps): batch dep bumps — 6 safe upgrades (4 actions majors + 2 npm dev deps) 2026-04-28 17:44:55 -07:00