molecule-core/scripts/ops/test_check_migration_collisions.py
Hongming Wang ea8ff626a9 ci: hard gate against migration version collisions (#2341)
Two PRs targeting staging can each add a migration with the same
numeric prefix (e.g. 044_*.up.sql). Each passes CI independently.
They collide at merge time. Worst case: second migration silently
doesn't apply and prod schema drifts from what the code expects.

Caught manually 2026-04-30 during PR #2276 rebase: 044_runtime_image_pins
collided with 044_platform_inbound_secret from RFC #2312. This workflow
makes that detection automatic at PR-open time.

How it works:
  scripts/ops/check_migration_collisions.py runs on every PR that
  touches workspace-server/migrations/**. For each new/modified
  migration filename, extracts the numeric prefix and checks:

  1. Does the base branch already have a DIFFERENT migration file with
     the same prefix? (PR branched off an old base, base advanced and
     another PR landed the same number — needs rebase.)

  2. Is another OPEN PR (not this one) also adding a migration with
     the same prefix? (Race-window collision — both pass CI separately,
     would collide at merge time.)

Either case → exit 1 with a clear ::error:: message naming the
conflicting PR(s) so the author knows what to renumber.
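
The two checks above can be sketched in pure Python. This is a minimal illustration, not the script's actual code: `parse`, `find_collisions`, the regex, and the message wording are all hypothetical, and treating an up/down pair as one migration is an assumption made here so that an in-place edit is not reported against its own sibling file.

```python
import re
from pathlib import Path

# Assumed pattern; the real one lives in check_migration_collisions.py.
MIGRATION_RE = re.compile(r"^(\d+)_[^/]+\.(up|down)\.sql$")


def parse(path):
    """Return (prefix, stem) for a migration path, or None for other files."""
    name = Path(path).name
    m = MIGRATION_RE.match(name)
    if m is None:
        return None
    stem = name[: m.start(2) - 1]  # drop the ".up.sql" / ".down.sql" suffix
    return int(m.group(1)), stem


def find_collisions(pr_files, base_files, other_prs):
    """Return one ::error:: message per colliding migration.

    pr_files   -- paths added or modified by this PR
    base_files -- paths present on the base branch
    other_prs  -- {pr_number: [paths touched by that open PR]}
    """
    errors = []
    mine = {p for p in map(parse, pr_files) if p}
    base = {p for p in map(parse, base_files) if p}
    for prefix, stem in sorted(mine):
        # Check 1: a different migration on base already uses this prefix.
        for b_prefix, b_stem in sorted(base):
            if b_prefix == prefix and b_stem != stem:
                errors.append(
                    f"::error::{stem} collides with {b_stem} on base; rebase and renumber"
                )
        # Check 2: another open PR is adding a different migration with this prefix.
        for number, files in sorted(other_prs.items()):
            theirs = {p for p in map(parse, files) if p}
            for o_prefix, o_stem in sorted(theirs):
                if o_prefix == prefix and o_stem != stem:
                    errors.append(
                        f"::error::{stem} collides with {o_stem} in PR #{number}"
                    )
    return errors


# The collision described above, replayed as check 1.
for message in find_collisions(
    ["workspace-server/migrations/044_runtime_image_pins.up.sql"],
    ["workspace-server/migrations/044_platform_inbound_secret.up.sql"],
    {},
):
    print(message)
```

A caller would exit 1 whenever the returned list is non-empty, matching the behaviour described above.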

Implementation notes:
  - Uses git ls-tree (not working-tree walk) so it works against any
    base ref without checkout.
  - Uses gh pr diff --name-only per open PR, bounded by `gh pr list
    --limit 100`. ~30s worst case for a busy repo, <5s normally.
  - --diff-filter=AM picks up Added or Modified — renaming a migration
    in place is also flagged (intentional; renaming migrations isn't
    safe).
  - Same filename in both PR and base = no collision (PR is editing
    in-place, fine).

Tests:
  scripts/ops/test_check_migration_collisions.py — 9 cases on the
  regex classifier (the load-bearing piece). End-to-end git/gh path
  is exercised by running the workflow against real PRs.

Hard-gates Tier 1 item 1 (#2341). Cheapest, cleanest gate. Catches
one specific class of merge-time foot-gun automatically.

Refs hard-gates discussion 2026-04-30. Tier 1 of 4 (others tracked
in #2342, #2343, #2344).
2026-04-29 21:42:42 -07:00


"""Unit tests for check_migration_collisions.py — focuses on the regex
classifier + the diff/base-set logic that runs without git.
The end-to-end git diff + gh pr list path is exercised manually (running
the workflow against test PRs). These tests pin the pure-logic surface
so a regression in migration-name parsing fails immediately at PR time.
"""
import importlib.util
from pathlib import Path

import pytest

# Load the script as a module without invoking main(). We import the
# regex + helpers directly so we can test them without setting up git.
SCRIPT_PATH = Path(__file__).parent / "check_migration_collisions.py"
spec = importlib.util.spec_from_file_location("ccm", SCRIPT_PATH)
ccm = importlib.util.module_from_spec(spec)
spec.loader.exec_module(ccm)


class TestMigrationFileRe:
    """The regex classifier: the load-bearing piece of the detector."""

    def test_matches_standard_three_digit_prefix(self):
        m = ccm.MIGRATION_FILE_RE.match("044_platform_inbound_secret.up.sql")
        assert m is not None
        assert int(m.group(1)) == 44
        assert m.group(2) == "up"

    def test_matches_down_migration(self):
        m = ccm.MIGRATION_FILE_RE.match("044_platform_inbound_secret.down.sql")
        assert m is not None
        assert int(m.group(1)) == 44
        assert m.group(2) == "down"

    def test_matches_date_shaped_prefix(self):
        # Real example from the repo: 20260417000000_workflow_checkpoints
        m = ccm.MIGRATION_FILE_RE.match("20260417000000_workflow_checkpoints.up.sql")
        assert m is not None
        assert int(m.group(1)) == 20260417000000

    def test_matches_long_compound_name(self):
        m = ccm.MIGRATION_FILE_RE.match("042_a2a_queue.up.sql")
        assert m is not None
        assert int(m.group(1)) == 42

    def test_rejects_no_prefix(self):
        assert ccm.MIGRATION_FILE_RE.match("readme.md") is None

    def test_rejects_alpha_prefix(self):
        assert ccm.MIGRATION_FILE_RE.match("abc_migration.up.sql") is None

    def test_rejects_wrong_extension(self):
        assert ccm.MIGRATION_FILE_RE.match("044_test.sql") is None
        assert ccm.MIGRATION_FILE_RE.match("044_test.up.txt") is None

    def test_rejects_path_separator(self):
        # Filename only: paths come pre-split via Path(line).name
        assert ccm.MIGRATION_FILE_RE.match("044/test.up.sql") is None

    def test_rejects_no_underscore(self):
        # Naming convention requires <digits>_<name>
        assert ccm.MIGRATION_FILE_RE.match("044.up.sql") is None
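
The script under test is not shown here. For orientation, one pattern that would satisfy all nine cases above is sketched below. It is an assumption reconstructed from the tests, not necessarily the pattern the script defines; only the group layout (group 1 is the numeric prefix, group 2 is `up` or `down`) is taken from the assertions.

```python
import re

# Hypothetical stand-in for ccm.MIGRATION_FILE_RE, derived from the tests only.
MIGRATION_FILE_RE = re.compile(
    r"^(\d+)"        # group 1: numeric prefix (044, 20260417000000, ...)
    r"_[^/]+"        # underscore, then a name with no path separator
    r"\.(up|down)"   # group 2: migration direction
    r"\.sql$"
)

# Filename -> expected (prefix, direction), or None when it must be rejected.
CASES = {
    "044_platform_inbound_secret.up.sql": (44, "up"),
    "044_platform_inbound_secret.down.sql": (44, "down"),
    "20260417000000_workflow_checkpoints.up.sql": (20260417000000, "up"),
    "042_a2a_queue.up.sql": (42, "up"),
    "readme.md": None,
    "abc_migration.up.sql": None,
    "044_test.sql": None,
    "044_test.up.txt": None,
    "044/test.up.sql": None,
    "044.up.sql": None,
}


def classify(name):
    """Return (prefix, direction) for a migration filename, else None."""
    m = MIGRATION_FILE_RE.match(name)
    return (int(m.group(1)), m.group(2)) if m else None


for name, expected in CASES.items():
    assert classify(name) == expected, name
```

Comparing prefixes as integers means `044` and `44` count as the same version, which mirrors the `int(m.group(1))` assertions in the tests.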