ci(handlers-pg): apply all migrations with skip-on-error + sanity check (#320)

Previous workflow applied only 049_delegations.up.sql — fragile to
future migrations that touch the delegations table or any other
handlers/-tested table. Operator would have to remember to update
the workflow's psql -f line per migration.

New behavior: loop every .up.sql in lexicographic order, apply each
with ON_ERROR_STOP=1 + per-migration result captured. Failed migrations
are SKIPPED rather than blocking the suite — handles the historical
migrations (017_memories_fts_namespace, 042_a2a_queue, etc.) that
depend on tables since renamed/dropped and can't replay from scratch.
Migrations that DO succeed land their tables, which is sufficient for
the integration tests in handlers/.

Sanity gate at the end: if the delegations table is missing after the
replay, hard-fail with a loud error. That catches a real regression
where 049 itself becomes broken (e.g., schema rename), separate from
the historical-broken-migration noise above.

Per-migration log line ("✓" or "⊘ skipped") makes it easy to spot
when a migration that SHOULD have replayed didn't.

Verified locally: full migration chain runs, 049 lands, all 7
integration tests pass against the chained-migration DB.

Closes #320.
This commit is contained in:
Hongming Wang 2026-05-05 03:48:43 -07:00
parent 1e8d7ae17c
commit 90d202c80a

View File

@ -100,14 +100,48 @@ jobs:
if pg_isready -h localhost -p 5432 -U postgres -q; then break; fi
echo "waiting for postgres..."; sleep 2
done
# Apply only the delegations migration — it doesn't FK to
# workspaces, so we don't need to fold in the full migration
# chain (some earlier migrations have ordering dependencies on
# tables that have since been dropped). Per-test migrations
# under handlers/ should add their schema files to this list
# as needed.
psql -h localhost -U postgres -d molecule -v ON_ERROR_STOP=1 \
-f migrations/049_delegations.up.sql
# Apply every .up.sql in lexicographic order with
# ON_ERROR_STOP=0 — failing migrations are SKIPPED rather than
# blocking the suite. This handles the current schema state
# where a few historical migrations (e.g. 017_memories_fts_*)
# depend on tables that were later renamed/dropped and so
# cannot replay from scratch. The migrations that DO succeed
# land their tables, which is sufficient for the integration
# tests in handlers/.
#
# Why not maintain a curated allowlist: every new migration
# touching a handlers/-tested table would have to update this
# workflow. With apply-all-or-skip, a future migration that
# adds a column to delegations runs automatically (its base
# table 049_delegations.up.sql already succeeded above it in
# the order). Operators only need to revisit this if the
# migration chain becomes legitimately replayable end-to-end.
#
# Per-migration result is logged so a failed migration that
# SHOULD have been replayable surfaces in the CI log instead
# of silently failing.
set +e
for migration in migrations/*.up.sql; do
if psql -h localhost -U postgres -d molecule -v ON_ERROR_STOP=1 \
-f "$migration" >/dev/null 2>&1; then
echo "✓ $(basename "$migration")"
else
echo "⊘ $(basename "$migration") (skipped — see comment in workflow)"
fi
done
set -e
# Sanity: the delegations table MUST exist for the integration
# tests to be meaningful. Hard-fail if 049 didn't land — that
# would be a real regression we want loud.
if ! psql -h localhost -U postgres -d molecule -tA \
-c "SELECT 1 FROM information_schema.tables WHERE table_name = 'delegations'" \
| grep -q 1; then
echo "::error::delegations table missing after migration replay — handler integration tests would be meaningless"
exit 1
fi
echo "✓ delegations table present"
- if: needs.detect-changes.outputs.handlers == 'true'
name: Run integration tests