feat(ci): sop-checklist-gate — peer-ack merge gate (RFC#351 Phase 2) #688

Merged
core-devops merged 2 commits from feat/sop-checklist-gate-mvp into main 2026-05-12 07:03:50 +00:00

2 Commits

Author SHA1 Message Date
76988c05cd fix(ci): sop-checklist-gate exits 0 by default — POSTed status is the gate
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 27s
E2E API Smoke Test / detect-changes (pull_request) Successful in 27s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 22s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 25s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 13s
qa-review / approved (pull_request) Failing after 14s
security-review / approved (pull_request) Failing after 15s
CI / Platform (Go) (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 25s
sop-tier-check / tier-check (pull_request) Successful in 16s
CI / Canvas (Next.js) (pull_request) Successful in 7s
gate-check-v3 / gate-check (pull_request) Successful in 21s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 6s
CI / all-required (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Failing after 1m18s
audit-force-merge / audit (pull_request) Successful in 6s
By default the gate script now exits 0 in non-dry-run mode regardless of
ack state. The job-level pass/fail must NOT carry the gate signal —
otherwise BP sees TWO failure signals (the job-auto-status + our POSTed
status) and the user gets ambiguous error messages.

The POSTed `sop-checklist / all-items-acked (pull_request)` status IS
the gate. Job conclusion is informational.

Added --exit-on-state for local debugging (restores the old
non-zero-on-failure behavior). Default OFF — production behavior is
exit 0 always.

51/51 tests still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 06:13:58 +00:00
72df12ecef feat(ci): sop-checklist-gate — peer-ack merge gate (RFC#351 Phase 2)
All checks were successful
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 16s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 15s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 33s
CI / Detect changes (pull_request) Successful in 43s
E2E API Smoke Test / detect-changes (pull_request) Successful in 44s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 40s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 11s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 9s
CI / Python Lint & Test (pull_request) Successful in 15s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m25s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 6s
RFC#351 Step 2 of 6: implementation MVP of the SOP-checklist peer-ack
merge gate. NOT yet wired to branch protection (Phase 4 needs separate
authorization).

What:
- .gitea/sop-checklist-config.yaml — 7-item checklist with slug,
  numeric_alias (1..7), pr_section_marker, required_teams. Includes
  tier-aware failure-mode map: tier:high/medium=hard, tier:low=soft,
  default=hard (never silently lower the bar).
- .gitea/scripts/sop-checklist-gate.py — parses PR body + comments,
  computes per-item ack state, posts commit-status
  "sop-checklist / all-items-acked (pull_request)".
- .gitea/scripts/tests/test_sop_checklist_gate.py — 51 unit tests
  covering slug normalization, directive parsing, section-marker
  detection, ack-state computation (self-ack reject, revoke
  semantics, multi-user/multi-item, numeric aliases), tier-mode
  selection, and end-to-end happy path.
- .gitea/workflows/sop-checklist-gate.yml — pull_request_target
  [opened/edited/synchronize/reopened] + issue_comment
  [created/edited/deleted]. Checks out BASE ref only (trust boundary
  per RFC#324 §A4). Mirrors qa-review/security-review patterns.

Why:
Hongming 2026-05-12T05:42Z asked for SOP-enforcing CI/CD that requires
peer-ack on each checklist item before merge. Composes the existing
patterns (scripts-lint PR-body parser + RFC#324 persona-whitelist
commit-status + sop-tier-check tier-awareness) into one gate.

Slash-command contract:
  /sop-ack <slug> [note]      — register peer-ack (most-recent wins)
  /sop-revoke <slug> [reason] — invalidate own prior ack

Slug normalization accepts kebab-case, snake_case, natural-spaces,
or numeric 1..7 shorthand (all canonicalize to kebab-case via the
config-driven alias table).

Tests: 51/51 pass locally. Dry-run probe against PR#685 verified the
full pipeline (PR fetch, comment fetch, ack computation, status
description rendering inside the 140-char budget).

Not yet:
- Phase 3 (24h soak)
- Phase 4 (BP PATCH to require this context — needs Hongming GO)
- Phase 5 (cross-repo)
- Phase 6 (dev-sop.md codification)
- SOP_CHECKLIST_GATE_TOKEN secret provisioning (separate follow-up;
  fail-closed until provisioned, same as RFC_324_TEAM_READ_TOKEN
  pattern in qa-review.yml).

Cross-links:
- internal#351 (RFC body)
- RFC#324 (qa-review/security-review — reused mechanism)
- internal#346 (dev-sop.md SOP-14..SOP-20 — sibling rules)
- feedback_pull_request_review_no_refire (why issue_comment trigger)
- feedback_checkpointed_workflow_over_good_practice_doc (motivation)
- feedback_fix_root_not_symptom (default-mode=hard rationale)

## What
Add a SOP-checklist peer-ack merge gate: workflow + script + config + 51 unit tests.

## Why
Hongming-requested mechanism to enforce SOP via CI/CD: each PR checklist
item must be peer-acked before merge, with team-membership-verified
ackers and tier-aware failure mode.

## Verification
- 51/51 unit tests pass (slug normalization, parse_directives, section
  marker detection, ack-state including self-ack rejection + revoke
  semantics, tier-mode mapping, end-to-end happy path).
- YAML lint clean (yaml.safe_load + lint-workflow-yaml.py on the new
  workflow — pre-existing fatals on unrelated files only).
- Python syntax clean (py_compile).
- Dry-run against live PR#685: PR fetch, comment enumeration, status
  description render all within 140-char budget — works end-to-end.

## Tier
tier:medium — net-new CI workflow; no production impact; no BP change
yet (Phase 4 separate auth).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-12 06:08:36 +00:00