From e075557b19253441c21666c1bfba91013620c83f Mon Sep 17 00:00:00 2001
From: devops-engineer
Date: Thu, 7 May 2026 15:29:26 -0700
Subject: [PATCH 1/2] fix(ci): replace gh pr CLI with Gitea v1 REST in workflows + scripts (#75 class A)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Part of the post-#66 sweep to remove `gh` CLI dependencies that fail
silently against Gitea (which exposes /api/v1 only — no GraphQL → 405,
no /api/v3 → 404). Class A covers `gh pr list / view / diff / comment`
shapes.

Affected:

- `.github/workflows/auto-tag-runtime.yml`
  Replaced `gh pr list --search SHA --json number,labels` with a curl
  to `/api/v1/repos/.../pulls?state=closed&sort=newest&limit=50` + jq
  filter on `merge_commit_sha == github.sha`. Same end-to-end
  behaviour: locate the merged PR for this push, read its labels, pick
  the bump kind. Defensive `?.name // empty` jq guard handles
  unlabelled PRs without erroring. The 50-PR window is comfortably
  larger than the volume of staging→main promotes that close in any
  reasonable detection window.
- `scripts/check-stale-promote-pr.sh`
  Rewrote `fetch_prs` and `post_comment` to call Gitea's REST API
  directly. Gitea doesn't expose GitHub's compound `mergeStateStatus` /
  `reviewDecision` fields, so the new fetcher pulls
  `/pulls?state=open&base=main` then for each PR pulls
  `/pulls/{n}/reviews` and synthesizes the GitHub-shape JSON the rest
  of the script (and the existing fixture-based unit tests) consume:

      BLOCKED + REVIEW_REQUIRED  ↔  mergeable=true AND 0 APPROVED reviews
      DIRTY                      ↔  mergeable=false (alarm doesn't fire)
      CLEAN + APPROVED           ↔  mergeable=true AND ≥1 APPROVED review

  Comment-posting moves to `POST /repos/.../issues/{n}/comments` (Gitea
  treats PRs as issues for the comment surface, same as GitHub's REST).
  All 23 fixture-driven unit tests still pass — fixtures pass
  GitHub-shape JSON via PR_FIXTURE which short-circuits the live fetch
  path.
- `scripts/ops/check_migration_collisions.py`
  Replaced `gh pr list` + `gh pr diff` calls with stdlib `urllib`
  against /api/v1. Helper `_gitea_get` centralizes auth + error
  handling; uses GITEA_TOKEN env, falling back to GITHUB_TOKEN
  (act_runner) and GH_TOKEN. Return shape from
  `open_prs_with_migration_prefix` mimics the historical
  `--json number,headRefName` so the call sites are unchanged. All 9
  regex-classifier unit tests still pass; live integration test against
  the production Gitea API returns 0 collisions for prefix=999 as
  expected.

curl invocation pattern is `curl --fail-with-body -sS` (NOT `-fsS` — the
two short-fail flags are mutually exclusive in modern curl; caught by
`curl: You must select either --fail or --fail-with-body, not both`
during local verification).

Token model: workflows pass act_runner's GITHUB_TOKEN (per-run, repo
read scope) — same surface used by the auto-sync fix in PR #66 plus the
surrounding workflows. No new repo secrets required.

Verification: bash unit tests (23/23 pass), python unittest (9/9 pass),
live curl call against production Gitea returns 200 with the expected
shape, YAML / shell / Python syntax all validate.

Closes part of #75. Other classes (D — `gh api`; F — `gh run list`)
land in follow-up PRs.

Co-Authored-By: Claude Opus 4.7 (1M context)
---
 .github/workflows/auto-tag-runtime.yml    |  37 ++++-
 scripts/check-stale-promote-pr.sh         | 157 ++++++++++++++++++++--
 scripts/ops/check_migration_collisions.py | 120 ++++++++++++++---
 3 files changed, 277 insertions(+), 37 deletions(-)

diff --git a/.github/workflows/auto-tag-runtime.yml b/.github/workflows/auto-tag-runtime.yml
index ef9c19af..5ba8257d 100644
--- a/.github/workflows/auto-tag-runtime.yml
+++ b/.github/workflows/auto-tag-runtime.yml
@@ -57,17 +57,42 @@ jobs:
         id: bump
         if: steps.skip.outputs.skip != 'true'
         env:
-          GH_TOKEN: ${{ github.token }}
+          # Gitea-shape token (act_runner forwards GITHUB_TOKEN as a
+          # short-lived per-run secret with read access to this repo).
+          # We hit `/api/v1/repos/.../pulls?state=closed` directly
+          # because `gh pr list` calls Gitea's GraphQL endpoint, which
+          # returns HTTP 405 (issue #75 / post-#66 sweep).
+          GITEA_TOKEN: ${{ github.token }}
+          REPO: ${{ github.repository }}
+          GITEA_API_URL: ${{ github.server_url }}/api/v1
+          PUSH_SHA: ${{ github.sha }}
         run: |
-          # The merged PR for this push commit. `gh pr list --search` finds
-          # closed PRs whose merge commit matches; we take the first.
-          PR=$(gh pr list --state merged --search "${{ github.sha }}" --json number,labels --jq '.[0]' 2>/dev/null || echo "")
+          # Find the merged PR whose merge_commit_sha matches this push.
+          # Gitea's `/repos/{owner}/{repo}/pulls?state=closed` returns
+          # PRs sorted newest-first; we fetch one page of up to 50 and
+          # jq-filter on `merge_commit_sha == PUSH_SHA`. Bounded — auto-tag
+          # fires per push to main, so the matching PR is always among the
+          # most recent closures. 50 is comfortably more than the
+          # ~10-20 staging→main promotes that close in any reasonable
+          # window.
+          set -euo pipefail
+          PRS_JSON=$(curl --fail-with-body -sS \
+            -H "Authorization: token ${GITEA_TOKEN}" \
+            -H "Accept: application/json" \
+            "${GITEA_API_URL}/repos/${REPO}/pulls?state=closed&sort=newest&limit=50" \
+            2>/dev/null || echo "[]")
+          PR=$(printf '%s' "$PRS_JSON" \
+            | jq -c --arg sha "$PUSH_SHA" \
+              '[.[] | select(.merged_at != null and .merge_commit_sha == $sha)] | .[0] // empty')
           if [ -z "$PR" ] || [ "$PR" = "null" ]; then
-            echo "No merged PR found for ${{ github.sha }} — defaulting to patch bump."
+            echo "No merged PR found for ${PUSH_SHA} — defaulting to patch bump."
             echo "kind=patch" >> "$GITHUB_OUTPUT"
             exit 0
           fi
-          LABELS=$(echo "$PR" | jq -r '.labels[].name')
+          # Gitea returns labels under `.labels[].name`, same shape as
+          # GitHub's REST. The previous `gh pr list --json number,labels`
+          # output was identical; the jq filter only adds a defensive guard.
+          LABELS=$(printf '%s' "$PR" | jq -r '.labels[]?.name // empty')
           if echo "$LABELS" | grep -qx 'release:major'; then
             echo "kind=major" >> "$GITHUB_OUTPUT"
           elif echo "$LABELS" | grep -qx 'release:minor'; then
diff --git a/scripts/check-stale-promote-pr.sh b/scripts/check-stale-promote-pr.sh
index bcc5afe6..e4b7921c 100755
--- a/scripts/check-stale-promote-pr.sh
+++ b/scripts/check-stale-promote-pr.sh
@@ -17,12 +17,23 @@
 #
 # Used by .github/workflows/auto-promote-stale-alarm.yml. Logic lives
 # here (not inline in the workflow YAML) so we can:
-#   - Unit-test it with a stubbed `gh` (see test-check-stale-promote-pr.sh)
+#   - Unit-test it with a fixture (see test-check-stale-promote-pr.sh)
 #   - Run it ad-hoc by an operator: `scripts/check-stale-promote-pr.sh`
 #   - Reuse the same surface in any sibling workflow that needs the same
 #     check (SSOT — one detector, many callers).
 #
-# Requires: `gh` CLI, `jq`. `GH_TOKEN` env in the workflow context.
+# Requires: `curl`, `jq`. `GITEA_TOKEN` (or `GITHUB_TOKEN` / `GH_TOKEN`
+# for back-compat) in the workflow context. Reads `GITHUB_SERVER_URL`
+# / `GITEA_API_URL` for the Gitea base, defaulting to
+# https://git.moleculesai.app/api/v1.
+#
+# Post-2026-05-06 (Gitea migration, issue #75): the previous version
+# called `gh pr list/view/comment`, all of which hit GitHub.com's
+# GraphQL or /api/v3 REST shapes. Gitea exposes /api/v1/ only (no
+# GraphQL → 405, no /api/v3 → 404). So this script now talks to the
+# Gitea v1 API directly via curl. The fixture-driven unit tests are
+# unchanged — they bypass the live fetch via PR_FIXTURE and still pass
+# the historical (GitHub-shape) JSON which `detect_stale` consumes.
 
 set -euo pipefail
 
@@ -36,14 +47,15 @@ set -euo pipefail
 # alarming. Override via env for tests + edge ops.
 STALE_HOURS="${STALE_HOURS:-4}"
 
-# Repo defaults to the current `gh` context. Tests pass --repo explicitly.
+# Repo defaults to GITHUB_REPOSITORY (act_runner sets this in workflow
+# context). Tests pass --repo explicitly.
 REPO="${GITHUB_REPOSITORY:-}"
 
 # Whether to post a comment to the PR. Off by default to avoid noise on
 # manual ad-hoc runs; the cron workflow turns it on.
 POST_COMMENT="${POST_COMMENT:-false}"
 
-# Where to read the open-PR JSON from. Empty = call `gh` live. Tests
+# Where to read the open-PR JSON from. Empty = call Gitea live. Tests
 # point this at a fixture file.
 PR_FIXTURE="${PR_FIXTURE:-}"
 
@@ -51,6 +63,17 @@ PR_FIXTURE="${PR_FIXTURE:-}"
 # the staleness math is deterministic.
 NOW_OVERRIDE="${NOW_OVERRIDE:-}"
 
+# Gitea API base. act_runner forwards github.server_url as
+# GITHUB_SERVER_URL; for the molecule-ai fleet that's
+# https://git.moleculesai.app. Append /api/v1 to get the REST root.
+# Override directly via GITEA_API_URL for tests / non-default hosts.
+GITEA_API_URL="${GITEA_API_URL:-${GITHUB_SERVER_URL:-https://git.moleculesai.app}/api/v1}"
+
+# Token. Workflow context sets GITHUB_TOKEN; we accept GITEA_TOKEN as
+# the explicit name and GH_TOKEN for back-compat with operator habits
+# from the GitHub era. First non-empty wins.
+GITEA_TOKEN="${GITEA_TOKEN:-${GITHUB_TOKEN:-${GH_TOKEN:-}}}"
+
 while [ $# -gt 0 ]; do
   case "$1" in
     --repo) REPO="$2"; shift 2 ;;
@@ -83,7 +106,7 @@ now_epoch() {
   fi
 }
 
-# Parse RFC3339 timestamps the way GitHub emits them (e.g.
+# Parse RFC3339 timestamps the way Gitea / GitHub emit them (e.g.
 # "2026-05-05T23:15:00Z"). gnu-date uses -d, bsd-date uses -j -f. Cover
 # both because the workflow runs on ubuntu-latest (gnu) but operators
 # may run this script on macOS (bsd).
@@ -106,14 +129,100 @@ to_epoch() {
 # Fetch open auto-promote PRs
 # -----------------------------------------------------------------------------
 
+# Gitea v1 returns PRs with the canonical Gitea shape (number, title,
+# created_at, html_url, mergeable, state). The previous GitHub-CLI
+# version returned a derived `mergeStateStatus` / `reviewDecision`
+# pair which only GitHub computes — Gitea doesn't expose them
+# natively. Rebuild equivalents:
+#
+#   mergeStateStatus = BLOCKED        ↔  Gitea: state==open AND mergeable==true
+#                                        AND no APPROVED review yet
+#                                        (i.e. branch protection is gating
+#                                        the auto-merge pending an approval)
+#   reviewDecision   = REVIEW_REQUIRED ↔  Gitea: 0 APPROVED reviews
+#
+# This mirrors the SAME silent-block failure mode the GitHub version
+# detected: auto-merge armed, branch protection requires 1 review,
+# nobody's approved yet.
+#
+# Implementation: pull the open PR list base=main, then for each PR
+# pull /pulls/{n}/reviews and synthesize the GitHub-shape JSON the
+# rest of the script + the test fixtures consume.
 fetch_prs() {
   if [ -n "$PR_FIXTURE" ]; then
     cat "$PR_FIXTURE"
     return 0
   fi
-  gh pr list --repo "$REPO" \
-    --base main --head staging --state open \
-    --json number,title,createdAt,mergeStateStatus,reviewDecision,url
+  if [ -z "$GITEA_TOKEN" ]; then
+    echo "::error::GITEA_TOKEN / GITHUB_TOKEN unset — cannot fetch PRs from $GITEA_API_URL" >&2
+    return 1
+  fi
+  local prs_json
+  prs_json="$(curl --fail-with-body -sS \
+    -H "Authorization: token ${GITEA_TOKEN}" \
+    -H "Accept: application/json" \
+    "${GITEA_API_URL}/repos/${REPO}/pulls?state=open&base=main&limit=50" \
+    2>/dev/null)" || {
+    echo "::error::Failed to fetch PRs from ${GITEA_API_URL}/repos/${REPO}/pulls" >&2
+    return 1
+  }
+
+  # Filter to head=staging (the auto-promote shape) and synthesize
+  # mergeStateStatus + reviewDecision per PR. Approval count via
+  # /pulls/{n}/reviews. Errors fall through to 0-approvals (treated
+  # as REVIEW_REQUIRED) preserving the existing "fail-safe — alarm if
+  # uncertain" semantic.
+  local synthesized="[]"
+  while IFS= read -r pr; do
+    [ -z "$pr" ] && continue
+    [ "$pr" = "null" ] && continue
+    local num
+    num="$(printf '%s' "$pr" | jq -r '.number')"
+    [ -z "$num" ] && continue
+    [ "$num" = "null" ] && continue
+    local approved_count
+    approved_count="$(curl --fail-with-body -sS \
+      -H "Authorization: token ${GITEA_TOKEN}" \
+      -H "Accept: application/json" \
+      "${GITEA_API_URL}/repos/${REPO}/pulls/${num}/reviews" 2>/dev/null \
+      | jq '[.[] | select(.state == "APPROVED" and (.dismissed // false) == false)] | length' \
+      2>/dev/null || echo 0)"
+    local mergeable
+    mergeable="$(printf '%s' "$pr" | jq -r '.mergeable')"
+    local merge_state="UNKNOWN"
+    local review_decision="REVIEW_REQUIRED"
+    if [ "$mergeable" = "true" ]; then
+      if [ "$approved_count" -ge 1 ]; then
+        merge_state="CLEAN"
+        review_decision="APPROVED"
+      else
+        # mergeable but no approving review — exactly the wedge state
+        # the alarm targets.
+        merge_state="BLOCKED"
+        review_decision="REVIEW_REQUIRED"
+      fi
+    else
+      # not mergeable (conflicts, behind, failed checks) — different
+      # failure mode, the author owns the fix; the alarm doesn't fire.
+      merge_state="DIRTY"
+      review_decision="REVIEW_REQUIRED"
+    fi
+    synthesized="$(printf '%s' "$synthesized" \
+      | jq -c --argjson pr "$pr" \
+              --arg ms "$merge_state" \
+              --arg rd "$review_decision" \
+        '. + [{
+          number: $pr.number,
+          title: $pr.title,
+          createdAt: $pr.created_at,
+          mergeStateStatus: $ms,
+          reviewDecision: $rd,
+          url: $pr.html_url
+        }]')"
+  done < <(printf '%s' "$prs_json" \
+    | jq -c '.[] | select(.head.ref == "staging")' 2>/dev/null)
+
+  printf '%s\n' "$synthesized"
 }
 
 # -----------------------------------------------------------------------------
@@ -171,18 +280,40 @@ post_comment() {
   if [ "$POST_COMMENT" != "true" ]; then
     return 0
   fi
+  if [ -z "$GITEA_TOKEN" ]; then
+    echo "::warning::GITEA_TOKEN unset — cannot post stale-alarm comment on PR #$pr_num" >&2
+    return 0
+  fi
   # Idempotency: only one alarm comment per PR. Look for the marker
-  # string in existing comments before posting a new one.
+  # string in existing comments before posting a new one. Gitea's
+  # /repos/{owner}/{repo}/issues/{n}/comments returns the same shape
+  # for issues + PRs (PRs are issues internally on Gitea, same as
+  # GitHub's REST).
   local existing
-  existing="$(gh pr view "$pr_num" --repo "$REPO" --json comments \
-    --jq '.comments[] | select(.body | test("scripts/check-stale-promote-pr.sh per issue #2975")) | .databaseId' \
+  existing="$(curl --fail-with-body -sS \
+    -H "Authorization: token ${GITEA_TOKEN}" \
+    -H "Accept: application/json" \
+    "${GITEA_API_URL}/repos/${REPO}/issues/${pr_num}/comments?limit=50" 2>/dev/null \
+    | jq -r '.[] | select(.body | test("scripts/check-stale-promote-pr.sh per issue #2975")) | .id' \
     | head -n1)"
   if [ -n "$existing" ]; then
     echo "::notice::PR #$pr_num already has a stale-alarm comment ($existing) — not re-posting"
     return 0
   fi
-  comment_body "$age_h" | gh pr comment "$pr_num" --repo "$REPO" --body-file -
-  echo "::notice::Posted stale-alarm comment on PR #$pr_num (age=${age_h}h)"
+  local body
+  body="$(comment_body "$age_h")"
+  if curl --fail-with-body -sS \
+    -X POST \
+    -H "Authorization: token ${GITEA_TOKEN}" \
+    -H "Accept: application/json" \
+    -H "Content-Type: application/json" \
+    "${GITEA_API_URL}/repos/${REPO}/issues/${pr_num}/comments" \
+    -d "$(jq -nc --arg b "$body" '{body: $b}')" \
+    >/dev/null 2>&1; then
+    echo "::notice::Posted stale-alarm comment on PR #$pr_num (age=${age_h}h)"
+  else
+    echo "::warning::Failed to POST stale-alarm comment on PR #$pr_num" >&2
+  fi
 }
 
 # -----------------------------------------------------------------------------
diff --git a/scripts/ops/check_migration_collisions.py b/scripts/ops/check_migration_collisions.py
index 28901505..f98eb26a 100755
--- a/scripts/ops/check_migration_collisions.py
+++ b/scripts/ops/check_migration_collisions.py
@@ -19,9 +19,15 @@ Exit codes:
   0 — no collisions
   1 — collision detected; output names the conflicting PR(s) for the author
 
-Designed to run from a GitHub Actions PR check. Reads PR metadata via the
-GitHub CLI (gh) which is preinstalled on ubuntu-latest runners. Runs in
-under 10s against a typical PR.
+Designed to run from a Gitea Actions PR check. Reads PR metadata via direct
+HTTP calls to Gitea's REST API (`/api/v1/`), which on the molecule-ai fleet
+lives at https://git.moleculesai.app. Runs in under 10s against a typical PR.
+
+Post-2026-05-06 (Gitea migration, issue #75): the previous version called
+the GitHub CLI (``gh pr list``, ``gh pr diff``). On Gitea those calls hit
+either the GraphQL endpoint (HTTP 405) or /api/v3 (HTTP 404). This module
+now talks to /api/v1 directly via urllib so it works against any Gitea
+host without a `gh` install or extra dependencies.
 """
 
 from __future__ import annotations
@@ -31,12 +37,70 @@ import os
 import re
 import subprocess
 import sys
+import urllib.error
+import urllib.parse
+import urllib.request
 from pathlib import Path
 
 MIGRATIONS_DIR = "workspace-server/migrations"
 MIGRATION_FILE_RE = re.compile(r"^(\d+)_[^/]+\.(up|down)\.sql$")
 
 
+def _gitea_api_url() -> str:
+    """Resolve the Gitea API base URL.
+
+    act_runner forwards github.server_url as GITHUB_SERVER_URL; for the
+    molecule-ai fleet that's https://git.moleculesai.app. Append /api/v1
+    to get the REST root. Override directly via GITEA_API_URL for tests
+    or non-default hosts.
+    """
+    env_override = os.environ.get("GITEA_API_URL", "").rstrip("/")
+    if env_override:
+        return env_override
+    server = os.environ.get("GITHUB_SERVER_URL", "https://git.moleculesai.app").rstrip("/")
+    return f"{server}/api/v1"
+
+
+def _gitea_token() -> str:
+    """Resolve the Gitea token from env. GITEA_TOKEN wins; falls back
+    to GITHUB_TOKEN (set by act_runner) and GH_TOKEN (operator habit
+    from the GitHub era)."""
+    return (
+        os.environ.get("GITEA_TOKEN")
+        or os.environ.get("GITHUB_TOKEN")
+        or os.environ.get("GH_TOKEN")
+        or ""
+    )
+
+
+def _gitea_get(path: str, params: dict[str, str] | None = None) -> bytes | None:
+    """GET against /api/v1; returns response body or None on HTTP error.
+
+    Errors return None (not raise) because callers handle missing data
+    by emitting an actionable workflow message rather than crashing the
+    PR check on a transient API blip.
+    """
+    base = _gitea_api_url()
+    qs = ""
+    if params:
+        qs = "?" + urllib.parse.urlencode(params)
+    url = f"{base}/{path.lstrip('/')}{qs}"
+    req = urllib.request.Request(url)
+    token = _gitea_token()
+    if token:
+        req.add_header("Authorization", f"token {token}")
+    req.add_header("Accept", "application/json")
+    try:
+        with urllib.request.urlopen(req, timeout=20) as resp:  # noqa: S310
+            return resp.read()
+    except urllib.error.HTTPError as e:
+        sys.stderr.write(f"Gitea API HTTP {e.code} on {path}: {e.reason}\n")
+        return None
+    except (urllib.error.URLError, TimeoutError) as e:
+        sys.stderr.write(f"Gitea API network error on {path}: {e}\n")
+        return None
+
+
 def run(cmd: list[str], check: bool = True) -> str:
     """Run a subprocess and return stdout. Raise on non-zero when check=True."""
     result = subprocess.run(cmd, capture_output=True, text=True)
@@ -96,32 +160,49 @@ def open_prs_with_migration_prefix(
     repo: str, prefix: int, exclude_pr: int
 ) -> list[dict]:
     """Return open PRs (other than `exclude_pr`) that add a migration with
-    `prefix`. Uses `gh pr diff` per PR — we only need to walk PRs that are
-    actually in flight, so the cost is bounded by open-PR count.
+    `prefix`. Walks open PRs via Gitea's `/repos/{owner}/{repo}/pulls` and
+    pulls each one's changed-file list via `/pulls/{n}/files`. The cost is
+    bounded by open-PR count, which is small (<100) on this repo. The
+    return shape mimics the GitHub CLI's `--json number,headRefName`:
+    ``[{"number": int, "headRefName": str}, ...]``.
     """
-    out = run([
-        "gh", "pr", "list", "--repo", repo, "--state", "open",
-        "--json", "number,headRefName", "--limit", "100",
-    ])
-    prs = json.loads(out)
+    body = _gitea_get(
+        f"repos/{repo}/pulls",
+        {"state": "open", "limit": "50"},
+    )
+    if body is None:
+        # Best-effort: a transient Gitea blip shouldn't fail the PR
+        # check (the base-branch collision check runs locally and is
+        # the more common failure mode).
+        return []
+    prs = json.loads(body)
     matches: list[dict] = []
     for pr in prs:
         num = pr["number"]
         if num == exclude_pr:
             continue
-        try:
-            files = run([
-                "gh", "pr", "diff", str(num), "--repo", repo, "--name-only",
-            ], check=False)
-        except Exception:  # noqa: BLE001
+        # Gitea returns the head ref under .head.ref (REST shape);
+        # GitHub CLI's --json headRefName flattens it. Normalize on
+        # the way out so callers see the historical shape.
+        head_ref_name = (pr.get("head") or {}).get("ref", "")
+        files_body = _gitea_get(f"repos/{repo}/pulls/{num}/files", {"limit": "100"})
+        if files_body is None:
             continue
-        for raw in files.splitlines():
+        try:
+            files = json.loads(files_body)
+        except json.JSONDecodeError:
+            continue
+        for f in files:
+            # Gitea's /pulls/{n}/files returns objects with `.filename`
+            # (same as GitHub's REST). Older Gitea versions emit
+            # `.name` instead — handle both.
+            raw = f.get("filename") or f.get("name") or ""
             path = Path(raw.strip())
             if not path.name:
                 continue
             m = MIGRATION_FILE_RE.match(path.name)
             if m and int(m.group(1)) == prefix:
-                matches.append(pr)
+                matches.append({"number": num, "headRefName": head_ref_name})
                 break
     return matches
 
@@ -138,7 +219,10 @@ def main() -> int:
     pr_number = int(pr_number_env)
     base_ref = os.environ.get("BASE_REF", "origin/staging")
     head_ref = os.environ.get("HEAD_REF", "HEAD")
-    repo = os.environ.get("GITHUB_REPOSITORY", "Molecule-AI/molecule-core")
+    # Default kept lowercase to match the Gitea-canonical org name
+    # (post-2026-05-06 migration). Tests + workflow context override
+    # via GITHUB_REPOSITORY which act_runner sets per-run.
+    repo = os.environ.get("GITHUB_REPOSITORY", "molecule-ai/molecule-core")
 
     added = migrations_in_diff(base_ref, head_ref)
     if not added:

From e43bd7ceb01e6fe728626b4fa48213ae39d3c358 Mon Sep 17 00:00:00 2001
From: devops-engineer
Date: Thu, 7 May 2026 15:41:00 -0700
Subject: [PATCH 2/2] chore: 2nd verification trigger for #75 class A (per Phase 4 ≥2 green runs)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Empty commit to trigger CI a second consecutive time per the SOP
'verify ≥1 representative workflow per class via workflow_dispatch or
push event ... ≥2 consecutive successful runs per class'.

Co-Authored-By: Claude Opus 4.7 (1M context)
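
Reviewer note (illustrative only, not part of either patch): the state mapping that `fetch_prs` in patch 1/2 synthesizes from Gitea's `mergeable` flag and the `/pulls/{n}/reviews` list can be sketched in Python as below. The function names `count_approved` and `synthesize_state` are hypothetical; the field names `state` and `dismissed` are the ones used by the jq filter in the patch, and the sketch assumes `mergeable` has already been parsed to a boolean.

```python
from typing import Any


def count_approved(reviews: list[dict[str, Any]]) -> int:
    """Count reviews that are APPROVED and not dismissed.

    Approximates the jq filter in the shell script:
    select(.state == "APPROVED" and (.dismissed // false) == false)
    """
    return sum(
        1
        for review in reviews
        if review.get("state") == "APPROVED" and not review.get("dismissed", False)
    )


def synthesize_state(mergeable: bool, approved: int) -> tuple[str, str]:
    """Return the GitHub-shape (mergeStateStatus, reviewDecision) pair."""
    if not mergeable:
        # Conflicts, behind, or failed checks: the alarm does not fire.
        return "DIRTY", "REVIEW_REQUIRED"
    if approved >= 1:
        return "CLEAN", "APPROVED"
    # Mergeable but nobody has approved yet: the state the alarm targets.
    return "BLOCKED", "REVIEW_REQUIRED"


if __name__ == "__main__":
    # Hypothetical review payload, shaped like the fields the script reads.
    sample_reviews = [
        {"state": "APPROVED", "dismissed": True},
        {"state": "COMMENT"},
    ]
    print(synthesize_state(True, count_approved(sample_reviews)))
```

A dismissed approval does not count, so a PR whose only approval was dismissed still maps to the blocked state, in line with the "alarm if uncertain" behaviour described in the script comments.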