fix(ci): close fail-open in qa/security review checks (RFC#324 v1.3 §A1.1) + drop dead jq fallback
Some checks failed
publish-runtime-autobump / bump-and-tag (pull_request) Has been skipped
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 14s
sop-tier-check / tier-check (pull_request) Successful in 19s
gate-check-v3 / gate-check (pull_request) Failing after 30s
CI / Detect changes (pull_request) Successful in 44s
E2E API Smoke Test / detect-changes (pull_request) Successful in 43s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 43s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 37s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 45s
publish-runtime-autobump / pr-validate (pull_request) Successful in 47s
CI / Platform (Go) (pull_request) Successful in 8s
CI / Canvas (Next.js) (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 11s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 12s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 12s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m15s
CI / Python Lint & Test (pull_request) Successful in 7m16s

Addresses hongming-pc review #1421 on PR #535.

Blocker 1 (fail-open privilege gate):
  Original v1.2 design `if:`-gated the "Check out BASE" and "Evaluate"
  steps on the privilege-check step's `proceed` output. A non-collaborator
  commenting `/qa-recheck` produced proceed=false → both steps skipped →
  job conclusion = success → `qa-review / approved` context published as
  success with ZERO real APPROVE. Any visitor could green the gate.

  Fix per RFC#324 v1.3 §A1.1 option (b): drop privilege-gating of the
  eval entirely. The eval is read-only and idempotent (reads
  pulls/{N}/reviews + teams/{id}/members/{u}, both server-side state
  uninfluenced by who commented). Re-running on a non-collaborator's
  comment is harmless: if a real team-member APPROVE exists, the eval
  flips green; if not, it stays red. The privilege step is retained as
  a `::notice::` log line only (griefer-spotting), not a gate.

Non-blocking nit 5 (dead jq fallback):
  `apt-get install jq` (no root) and `curl -o /usr/local/bin/jq` (no
  write perm on uid-1001 rootless runner) both can't succeed. Per
  feedback_ci_runner_install_needs_writable_path + #391/#402, jq is
  already baked into runner-base. Replace the install dance with a
  clear `exit 1` + diagnostic so a missing-jq runner fails loud rather
  than confusingly.

Smoke-test (mocked Gitea API):
  no-approve         → exit 1  (gate red)
  self-approve       → exit 1  (gate red)
  dismissed-approve  → exit 1  (gate red)
  non-team-approve   → exit 1  (gate red)
  team-approve       → exit 0  (gate green)

Blocker 2 (A1-α event-suffix context-name verification) is the
smoke-PR's job and is flagged in a follow-up comment on this PR — does
not require workflow changes here.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
Molecule AI · core-devops 2026-05-11 11:45:59 -07:00
parent e922351b78
commit ecbfa60f04
3 changed files with 70 additions and 57 deletions

View File

@ -21,11 +21,18 @@
# suspenders. Keeping it off in v1 simplifies semantics; flip in a follow-up
# PR if reviewer telemetry shows residual stale-APPROVE merges.
#
# Privilege gate (RFC A1.1 — CRITICAL):
# The /qa-recheck and /security-recheck slash-command refire must NOT
# re-fire for non-collaborators. The workflow-yaml privilege-check step
# handles this BEFORE this script runs. This script does not re-check
# privilege; the workflow guards the entry.
# Privilege gate (RFC#324 v1.3 §A1.1 — INFORMATIONAL ONLY):
# The /qa-recheck and /security-recheck slash-commands can be triggered
# by anyone who can comment on the PR. The workflow's privilege step
# logs collaborator-status but does NOT gate execution of this script.
# Why this is safe: this evaluator is read-only and idempotent —
# reading `pulls/{N}/reviews` and `teams/{id}/members/{u}` can't be
# influenced by who triggered the run. If a real team-member APPROVE
# exists the gate flips green; otherwise it stays red. A
# non-collaborator commenting /qa-recheck cannot manufacture a green
# gate. Original (v1.2) design with `if:`-gating of this step was
# fail-open (skipped-via-`if:` job still publishes the status as
# `success`) — corrected in v1.3 per hongming-pc review 1421.
#
# Trust boundary (RFC A4):
# This script is loaded from the BASE branch (sourced via .gitea/scripts/
@ -56,19 +63,16 @@
set -euo pipefail
# jq is required for JSON parsing — same install dance sop-tier-check uses.
# jq is required for JSON parsing. It is pre-baked into the runner-base
# image (per RFC#268 workflow-smoke), so the only reason we'd not find it
# is a broken runner. The previous fallback dance (apt-get + curl to
# /usr/local/bin/jq) cannot succeed on a uid-1001 rootless runner
# (#391/#402 + feedback_ci_runner_install_needs_writable_path), so it's
# dropped. Fail loud with a clear diagnostic rather than attempt an
# install that physically cannot work.
if ! command -v jq >/dev/null 2>&1; then
echo "::notice::jq not on PATH — installing..."
if apt-get update -qq && apt-get install -y -qq jq 2>/dev/null; then
echo "::notice::jq installed via apt-get: $(jq --version)"
elif timeout 60 curl -sSL \
"https://github.com/jqlang/jq/releases/download/jq-1.7.1/jq-linux-amd64" \
-o /usr/local/bin/jq && chmod +x /usr/local/bin/jq; then
echo "::notice::jq downloaded as binary: $(/usr/local/bin/jq --version)"
else
echo "::error::jq install failed; cannot parse Gitea API JSON"
exit 1
fi
echo "::error::jq missing from runner-base image — bake it into the runner image (see RFC#268 workflow-smoke / feedback_ci_runner_install_needs_writable_path). This evaluator cannot run without jq."
exit 1
fi
: "${GITEA_TOKEN:?GITEA_TOKEN required}"

View File

@ -18,14 +18,30 @@
# `qa-review / approved (<event>)` — NO `POST /statuses` call → NO
# write:repository token scope needed. Sidesteps internal#321 defect #2.
#
# A1.1 (privilege check on slash-comment):
# A1.1 (privilege check on slash-comment — INFORMATIONAL ONLY, NOT a gate):
# The `issue_comment` event fires for ANY commenter, including
# non-collaborators. We MUST NOT use `github.event.comment.author_association`
# (the field doesn't exist on Gitea 1.22.6 webhook payload — this was
# sop-tier-refire's defect #1). Instead, the Privilege check step probes
# GET /repos/.../collaborators/{login}:
# - 204 → is-collaborator → proceed
# - 404 → not-collaborator → exit 0 (ignored; status unchanged)
# non-collaborators. The original (v1.2) design gated the eval step
# behind a collaborator probe → if a non-collaborator commented
# /qa-recheck, the eval was `if:`-skipped → the job exited 0 anyway →
# the status context published `success` with ZERO real APPROVE.
# That was a fail-open: any visitor could green the gate.
#
# RFC#324 v1.3 §A1.1 correction (option b per hongming-pc 1421):
# drop privilege-gating of the evaluation entirely. The eval is
# read-only and idempotent — it reads `pulls/{N}/reviews` and
# `teams/{id}/members/{u}` (both API-side state that a commenter can't
# change). Re-running it on a non-collaborator's comment is harmless
# AND correct: if a real team-member APPROVE exists, the eval flips
# green; if not, it stays red.
#
# We KEEP the privilege step as a `::notice::` log line only — useful
# for griefer-spotting (one operator spamming /recheck) without
# touching the gate. If rate-limiting is needed later, add it as a
# separate concern (time-window throttle, not a privilege gate).
#
# We MUST NOT use `github.event.comment.author_association` (the
# field doesn't exist on Gitea 1.22.6 webhook payload — this was
# sop-tier-refire's defect #1).
#
# A4 (no PR-head checkout under pull_request_target):
# We check out the BASE ref explicitly so the review-check.sh script is
@ -60,7 +76,9 @@
#
# /qa-recheck — re-evaluate the gate (e.g. after an APPROVE lands)
#
# Restricted to repo collaborators only (privilege-check step A1.1).
# Open to any PR commenter. The eval is read-only and idempotent, so
# unprivileged refires are harmless (RFC#324 v1.3 §A1.1). Collaborator
# status is logged for griefer-spotting but does NOT gate execution.
name: qa-review
@ -79,12 +97,10 @@ jobs:
# Gate the job:
# - On pull_request_target events: always run.
# - On issue_comment events: only when it's a PR comment and the body
# contains the slash-command. The privilege check is enforced as
# a job step (A1.1), not here — `if:` can't call the collaborator
# API. The privilege step exits the job early for non-collaborators
# while still publishing the job's previous status (i.e. it does
# NOT flip qa-review from failure to success without a real
# team-member APPROVE).
# contains the slash-command. NO privilege gate at the step level
# (RFC#324 v1.3 §A1.1): a non-collaborator's /qa-recheck is fine
# because the eval is read-only and idempotent — re-running it
# just re-confirms whether a real team-member APPROVE exists.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' &&
@ -92,10 +108,15 @@ jobs:
startsWith(github.event.comment.body, '/qa-recheck'))
runs-on: ubuntu-latest
steps:
- name: Privilege check (A1.1 — collaborator API, NOT author_association)
# Only runs on issue_comment events. For pull_request_target the
# commenter doesn't exist, so the step is a no-op skip.
id: priv
- name: Privilege check (A1.1 — INFORMATIONAL log only, NOT a gate)
# RFC#324 v1.3 §A1.1: this step does NOT gate subsequent steps.
# It exists solely as a log line for griefer-spotting (one
# operator spamming /qa-recheck without merit). Re-running the
# read-only eval on a non-collaborator comment is harmless;
# gating it would be fail-open (skipped steps still publish
# `success` for the job's status context).
# Only runs on issue_comment events; pull_request_target has
# no comment.user.login so the step is a no-op skip there.
if: github.event_name == 'issue_comment'
env:
GITEA_TOKEN: ${{ secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
@ -106,11 +127,9 @@ jobs:
-H "Authorization: token ${GITEA_TOKEN}" \
"${{ github.server_url }}/api/v1/repos/${{ github.repository }}/collaborators/${login}")
if [ "$code" = "204" ]; then
echo "::notice::commenter ${login} is a collaborator — proceeding"
echo "proceed=true" >> "$GITHUB_OUTPUT"
echo "::notice::Recheck from ${login} (collaborator=true)"
else
echo "::notice::comment from non-collaborator ${login} (HTTP ${code}) ignored"
echo "proceed=false" >> "$GITHUB_OUTPUT"
echo "::notice::Recheck from ${login} (collaborator=false, HTTP ${code}) — proceeding with read-only eval anyway"
fi
- name: Check out BASE ref (A4 — never PR-head)
@ -120,17 +139,11 @@ jobs:
# script source is always the default-branch version.
# NEVER use ref: ${{ github.event.pull_request.head.sha }} —
# that would execute PR-head code with secrets-context.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' && steps.priv.outputs.proceed == 'true')
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event.repository.default_branch }}
- name: Evaluate qa-review
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' && steps.priv.outputs.proceed == 'true')
env:
GITEA_TOKEN: ${{ secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app

View File

@ -21,6 +21,8 @@ permissions:
jobs:
approved:
# See qa-review.yml header for full A1-α / A1.1 (v1.3 — informational
# log only, NOT a gate) / A4 / A5 design rationale.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' &&
@ -28,8 +30,10 @@ jobs:
startsWith(github.event.comment.body, '/security-recheck'))
runs-on: ubuntu-latest
steps:
- name: Privilege check (A1.1 — collaborator API, NOT author_association)
id: priv
- name: Privilege check (A1.1 — INFORMATIONAL log only, NOT a gate)
# RFC#324 v1.3 §A1.1: does NOT gate subsequent steps. See
# qa-review.yml for full rationale. Eval is read-only/idempotent
# so re-running on a non-collaborator comment is harmless.
if: github.event_name == 'issue_comment'
env:
GITEA_TOKEN: ${{ secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
@ -40,25 +44,17 @@ jobs:
-H "Authorization: token ${GITEA_TOKEN}" \
"${{ github.server_url }}/api/v1/repos/${{ github.repository }}/collaborators/${login}")
if [ "$code" = "204" ]; then
echo "::notice::commenter ${login} is a collaborator — proceeding"
echo "proceed=true" >> "$GITHUB_OUTPUT"
echo "::notice::Recheck from ${login} (collaborator=true)"
else
echo "::notice::comment from non-collaborator ${login} (HTTP ${code}) ignored"
echo "proceed=false" >> "$GITHUB_OUTPUT"
echo "::notice::Recheck from ${login} (collaborator=false, HTTP ${code}) — proceeding with read-only eval anyway"
fi
- name: Check out BASE ref (A4 — never PR-head)
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' && steps.priv.outputs.proceed == 'true')
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
ref: ${{ github.event.repository.default_branch }}
- name: Evaluate security-review
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' && steps.priv.outputs.proceed == 'true')
env:
GITEA_TOKEN: ${{ secrets.RFC_324_TEAM_READ_TOKEN || secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app