Compare commits

4 Commits

Author SHA1 Message Date
infra-sre e860114ef1 fix(merge-queue): add remove_label function needed by ApiError handler
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 1m3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 15s
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Waiting to run
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Waiting to run
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Waiting to run
publish-runtime-autobump / bump-and-tag (pull_request) Waiting to run
Harness Replays / detect-changes (pull_request) Successful in 1m0s
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Successful in 28s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 46s
publish-runtime-autobump / pr-validate (pull_request) Successful in 1m19s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m39s
audit-force-merge / audit (pull_request) Has been skipped
qa-review / approved (pull_request) Successful in 28s
security-review / approved (pull_request) Successful in 28s
sop-checklist / all-items-acked (pull_request) Successful in 33s
sop-tier-check / tier-check (pull_request) Successful in 29s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 1m47s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 2m18s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 40s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 3m1s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 45s
MCP Stdio Transport Regression / MCP stdio with regular-file stdout (pull_request) Successful in 2m11s
Check migration collisions / Migration version collision check (pull_request) Successful in 2m25s
CI / Detect changes (pull_request) Successful in 2m0s
CI / Python Lint & Test (pull_request) Successful in 8m13s
CI / Canvas (Next.js) (pull_request) Successful in 18m12s
CI / Platform (Go) (pull_request) Failing after 20m25s
CI / all-required (pull_request) Failing after 27m35s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Has been cancelled
The ApiError handler (added in 7c08352d) calls remove_label() to strip
the queue label from PRs blocked by pre-receive hooks, but the function
was never defined. The first merge failure therefore raised NameError
and crashed the workflow tick.

Fixes: mc#1144 (queue stalls after pre-receive hook 405)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 09:15:08 +00:00
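The failure mode described in this commit message can be reproduced in isolation. The sketch below is not the queue script: `merge_pull`, `tick`, and `undefined_remove_label` are hypothetical stand-ins. It only illustrates that Python looks up a function name when the call executes, so a missing helper inside an `except` block stays hidden until the first failure reaches that block.

```python
def merge_pull(pr_number: int) -> None:
    """Hypothetical stand-in for the real merge call; always fails."""
    raise RuntimeError(f"HTTP 405: PR #{pr_number} not allowed to merge")


def tick(pr_number: int, *, fail: bool) -> str:
    """One simplified queue tick with an error handler that calls a missing helper."""
    try:
        if fail:
            merge_pull(pr_number)
        return "merged"
    except RuntimeError:
        # The name below is resolved only when this line runs, so the
        # module imports and the happy path work without it being defined.
        undefined_remove_label(pr_number)  # noqa: F821 (deliberately undefined)
        return "dequeued"


def run_demo() -> list[str]:
    """Run one successful tick, then one failing tick, and record what happened."""
    results = [tick(1, fail=False)]
    try:
        tick(2, fail=True)
    except NameError as exc:
        results.append(f"NameError: {exc}")
    return results


if __name__ == "__main__":
    for line in run_demo():
        print(line)
```

A static check such as pyflakes (rule F821) would typically flag the undefined name before the handler is ever exercised.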
infra-sre d882060249 Merge branch 'main' of https://git.moleculesai.app/molecule-ai/molecule-core into sre/fix-queue-gate-context 2026-05-15 09:09:15 +00:00
infra-sre a1146d2f5f fix(merge-queue): remove broken qa/sec gates from REQUIRED_CONTEXTS
CI / Shellcheck (E2E scripts) (pull_request) Waiting to run
CI / Python Lint & Test (pull_request) Waiting to run
CI / all-required (pull_request) Waiting to run
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Harness Replays / detect-changes (pull_request) Waiting to run
Harness Replays / Harness Replays (pull_request) Blocked by required conditions
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Waiting to run
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Waiting to run
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Waiting to run
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Waiting to run
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Waiting to run
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 40s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 1m3s
CI / Detect changes (pull_request) Successful in 2m43s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m55s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 13s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 7m31s
CI / Platform (Go) (pull_request) Successful in 23m21s
CI / Canvas (Next.js) (pull_request) Successful in 23m56s
CI / Canvas Deploy Reminder (pull_request) Successful in 8s
gate-check-v3 / gate-check (pull_request) Successful in 8s
sop-checklist / all-items-acked (pull_request) Successful in 13s
sop-tier-check / tier-check (pull_request) Successful in 18s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Failing after 1m19s
qa-review and security-review gates permanently fail (mc#1111:
SOP_TIER_CHECK_TOKEN missing PAT — token owner not in qa/security
teams, HTTP 403 on team membership probe). Adding them to
REQUIRED_CONTEXTS would cause the queue to strip the merge-queue
label from every PR in the queue, breaking the queue for all
contributors.

Keep the ApiError error-handling from the previous commit (catches
405/422/409 from merge_pull and removes the label + posts a comment).
That logic prevents infinite retries on blocked PRs even without
qa/sec gates.

Re-add qa-review and security-review to REQUIRED_CONTEXTS once
mc#1111 is resolved.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 07:52:11 +00:00
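To make the reasoning in this commit message concrete, here is a minimal sketch of how a comma-separated REQUIRED_CONTEXTS value might be parsed and evaluated. The function names, the status strings, and the three-way "ready / waiting / blocked" outcome are assumptions for illustration; the diff does not show how the real script reacts to a failed context.

```python
def parse_required_contexts(raw: str) -> list[str]:
    """Split a comma-separated context list, ignoring surrounding whitespace."""
    return [ctx.strip() for ctx in raw.split(",") if ctx.strip()]


def gate(statuses: dict[str, str], required: list[str]) -> str:
    """Return 'blocked', 'ready', or 'waiting' for one PR (hypothetical policy)."""
    # A context that has not reported yet is treated as pending.
    states = [statuses.get(ctx, "pending") for ctx in required]
    if any(state in ("failure", "error") for state in states):
        return "blocked"
    if all(state == "success" for state in states):
        return "ready"
    return "waiting"


# A YAML folded scalar (>-) joins its lines with single spaces.
RAW = "CI / all-required (pull_request), sop-checklist / all-items-acked (pull_request)"
REQUIRED = parse_required_contexts(RAW)

STATUSES = {
    "CI / all-required (pull_request)": "success",
    "sop-checklist / all-items-acked (pull_request)": "success",
    "qa-review / approved (pull_request)": "failure",
}

if __name__ == "__main__":
    print(gate(STATUSES, REQUIRED))
    print(gate(STATUSES, REQUIRED + ["qa-review / approved (pull_request)"]))
```

Under these assumptions, a gate that fails on every PR turns every queued PR into "blocked" as soon as it is listed as required, which is the situation the commit avoids by leaving the two review contexts out.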
infra-sre 27b6df119c fix(merge-queue): add review gates and handle merge failures gracefully
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
CI / Python Lint & Test (pull_request) Waiting to run
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 17s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 37s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 22s
CI / Detect changes (pull_request) Successful in 1m9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 1m49s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 28s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 1m44s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 30s
security-review / approved (pull_request) Failing after 38s
gate-check-v3 / gate-check (pull_request) Successful in 56s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 1m23s
qa-review / approved (pull_request) Failing after 42s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m51s
sop-tier-check / tier-check (pull_request) Successful in 33s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 2m4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 34s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Failing after 2m1s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 2m53s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 11s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 3m5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 34s
CI / Canvas (Next.js) (pull_request) Failing after 17m0s
CI / Platform (Go) (pull_request) Failing after 17m1s
CI / all-required (pull_request) Failing after 26m53s
sop-checklist / all-items-acked (pull_request) [info tier:low] acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, l
Two fixes to the serialized Gitea merge queue:

1. REQUIRED_CONTEXTS now includes qa-review and security-review gates.
   Previously only CI/all-required and sop-checklist were checked, so
   the queue attempted to merge PRs with failed reviews. The pre-receive
   hook blocked each attempt, and every tick retried the same blocked PR
   forever. Adding the explicit review contexts causes the queue to WAIT
   instead of attempting the merge, unblocking the next queued PR.

2. process_once() now catches ApiError on the merge attempt and removes
   the merge-queue label rather than returning 0 and retrying the same
   PR on every subsequent tick. A comment posted on the PR tells the
   author what blocked the merge and to re-add the label once it is
   resolved.

Fixes: mc# queue infinite retry on review-blocked PRs
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-15 06:24:15 +00:00
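The head-of-line blocking described in the second fix can be illustrated with a toy model. This is a sketch under stated assumptions, not the real `process_once()`: the queue is a plain `deque` of PR numbers, `blocked` is a hypothetical set of PRs whose merge always fails, and removing the label is modeled as popping the PR from the queue.

```python
from collections import deque


def simulate(prs, blocked, *, dequeue_on_failure, max_ticks=10):
    """Process at most one PR per tick and return the PRs that merged."""
    queue = deque(prs)
    merged = []
    for _ in range(max_ticks):
        if not queue:
            break
        head = queue[0]
        if head in blocked:
            if dequeue_on_failure:
                # Models removing the queue label after a failed merge.
                queue.popleft()
            # Without the dequeue, the same head is retried on the next tick.
            continue
        merged.append(queue.popleft())
    return merged


if __name__ == "__main__":
    print(simulate([101, 102, 103], {101}, dequeue_on_failure=False))
    print(simulate([101, 102, 103], {101}, dequeue_on_failure=True))
```

In this model the first call merges nothing because PR 101 stays at the head for every tick, while the second call drops 101 and then merges 102 and 103.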
2 changed files with 53 additions and 1 deletion
+47 -1
@@ -314,6 +314,29 @@ def post_comment(pr_number: int, body: str, *, dry_run: bool) -> None:
     api("POST", f"/repos/{OWNER}/{NAME}/issues/{pr_number}/comments", body={"body": body})
 
 
+def remove_label(pr_number: int, label: str, *, dry_run: bool) -> None:
+    print(f"::notice::removing label '{label}' from PR #{pr_number}")
+    if dry_run:
+        return
+    # Gitea requires label ID, not name, for deletion.
+    # Multiple labels can share the same name with different IDs — remove all.
+    _, body = api("GET", f"/repos/{OWNER}/{NAME}/pulls/{pr_number}")
+    pr_labels = body.get("labels", []) if isinstance(body, dict) else []
+    removed = False
+    for lbl in pr_labels:
+        if lbl.get("name") == label:
+            label_id = lbl.get("id")
+            if label_id:
+                api(
+                    "DELETE",
+                    f"/repos/{OWNER}/{NAME}/issues/{pr_number}/labels/{label_id}",
+                    expect_json=False,
+                )
+                removed = True
+    if not removed:
+        print(f"::notice::label '{label}' not found on PR #{pr_number}")
+
+
 def update_pull(pr_number: int, *, dry_run: bool) -> None:
     print(f"::notice::updating PR #{pr_number} with base branch via style={UPDATE_STYLE}")
     if dry_run:
@@ -407,7 +430,30 @@ def process_once(*, dry_run: bool = False) -> int:
                 "deferring to next tick"
             )
             return 0
-        merge_pull(pr_number, dry_run=dry_run)
+        try:
+            merge_pull(pr_number, dry_run=dry_run)
+        except ApiError as exc:
+            # Merge failed (pre-receive hook, branch protection, etc.).
+            # Remove queue label so next tick picks the next PR.
+            msg = str(exc)
+            if "405" in msg or "not allowed to merge" in msg.lower():
+                hint = "pre-receive hook or branch protection blocked the merge"
+            elif "422" in msg or "Unprocessable" in msg:
+                hint = "branch protection required-status check failed"
+            elif "409" in msg or "conflict" in msg.lower():
+                hint = "merge conflict"
+            else:
+                hint = msg[:200]
+            remove_label(pr_number, QUEUE_LABEL, dry_run=dry_run)
+            post_comment(
+                pr_number,
+                (
+                    f"merge-queue: merge blocked ({hint}). "
+                    f"Label removed — re-add once the block is resolved."
+                ),
+                dry_run=dry_run,
+            )
+            return 0
         return 0
 
     return 0
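The hint selection in the `except` branch above can be pulled out into a pure function, which makes it easy to test without calling the API. This is a sketch, not part of the change: `classify_merge_error` is a hypothetical name, and the logic simply mirrors the substring checks shown in the diff.

```python
def classify_merge_error(msg: str) -> str:
    """Map an API error message to a short human-readable hint."""
    lowered = msg.lower()
    if "405" in msg or "not allowed to merge" in lowered:
        return "pre-receive hook or branch protection blocked the merge"
    if "422" in msg or "Unprocessable" in msg:
        return "branch protection required-status check failed"
    if "409" in msg or "conflict" in lowered:
        return "merge conflict"
    # Unknown errors fall back to a truncated copy of the raw message.
    return msg[:200]


if __name__ == "__main__":
    for sample in (
        "HTTP 405 Method Not Allowed",
        "HTTP 422 Unprocessable Entity",
        "HTTP 409 Conflict",
        "PR #4059: HTTP 409 Conflict",
    ):
        print(f"{sample!r} -> {classify_merge_error(sample)}")
```

One property of plain substring matching is visible in the last sample: a message that happens to contain the digits 405 elsewhere (here inside a PR number) matches the first branch before the 409 check is reached. Matching on a parsed status code, if the exception exposes one, would avoid this.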
+6
@@ -48,6 +48,12 @@ jobs:
           REQUIRED_CONTEXTS: >-
             CI / all-required (pull_request),
             sop-checklist / all-items-acked (pull_request)
+          # NOTE: qa-review / security-review gates intentionally omitted.
+          # These gates permanently fail (mc#1111: SOP_TIER_CHECK_TOKEN missing
+          # PAT — token owner not in qa/security teams). Adding them to
+          # REQUIRED_CONTEXTS would strip the merge-queue label from every PR
+          # in the queue, breaking the queue for all contributors.
+          # Re-add these gates once mc#1111 is resolved.
           # Push-side required contexts. Checking CI / all-required (push)
           # explicitly instead of the combined state avoids false-pause when
           # non-blocking jobs (continue-on-error: true) have failed — those