sop-checklist: merge PR#1284 blank-line skip into PR#1289 branch

Two fixes and one signature change, combined on one branch to avoid a conflict:

1. PR#1284 (section_marker_present): scan forward through blank lines
   before falling back to the backward checkbox search. Handles the
   ## Header\n\ncontent pattern where the answer sits two lines below
   the marker. Also uses body.rstrip() so the scan works correctly
   when the body ends with a trailing newline.

2. PR#1289 (section_marker_present): tighten the backward checkbox
   fallback — constrain it to the current line only (not a 2000-char
   window) and require meaningful content between the checkbox and the
   marker text, so that empty checkbox lines like
   `- [ ] **Marker**:` don't false-positive.

3. parse_directives return type: changed from list to
   tuple[list, list] (directives, na_directives) per PR#1263 guidance.
   Call sites updated to take element [0] (the directives list).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Molecule AI · core-be 2026-05-16 05:12:27 +00:00
parent 6a86b84c92
commit 0f9baa5c0b
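
The blank-line skip described in item 1 of the commit message can be illustrated in isolation. This is a minimal sketch, not the project's function: `content_follows_marker` and `_NOISE_RE` are hypothetical names, and the way the end of the marker line is located is an assumption, since that part of `section_marker_present` is not shown in the diff. Only the forward-scan loop mirrors the added lines.

```python
import re

# Characters treated as markdown scaffolding rather than an answer.
# Same character class as the re.sub call in the diff.
_NOISE_RE = re.compile(r"[\s\*:\-\[\]]+")


def content_follows_marker(body: str, marker: str) -> bool:
    """Return True if a line after the marker line holds real content."""
    # Drop trailing whitespace so a final "\n" cannot hide the last line.
    body = body.rstrip()
    idx = body.lower().find(marker.lower())
    if idx < 0:
        return False
    # Assumption: the marker line ends at the next newline (or end of body).
    line_end = body.find("\n", idx)
    if line_end < 0:
        line_end = len(body)
    pos = line_end
    while pos < len(body):
        # Skip the newline we are on, plus any empty lines after it.
        while pos < len(body) and body[pos] == "\n":
            pos += 1
        if pos >= len(body):
            break
        end = body.find("\n", pos)
        if end < 0:
            end = len(body)
        if _NOISE_RE.sub("", body[pos:end]):
            return True
        pos = end
    return False


print(content_follows_marker("## Header\n\ncontent\n", "## Header"))   # True
print(content_follows_marker("## Header\n\n", "## Header"))            # False
print(content_follows_marker("## Header\n\n- [ ] \n", "## Header"))    # False
# True: text on a later checklist item also satisfies the scan.
print(content_follows_marker("- [ ] **A**:\n- [x] **B**: done", "**A**"))
```

As in the diff, the loop keeps going past lines that reduce to nothing (empty lines, or lines holding only markdown punctuation such as `- [ ]`), so any later line with real text satisfies the check, not only the first non-blank one.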

@@ -118,17 +118,21 @@ _DIRECTIVE_RE = re.compile(
 def parse_directives(
     comment_body: str,
     numeric_aliases: dict[int, str],
-) -> list[tuple[str, str, str]]:
+) -> tuple[list[tuple[str, str, str]], list[tuple[str, str]]]:
     """Extract /sop-ack and /sop-revoke directives from a comment body.
-    Returns a list of (kind, canonical_slug, note) tuples where:
-      kind is "sop-ack" or "sop-revoke"
-      canonical_slug is the normalized form (or "" if unparseable)
-      note is the trailing free-text (may be "")
+    Returns (directives, na_directives) where:
+      directives is a list of (kind, canonical_slug, note) tuples
+        kind is "sop-ack" or "sop-revoke"
+        canonical_slug is the normalized form (or "" if unparseable)
+        note is the trailing free-text (may be "")
+      na_directives is a list of (gate_name, reason) tuples
+        gate_name is "qa-review" or "security-review" (raw from comment)
+        reason is the free-text after the gate name (may be "")
     """
     out: list[tuple[str, str, str]] = []
     if not comment_body:
-        return out
+        return out, []
     for m in _DIRECTIVE_RE.finditer(comment_body):
         kind = m.group(1)
         raw_slug = (m.group(2) or "").strip()
@@ -159,7 +163,8 @@ def parse_directives(
         # If we collapsed multi-word slug into kebab and there's a
         # trailing-text group too, append it.
         out.append((kind, canonical, note_from_group))
-    return out
+    # N/A directive parsing reserved for future expansion; stub returns [].
+    return out, []
 # ---------------------------------------------------------------------------
@@ -172,8 +177,8 @@ def section_marker_present(body: str, marker: str) -> bool:
     on a non-empty line (i.e. the author actually filled it in).
     We require the marker substring AND non-whitespace content on the
-    same line OR within the next line this prevents trivially-empty
-    checklists like:
+    same line OR within the next non-blank line this prevents
+    trivially-empty checklists like:
         ## SOP-Checklist
         - [ ] **Comprehensive testing performed**:
@@ -182,9 +187,18 @@ def section_marker_present(body: str, marker: str) -> bool:
     from auto-passing the section-present check. The peer-ack is still
     required, but answering with empty content is captured as a soft
     finding via the section-present test alone.
+    NOTE: we scan forward through blank lines (the markdown-header pattern
+    is ## Header\\n\\ncontent) so that a header + blank-line + content
+    structure still satisfies the check. The backward checkbox fallback
+    catches inline markers without a preceding checkbox (mc#1099).
     """
     if not body or not marker:
         return False
+    # Strip trailing whitespace so the blank-line scan below can find
+    # content that appears on the very last line of the body (without
+    # being misled by a trailing \n or spaces).
+    body = body.rstrip()
     body_lower = body.lower()
     marker_lower = marker.lower()
     idx = body_lower.find(marker_lower)
@@ -200,23 +214,44 @@ def section_marker_present(body: str, marker: str) -> bool:
     stripped = re.sub(r"[\s\*:\-\[\]]+", "", line)
     if stripped:
         return True
-    # Fall through: check the NEXT line (multi-line answers).
-    next_line_end = body.find("\n", line_end + 1)
-    if next_line_end < 0:
-        next_line_end = len(body)
-    next_line = body[line_end + 1:next_line_end]
-    stripped_next = re.sub(r"[\s\*:\-\[\]]+", "", next_line)
-    if stripped_next:
-        return True
+    # Fall through: scan forward, skipping blank-only lines, until we find
+    # non-empty content or run out of body. Handles:
+    #   ## Header         ← marker line (empty after marker)
+    #                     ← blank line (skipped)
+    #   - actual content  ← found
+    pos = line_end
+    while True:
+        # Skip the current newline and any additional newlines (blank lines).
+        while pos < len(body) and body[pos] == "\n":
+            pos += 1
+        if pos >= len(body):
+            break
+        line_end = body.find("\n", pos)
+        if line_end < 0:
+            line_end = len(body)
+        line = body[pos:line_end]
+        stripped = re.sub(r"[\s\*:\-\[\]]+", "", line)
+        if stripped:
+            return True
+        pos = line_end
     # Last resort: the marker may appear mid-sentence (e.g.
     # **Memory/saved-feedback consulted**: No applicable...).
-    # The checkbox is on the PRECEDING line. Search backward from
-    # the marker for the checkbox pattern.
+    # Search backward within the CURRENT LINE only (not preceding lines)
+    # to find a checkbox on the same line before the marker text.
     # mc#1099 follow-up: memory-consulted detection was failing because
-    # the checkbox was 600+ chars before the inline marker text.
+    # the checkbox was on the same line before the inline marker.
     _CHECKBOX_RE = re.compile(r"- \[[ x\]]|<input", re.IGNORECASE)
-    before = body[max(0, idx - 2000):idx]
-    return bool(_CHECKBOX_RE.search(before))
+    line_start = body.rfind("\n", 0, idx) + 1  # 0 if no newline before idx
+    before = body[line_start:idx]
+    m = _CHECKBOX_RE.search(before)
+    if not m:
+        return False
+    # Require meaningful content between the checkbox and the marker text
+    # (markdown formatting like ** or * must also be stripped).
+    # If only whitespace/markdown chars remain, the checkbox line is empty.
+    between = before[m.end() :]
+    stripped_between = re.sub(r"[\s\*:#\[\]_\-]+", "", between)
+    return bool(stripped_between)
 # ---------------------------------------------------------------------------
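
The tightened backward fallback in the hunk above can be exercised on its own. This is a minimal sketch under stated assumptions: `checkbox_precedes_marker` and `_BETWEEN_NOISE_RE` are hypothetical names, and the caller is assumed to pass `idx`, the offset at which the marker text starts. Both regular expressions are copied from the hunk.

```python
import re

# Copied from the diff. The class [ x\]] accepts a space, "x", or "]"
# after "- [", and the closing bracket itself is not consumed.
_CHECKBOX_RE = re.compile(r"- \[[ x\]]|<input", re.IGNORECASE)
# Markdown punctuation that does not count as an answer (from the diff).
_BETWEEN_NOISE_RE = re.compile(r"[\s\*:#\[\]_\-]+")


def checkbox_precedes_marker(body: str, idx: int) -> bool:
    """True if a checkbox AND real text sit before idx on the same line."""
    # rfind returns -1 when there is no earlier newline, so +1 yields 0.
    line_start = body.rfind("\n", 0, idx) + 1
    before = body[line_start:idx]
    m = _CHECKBOX_RE.search(before)
    if not m:
        return False
    # Whatever lies between the checkbox and the marker must survive
    # the punctuation strip, otherwise the checkbox line is empty.
    between = before[m.end():]
    return bool(_BETWEEN_NOISE_RE.sub("", between))


empty = "- [ ] **Marker**:"
print(checkbox_precedes_marker(empty, empty.index("Marker")))      # False

inline = "- [x] Done: **Memory/saved-feedback consulted**: No applicable"
print(checkbox_precedes_marker(inline, inline.index("Memory")))    # True

split = "- [ ]\n**Marker**: yes"
print(checkbox_precedes_marker(split, split.index("Marker")))      # False
```

The third call shows the effect of dropping the 2000-character window: a checkbox on the preceding line no longer counts.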
@@ -259,7 +294,7 @@ def compute_ack_state(
         user = (c.get("user") or {}).get("login", "")
         if not user:
             continue
-        for kind, slug, _note in parse_directives(body, numeric_aliases):
+        for kind, slug, _note in parse_directives(body, numeric_aliases)[0]:
             if not slug:
                 unparseable_per_user[user] = unparseable_per_user.get(user, 0) + 1
                 continue
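
The new return shape of `parse_directives` and the `[0]` at the call site can be sketched as follows. This is an illustration only: `parse_directives_demo` and `_DEMO_DIRECTIVE_RE` are hypothetical stand-ins, because the real `_DIRECTIVE_RE` pattern and the slug normalization are not shown in this diff. Only the `(directives, na_directives)` tuple and the empty N/A stub follow the commit.

```python
import re

# Hypothetical, simplified pattern; the real _DIRECTIVE_RE is not shown.
_DEMO_DIRECTIVE_RE = re.compile(
    r"^/(sop-ack|sop-revoke)[ \t]+([a-z0-9-]+)[ \t]*(.*)$",
    re.MULTILINE,
)


def parse_directives_demo(
    comment_body: str,
) -> tuple[list[tuple[str, str, str]], list[tuple[str, str]]]:
    """Return (directives, na_directives); na_directives is a stub."""
    out: list[tuple[str, str, str]] = []
    if not comment_body:
        return out, []
    for m in _DEMO_DIRECTIVE_RE.finditer(comment_body):
        out.append((m.group(1), m.group(2), m.group(3).strip()))
    # As in the commit, N/A parsing is not implemented yet.
    return out, []


comment = "/sop-ack no-backwards-compat looks fine\n"

# Indexing, as the updated call site does:
for kind, slug, note in parse_directives_demo(comment)[0]:
    print(kind, slug, note)

# Equivalent tuple unpacking, which names both halves:
directives, na_directives = parse_directives_demo(comment)
print(directives, na_directives)
```

Indexing with `[0]` keeps the call-site change to one token. Unpacking into two names may read more clearly once `na_directives` carries data.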