Compare commits

8 Commits

Author SHA1 Message Date
core-uiux c0d0d6bd1a test(canvas): add form-inputs coverage (35 cases) + Section accessibility fix
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 17s
Harness Replays / detect-changes (pull_request) Successful in 19s
qa-review / approved (pull_request) Failing after 16s
CI / Detect changes (pull_request) Successful in 28s
security-review / approved (pull_request) Failing after 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 32s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 32s
gate-check-v3 / gate-check (pull_request) Successful in 27s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 31s
CI / Platform (Go) (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 31s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 8s
Harness Replays / Harness Replays (pull_request) Successful in 8s
sop-tier-check / tier-check (pull_request) Successful in 13s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 5m56s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 10m12s
audit-force-merge / audit (pull_request) Has been skipped
+ form-inputs.test.tsx: 35 cases across TextInput, NumberInput, Toggle,
  TagList, and Section — pure presentational components in the Config tab.
  Uses vi.hoisted() patterns from established suite; no jest-dom matchers.

+ form-inputs.tsx (Section): add aria-expanded + aria-controls to the
  collapsible toggle button for WCAG 2.1 AA compliance. The content div
  gets a stable id derived from the title; aria-controls links button to
  region. Indicator span gains aria-hidden="true" (decorative only).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:33:09 +00:00
core-uiux f3fd486d4e test(canvas/settings,chat): add coverage for EmptyState, SearchBar, UnsavedChangesGuard, AttachmentVideo
- EmptyState: 6 cases — icon aria-hidden, title, body text, CTA button
- SearchBar: 14 cases — store binding, onChange, Escape, Ctrl/Cmd+F focus
- UnsavedChangesGuard: 7 cases — dialog states, Keep/Discard actions, backdrop
  FIX: UnsavedChangesGuard now wires onDiscard via pendingDiscard ref so
  clicking Discard correctly calls the callback on dialog close
- AttachmentVideo: 8 cases — loading/ready/error states, tone borders,
  blob URL cleanup, external URI direct href

No breaking changes. 2387 tests passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:33:09 +00:00
core-uiux 2aceeb9ac3 test(canvas/settings): add DeleteConfirmDialog + SettingsButton coverage (26 cases)
- DeleteConfirmDialog (15 cases): dialog open via secret:delete-request event,
  title/body text, Cancel closes, dependents loading/list/none states,
  deleteSecret call, confirm 1s delay, disabled→enabled button transition
- SettingsButton (11 cases): aria-label, aria-expanded, gear SVG aria-hidden,
  toggle openPanel/closePanel, active class, tooltip Mac/Ctrl shortcut
  ResizeObserver polyfill for Radix Tooltip

No breaking changes. 2413 tests passing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:33:09 +00:00
core-uiux 747cae8c15 test(canvas/settings): add ServiceGroup coverage (10 cases)
- role=group with aria-label containing service label
- Service icon aria-hidden, correct emoji per service name
- Count label: "1 key" vs "N keys"
- Renders SecretRow for each secret
- Header and rows div structure

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:33:09 +00:00
core-uiux b52fa5c065 test(canvas/chat): add AttachmentImage coverage (10 cases)
Adds Vitest coverage for AttachmentImage — inline image thumbnail with
click-to-fullscreen lightbox. Covers: loading skeleton (240×180),
ready state with blob URL, tone=user/agent border classes, lightbox
open/close on click and Escape, AttachmentChip error fallback, img
onError transition to chip, external URI direct href (no fetch), and
blob URL cleanup on unmount.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:33:09 +00:00
core-uiux 2793e2425b test(canvas/chat): add AttachmentAudio + AttachmentPDF coverage (18 cases)
Adds Vitest coverage for two missing attachment renderers:

AttachmentAudio (9 cases):
  - Loading skeleton (280x40) with aria-label
  - <audio controls> with blob src when ready
  - Filename label in ready state
  - tone=user -> blue/accent border
  - tone=agent -> neutral border
  - Error -> AttachmentChip fallback
  - audio onError -> chip transition
  - External URI -> direct href, no fetch
  - Blob URL cleanup on unmount

AttachmentPDF (9 cases):
  - Loading skeleton with PdfGlyph + filename
  - Preview button with glyph, filename, "PDF" label
  - Lightbox opens with <embed> on click
  - Lightbox closes on Escape
  - tone=user -> blue/accent classes on button
  - tone=agent -> neutral border
  - Error -> AttachmentChip fallback
  - External URI -> direct href, no fetch
  - Blob URL cleanup on unmount

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:33:09 +00:00
core-uiux 9fc718cd3d test(canvas/chat): add AttachmentTextPreview coverage (12 cases)
Adds Vitest coverage for AttachmentTextPreview — inline text/code
preview with streaming fetch and expand/truncate.

Covers:
  - Loading skeleton (320x80) with aria-label
  - Ready state with correct text content
  - Filename shown in header
  - Expand button appears when lines > 10
  - Expand button hidden when all lines shown
  - Expand button updates display to full content
  - Download button calls onDownload
  - tone=user -> blue/accent border
  - tone=agent -> neutral border
  - Truncated notice when file exceeds 256 KB
  - Error -> AttachmentChip fallback
  - Cleanup on unmount

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:33:08 +00:00
core-uiux 59f738a5d3 test(settings): add TokensTab coverage (12 cases)
12 passing: loading spinner, empty state, token list rendering,
each token's prefix/age/Revoke button, API URL correctness, revoke
confirm + cancel dialogs, new-token creation + dismiss, create error,
network error banner.

Root bug fixed: confirm button search was unscoped — when the dialog
opened, two "Revoke" buttons existed (tok2's row + dialog confirm);
find() returned tok2's button first. Scoped the search to
document.querySelector('[role="dialog"]') to hit the correct target.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-12 05:33:08 +00:00
95 changed files with 598 additions and 13476 deletions
-113
@@ -1,113 +0,0 @@
#!/usr/bin/env python3
"""Lint workflow bash for curl status-code capture pollution.

The bad shape is:

    HTTP_CODE=$(curl ... -w '%{http_code}' ... || echo "000")

`curl -w` writes the HTTP code to stdout before returning non-zero, so
fallback output inside the same command substitution appends another code.
"""
from __future__ import annotations

import argparse
import glob
import re
import sys
from pathlib import Path
from typing import NamedTuple

SELF = ".gitea/workflows/lint-curl-status-capture.yml"


class Finding(NamedTuple):
    path: str
    snippet: str


BAD_STATUS_CAPTURE = re.compile(
    r"""
    \$\(\s*
    curl\b
    [^)]*
    -w\s*['"]%\{http_code\}['"]
    [^)]*
    \|\|\s*
    (?:
        echo\s+['"]?000['"]?
        |
        printf\s+['"]000['"]
    )
    \s*\)
    """,
    re.DOTALL | re.VERBOSE,
)


def _logical_shell(content: str) -> str:
    """Collapse bash line continuations so one curl command is one string."""
    return re.sub(r"\\\s*\n\s*", " ", content)


def scan_content(path: str, content: str) -> list[Finding]:
    flat = _logical_shell(content)
    return [
        Finding(path=path, snippet=re.sub(r"\s+", " ", match.group(0)).strip()[:160])
        for match in BAD_STATUS_CAPTURE.finditer(flat)
    ]


def scan_paths(paths: list[str]) -> list[Finding]:
    findings: list[Finding] = []
    for path in paths:
        if path == SELF:
            continue
        content = Path(path).read_text(encoding="utf-8")
        findings.extend(scan_content(path, content))
    return findings


def default_paths() -> list[str]:
    return sorted(glob.glob(".gitea/workflows/*.yml"))


def print_report(findings: list[Finding]) -> None:
    if not findings:
        print("OK No curl-status-capture pollution patterns detected")
        return
    print(f"::error::Found {len(findings)} curl-status-capture pollution site(s):")
    for finding in findings:
        print(
            f"::error file={finding.path}::Curl status-capture pollution: "
            "'|| echo/printf 000' inside a $(curl ... -w '%{http_code}' ...) "
            "subshell. On non-2xx or connection failure, curl's -w writes a "
            "status, then exits non-zero, then the fallback appends another "
            "status. Fix: route -w into a tempfile so the exit code cannot "
            "pollute stdout."
        )
        print(f" matched: {finding.snippet}...")
    print()
    print("Fix template:")
    print(" set +e")
    print(" curl ... -w '%{http_code}' >code.txt 2>/dev/null")
    print(" set -e")
    print(' HTTP_CODE=$(cat code.txt 2>/dev/null)')
    print(' [ -z "$HTTP_CODE" ] && HTTP_CODE="000"')


def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser()
    parser.add_argument("paths", nargs="*", help="workflow files to scan")
    args = parser.parse_args(argv)
    paths = args.paths or default_paths()
    findings = scan_paths(paths)
    print_report(findings)
    return 1 if findings else 0


if __name__ == "__main__":
    raise SystemExit(main())
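A minimal sketch of the check performed by the deleted script above, reduced to a single predicate. The regex and the line-continuation collapse are copied from the script; the `BAD` and `GOOD` shell snippets and the `is_polluting` helper are hypothetical illustrations, not taken from any workflow in the repository. In the bad shape, a connection failure would leave something like `000000` in `HTTP_CODE`, because curl's `-w` output and the fallback `echo` both land in the same command substitution.

```python
import re

# Same pattern as BAD_STATUS_CAPTURE in the script above.
BAD_STATUS_CAPTURE = re.compile(
    r"""
    \$\(\s*
    curl\b
    [^)]*
    -w\s*['"]%\{http_code\}['"]
    [^)]*
    \|\|\s*
    (?:
        echo\s+['"]?000['"]?
        |
        printf\s+['"]000['"]
    )
    \s*\)
    """,
    re.DOTALL | re.VERBOSE,
)


def is_polluting(script: str) -> bool:
    """Hypothetical helper: True if the script contains the bad capture shape."""
    # Collapse bash line continuations first, as the script does.
    flat = re.sub(r"\\\s*\n\s*", " ", script)
    return BAD_STATUS_CAPTURE.search(flat) is not None


# Illustrative snippets (assumed, not from the repository).
BAD = '''HTTP_CODE=$(curl -s -o /dev/null \\
  -w '%{http_code}' "$URL" || echo "000")'''

GOOD = '''set +e
curl -s -o /dev/null -w '%{http_code}' "$URL" >code.txt 2>/dev/null
set -e
HTTP_CODE=$(cat code.txt 2>/dev/null)
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"'''

print(is_polluting(BAD))
print(is_polluting(GOOD))
```

The `GOOD` snippet follows the fix template printed by the script: the status code goes to a tempfile, so the fallback assignment can never be concatenated with curl's own output.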
-404
@@ -1,404 +0,0 @@
#!/usr/bin/env python3
"""lint-required-no-paths — structural enforcement of
`feedback_path_filtered_workflow_cant_be_required`.
For every workflow whose status-check context appears in
`branch_protections/<branch>.status_check_contexts`, assert that the
workflow's `on:` block has NO `paths:` and NO `paths-ignore:` filter.
A required-check workflow with a paths filter silently degrades the
merge gate:
- If the PR's diff doesn't match the `paths:` glob, the workflow
never fires.
- Gitea (1.22.6) reports the required context as `pending` (never as
`skipped == success`), so the PR cannot merge.
- For a docs-only PR against `paths: ['**.go']`, the PR is
blocked forever — no human action can produce a green.
The class was previously prevented only by reviewer vigilance + the
saved memory `feedback_path_filtered_workflow_cant_be_required`. This
script makes it a hard CI gate so a future PR adding `paths:` to a
required workflow fails fast at PR time, not after merge when the next
docs PR wedges main.
The lint runs as `.gitea/workflows/lint-required-no-paths.yml` on every
PR. The lint workflow ITSELF must not have a paths-filter (otherwise it
could be circumvented by a paths-non-matching PR) — that's enforced by
self-reference and by the workflow's own `on:` block deliberately
omitting filters.
Sources of truth:
- `branch_protections/<branch>` `status_check_contexts` (the merge gate)
- `.gitea/workflows/*.yml` `name:` + `on:` (the workflow set)
Context-format note (Gitea 1.22.6):
Status-check contexts are formatted `{workflow_name} / {job_name_or_key} ({event})`.
We parse the workflow_name prefix and walk `.gitea/workflows/*.yml` for
a file whose `name:` attr matches. (The filename is NOT the source of
truth; `name:` is, because Gitea formats the context from `name:`.)
Exit codes:
0 — no required workflow has a paths/paths-ignore filter (clean) OR
branch_protections endpoint returned 403/404 (token-scope issue;
surfaced via ::error:: but non-fatal so a missing scope doesn't
red-X every PR — fix the token, not the lint).
1 — at least one required workflow has a paths/paths-ignore filter
(the gate-degrading defect class).
2 — env contract violation (missing GITEA_TOKEN/HOST/REPO/BRANCH).
3 — workflows directory missing or workflow YAML unparseable.
4 — protection response shape unexpected (non-dict body on 2xx).
Auth note: `GET /repos/.../branch_protections/{branch}` requires
repo-admin role in Gitea 1.22.6. The workflow-default `GITHUB_TOKEN`
is non-admin; we re-use `DRIFT_BOT_TOKEN` (same persona that powers
ci-required-drift.yml). If `DRIFT_BOT_TOKEN` is unavailable in a future
context, the script falls through gracefully (exit 0 + ::error::).
"""
from __future__ import annotations
import json
import os
import re
import sys
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Any
import yaml # PyYAML 6.0.2 — installed by the workflow before this runs.
# --------------------------------------------------------------------------
# Environment
# --------------------------------------------------------------------------
def _env(key: str, *, required: bool = True, default: str | None = None) -> str:
    val = os.environ.get(key, default)
    if required and not val:
        sys.stderr.write(f"::error::missing required env var: {key}\n")
        sys.exit(2)
    return val or ""


GITEA_TOKEN = _env("GITEA_TOKEN", required=False)
GITEA_HOST = _env("GITEA_HOST", required=False)
REPO = _env("REPO", required=False)
BRANCH = _env("BRANCH", required=False, default="main")
WORKFLOWS_DIR = _env(
    "WORKFLOWS_DIR", required=False, default=".gitea/workflows"
)
OWNER, NAME = (REPO.split("/", 1) + [""])[:2] if REPO else ("", "")
API = f"https://{GITEA_HOST}/api/v1" if GITEA_HOST else ""


def _require_runtime_env() -> None:
    """Enforce env contract — called from `run()` only. Tests import
    individual functions without setting the full env contract."""
    for key in ("GITEA_TOKEN", "GITEA_HOST", "REPO", "BRANCH"):
        if not os.environ.get(key):
            sys.stderr.write(f"::error::missing required env var: {key}\n")
            sys.exit(2)
# --------------------------------------------------------------------------
# Tiny HTTP helper (mirrors ci-required-drift.py contract:
# raise on non-2xx and on JSON-decode-fail when JSON expected, per
# `feedback_api_helper_must_raise_not_return_dict`).
# --------------------------------------------------------------------------
class ApiError(RuntimeError):
    """Raised when a Gitea API call cannot be trusted to have succeeded."""


def api(
    method: str,
    path: str,
    *,
    body: dict | None = None,
    query: dict[str, str] | None = None,
    expect_json: bool = True,
) -> tuple[int, Any]:
    url = f"{API}{path}"
    if query:
        url = f"{url}?{urllib.parse.urlencode(query)}"
    data = None
    headers = {
        "Authorization": f"token {GITEA_TOKEN}",
        "Accept": "application/json",
    }
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    req = urllib.request.Request(url, method=method, data=data, headers=headers)
    try:
        with urllib.request.urlopen(req, timeout=30) as resp:
            raw = resp.read()
            status = resp.status
    except urllib.error.HTTPError as e:
        raw = e.read()
        status = e.code
    if not (200 <= status < 300):
        snippet = raw[:500].decode("utf-8", errors="replace") if raw else ""
        raise ApiError(f"{method} {path} → HTTP {status}: {snippet}")
    if not raw:
        return status, None
    try:
        return status, json.loads(raw)
    except json.JSONDecodeError as e:
        if expect_json:
            raise ApiError(
                f"{method} {path} → HTTP {status} but body is not JSON: {e}"
            ) from e
        return status, {"_raw": raw.decode("utf-8", errors="replace")}
# --------------------------------------------------------------------------
# Status-check context parser
# --------------------------------------------------------------------------
# Format: "<workflow_name> / <job_name_or_key> (<event>)"
# Examples observed on molecule-core/main:
# "Secret scan / Scan diff for credential-shaped strings (pull_request)"
# "sop-tier-check / tier-check (pull_request)"
#
# Split strategy: peel off the trailing ` (<event>)` first, then split
# the leading `<workflow> / <rest>` on the FIRST ` / ` (workflow names
# come from `name:` attrs which conventionally don't embed ' / '; job
# names CAN, so we keep the rest of the slash-divided text as the job
# name). This matches Gitea's `name: ` semantics.
_CONTEXT_RE = re.compile(r"^(?P<workflow>.+?) / (?P<job>.+) \((?P<event>[^)]+)\)$")


def parse_context(ctx: str) -> tuple[str, str, str] | None:
    """Parse `<workflow> / <job> (<event>)` → (workflow, job, event) or None."""
    if not ctx:
        return None
    m = _CONTEXT_RE.match(ctx)
    if not m:
        return None
    return m.group("workflow"), m.group("job"), m.group("event")
# --------------------------------------------------------------------------
# workflow-name → file resolution
# --------------------------------------------------------------------------
def _iter_workflow_files() -> list[Path]:
    d = Path(WORKFLOWS_DIR)
    if not d.is_dir():
        sys.stderr.write(f"::error::workflows directory not found: {d}\n")
        sys.exit(3)
    # `.yml` and `.yaml` — Gitea accepts both (rarely used `.yaml`, but
    # don't silently miss it if a future port uses it).
    return sorted(list(d.glob("*.yml")) + list(d.glob("*.yaml")))


def resolve_workflow_file(workflow_name: str) -> Path | None:
    """Find the YAML file whose `name:` attr matches `workflow_name`.

    Returns None if no match. Filename is NOT used as a fallback —
    Gitea's context format uses `name:`, so a `name:`-less workflow
    won't even appear in the protection list. (A YAML with no `name:`
    would default the context to the file basename, but our protection
    contexts on molecule-core are all `name:`-derived; we trust the
    same.)
    """
    for f in _iter_workflow_files():
        try:
            doc = yaml.safe_load(f.read_text(encoding="utf-8"))
        except yaml.YAMLError as e:
            sys.stderr.write(f"::error::YAML parse error in {f}: {e}\n")
            sys.exit(3)
        if isinstance(doc, dict) and doc.get("name") == workflow_name:
            return f
    return None
# --------------------------------------------------------------------------
# paths-filter detection
# --------------------------------------------------------------------------
# Triggers that accept `paths:` / `paths-ignore:` (per GitHub Actions /
# Gitea Actions docs): pull_request, pull_request_target, push.
# We don't enumerate — any sub-key named `paths` or `paths-ignore`
# inside an event mapping is flagged.
_PATHS_KEYS = ("paths", "paths-ignore")


def detect_paths_filters(workflow_path: Path) -> list[str]:
    """Walk the workflow's `on:` block and return a list of findings, one
    per offending `paths`/`paths-ignore` key.

    Returns:
        Empty list if the workflow has no paths/paths-ignore filter
        anywhere in its `on:` block. Otherwise, a list of human-readable
        strings naming the event and filter key + the filter contents.
    """
    try:
        doc = yaml.safe_load(workflow_path.read_text(encoding="utf-8"))
    except yaml.YAMLError as e:
        sys.stderr.write(f"::error::YAML parse error in {workflow_path}: {e}\n")
        sys.exit(3)
    if not isinstance(doc, dict):
        return []
    on_block = doc.get("on") or doc.get(True)  # PyYAML 6 quirk: `on:`
    # under default constructor sometimes becomes the bool key `True`
    # because YAML 1.1 treats `on` as a boolean. Tolerate both.
    if on_block is None:
        return []
    findings: list[str] = []
    # Shape A: `on: pull_request` (string shorthand) — cannot carry filters.
    if isinstance(on_block, str):
        return []
    # Shape B: `on: [pull_request, push]` (list shorthand) — cannot carry filters.
    if isinstance(on_block, list):
        return []
    # Shape C: `on: { event: { ... } }` — the standard mapping case.
    if isinstance(on_block, dict):
        # Defensive: top-level malformed `on.paths` (someone wrote
        # `on: { paths: ['x'] }` thinking it's a workflow-level filter).
        # This is invalid syntax, but if present, flag it — it might
        # not block the workflow from registering (Gitea may ignore the
        # unknown key) and would create a false sense of "filter exists"
        # the lint should still surface.
        for k in _PATHS_KEYS:
            if k in on_block:
                v = on_block[k]
                findings.append(
                    f"top-level `on.{k}` filter (malformed but present): {v!r}"
                )
        for event, event_body in on_block.items():
            if event in _PATHS_KEYS:
                continue  # already handled above
            if not isinstance(event_body, dict):
                # `pull_request: null` / `pull_request: [opened]` shapes —
                # no place for a paths filter to live; skip.
                continue
            for k in _PATHS_KEYS:
                if k in event_body:
                    v = event_body[k]
                    findings.append(
                        f"`on.{event}.{k}` filter present: {v!r}"
                    )
    return findings
# --------------------------------------------------------------------------
# Driver
# --------------------------------------------------------------------------
def run() -> int:
    """Main lint entrypoint. Returns the process exit code.

    Exit semantics (see module docstring for full table):
      0 — clean (no offending paths-filter on any required workflow),
          OR protection unreadable (403/404) — surfaced as ::error::
          but treated as non-fatal so token-scope issues don't red-X
          every PR.
      1 — at least one required workflow carries a paths/paths-ignore
          filter — the regression class this lint exists to prevent.
    """
    _require_runtime_env()
    protection_path = f"/repos/{OWNER}/{NAME}/branch_protections/{BRANCH}"
    try:
        _, protection = api("GET", protection_path)
    except ApiError as e:
        msg = str(e)
        m = re.search(r"HTTP (\d{3})", msg)
        http_status = int(m.group(1)) if m else None
        if http_status in (403, 404):
            # Note the trailing ". " separator: without it the status code
            # runs straight into the next sentence of the message.
            sys.stderr.write(
                f"::error::GET {protection_path} returned HTTP {http_status}. "
                f"DRIFT_BOT_TOKEN lacks repo-admin scope (Gitea 1.22.6 "
                f"requires it for this endpoint) OR branch '{BRANCH}' has "
                f"no protection configured. Cannot enumerate required "
                f"checks; skipping lint with exit 0 to avoid red-X on "
                f"every PR. Fix: grant repo-admin to mc-drift-bot.\n"
            )
            return 0
        raise
    if not isinstance(protection, dict):
        sys.stderr.write(
            f"::error::protection response for {BRANCH} not a JSON object\n"
        )
        return 4
    contexts: list[str] = list(protection.get("status_check_contexts") or [])
    if not contexts:
        print(
            f"::notice::branch_protections/{BRANCH} has 0 required "
            f"status_check_contexts; nothing to lint. (no required contexts)"
        )
        return 0
    print(f"::notice::Linting {len(contexts)} required context(s) for paths-filter regressions:")
    for c in contexts:
        print(f" - {c}")
    offenders: list[tuple[str, Path, list[str]]] = []
    unresolved: list[str] = []
    for ctx in contexts:
        parsed = parse_context(ctx)
        if parsed is None:
            print(
                f"::warning::could not parse context '{ctx}' "
                f"(expected `<workflow> / <job> (<event>)`); skipping"
            )
            unresolved.append(ctx)
            continue
        workflow_name, _job, _event = parsed
        wf_path = resolve_workflow_file(workflow_name)
        if wf_path is None:
            print(
                f"::warning::no workflow file in {WORKFLOWS_DIR} has "
                f"`name: {workflow_name}` (required context '{ctx}'); "
                f"skipping paths-filter check. "
                f"(orphaned-context detection is ci-required-drift's job.)"
            )
            unresolved.append(ctx)
            continue
        findings = detect_paths_filters(wf_path)
        if findings:
            offenders.append((workflow_name, wf_path, findings))
        else:
            print(f"::notice::OK {wf_path.name} ({workflow_name}) — no paths filter")
    if offenders:
        print("")
        print(f"::error::Found {len(offenders)} required workflow(s) with paths/paths-ignore filters:")
        for workflow_name, wf_path, findings in offenders:
            for finding in findings:
                # ::error file=... lets Gitea Actions surface a per-file
                # annotation in the PR UI (when annotations are wired).
                print(
                    f"::error file={wf_path}::Required workflow "
                    f"'{workflow_name}' ({wf_path.name}) has a paths "
                    f"filter that would degrade the merge gate to a "
                    f"silent indefinite pending: {finding}. "
                    f"See feedback_path_filtered_workflow_cant_be_required. "
                    f"Fix: remove the filter and instead gate per-step "
                    f"inside the job with `if: contains(steps.changed.outputs.files, ...)` "
                    f"or refactor to a single-job-with-per-step-if shape."
                )
        return 1
    print("")
    print(
        f"::notice::OK — all {len(contexts) - len(unresolved)} resolvable "
        f"required workflow(s) clean (no paths/paths-ignore filters)."
    )
    if unresolved:
        print(
            f"::notice::{len(unresolved)} required context(s) were not "
            f"resolved to a workflow file (warn-not-fail); see warnings above."
        )
    return 0


if __name__ == "__main__":
    sys.exit(run())
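The two core steps of the deleted lint above (parsing a status-check context, then looking for `paths` / `paths-ignore` under `on:`) can be sketched without the network or PyYAML. The regex is copied from the script; `paths_filters` is a hypothetical, simplified stand-in for `detect_paths_filters` that takes an already-parsed mapping and omits the top-level malformed `on.paths` case. The sample mappings are illustrative assumptions about what `yaml.safe_load` would return, including the `True` key that YAML 1.1 produces for a bare `on:`.

```python
import re

# Copied from the script above.
_CONTEXT_RE = re.compile(r"^(?P<workflow>.+?) / (?P<job>.+) \((?P<event>[^)]+)\)$")
_PATHS_KEYS = ("paths", "paths-ignore")


def parse_context(ctx):
    """Split `<workflow> / <job> (<event>)` into its three parts, or None."""
    m = _CONTEXT_RE.match(ctx or "")
    if not m:
        return None
    return m.group("workflow"), m.group("job"), m.group("event")


def paths_filters(doc):
    """Hypothetical simplification: list `on.<event>.<key>` filter locations.

    `doc` is an already-parsed workflow mapping. The `on:` block may be
    stored under the string "on" or under the boolean True.
    """
    on_block = doc.get("on") or doc.get(True)
    if not isinstance(on_block, dict):
        return []  # string or list shorthand cannot carry filters
    found = []
    for event, body in on_block.items():
        if isinstance(body, dict):
            found.extend(f"on.{event}.{k}" for k in _PATHS_KEYS if k in body)
    return found


# A job name containing parentheses still parses, because only the
# trailing "(<event>)" group is peeled off.
print(parse_context("CI / Canvas (Next.js) (pull_request)"))

# Assumed parsed shapes, one with the boolean key and one with list shorthand.
filtered = {"name": "CI", True: {"pull_request": {"paths": ["**.go"]}}}
unfiltered = {"name": "CI", "on": ["pull_request", "push"]}
print(paths_filters(filtered))
print(paths_filters(unfiltered))
```

Splitting on the first ` / ` keeps any later slashes inside the job name, which is why the third file's Rule 3 forbids `/` in workflow names rather than in job names.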
-369
@@ -1,369 +0,0 @@
#!/usr/bin/env python3
"""lint-workflow-yaml — catch Gitea-1.22.6-hostile workflow YAML shapes.
This script enforces six structural rules that have historically caused
silent CI failures on Gitea Actions (1.22.6) — workflows that the server's
YAML parser rejects with `[W] ignore invalid workflow ...` and registers
for zero events, or shape conventions that produce ambiguous status
contexts. Each rule maps to a documented incident in saved memory.
Rules (4 fatal + 1 fatal cross-file + 1 heuristic-warn):
1. `workflow_dispatch.inputs:` block — Gitea 1.22.6 mis-parses the
`inputs` keys as sibling event types and rejects the whole file.
Memory: feedback_gitea_workflow_dispatch_inputs_unsupported.
Origin: 2026-05-11 PyPI freeze (publish-runtime).
2. `on: workflow_run:` event — not enumerated in Gitea 1.22.6's
supported event list (verified via modules/actions/workflows.go
enumeration; task #81). Workflow registers, fires for 0 events.
3. `name:` containing `/` — breaks the
`<workflow> / <job> (<event>)` commit-status context convention;
downstream parsers (sop-tier-check, status-reaper) tokenize on `/`.
4. `name:` collision across files — Gitea routes commit-status updates
by `name` and behavior on collision is undefined (status-reaper
rev1 fail-loud).
5. Cross-repo `uses: org/repo/path@ref` — blocked while
`[actions].DEFAULT_ACTIONS_URL=github` is the server default;
resolves to github.com/<org-suspended>/... and 404s.
Memory: feedback_gitea_cross_repo_uses_blocked. Cross-link: task #109.
6. (HEURISTIC, warn-not-fail) Steps reference `https://api.github.com`
or `https://github.com/.../releases/download` without a
workflow-level `env.GITHUB_SERVER_URL` set to the Gitea instance.
Memory: feedback_act_runner_github_server_url.
Per `feedback_smoke_test_vendor_truth_not_shape_match`: fixtures used to
validate this lint must mirror real Gitea 1.22.6 YAML semantics, not
Python yaml-parser quirks. The test suite at tests/test_lint_workflow_yaml.py
includes a vendor-truth fixture (the exact publish-runtime regression).
Usage:
python3 .gitea/scripts/lint-workflow-yaml.py
Lint every `*.yml` in `.gitea/workflows/`.
python3 .gitea/scripts/lint-workflow-yaml.py --workflow-dir <path>
Lint a custom directory (used by tests/test_lint_workflow_yaml.py).
Exit codes:
0 — clean OR only heuristic-warnings emitted.
1 — at least one fatal rule (1-5) violated.
2 — YAML parse error or argv usage error.
"""
from __future__ import annotations
import argparse
import collections
import glob
import os
import re
import sys
from pathlib import Path
from typing import Any, Iterable
try:
    import yaml
except ImportError:
    print("::error::PyYAML is required. Install with: pip install PyYAML", file=sys.stderr)
    sys.exit(2)


# YAML quirk: bare `on:` at the top level parses to the Python `True`
# (because `on` is a YAML 1.1 boolean alias). Handle both keys.
def _get_on(d: dict) -> Any:
    if not isinstance(d, dict):
        return None
    if "on" in d:
        return d["on"]
    if True in d:
        return d[True]
    return None
# ---------------------------------------------------------------------------
# Rule 1 — workflow_dispatch.inputs block (Gitea 1.22.6 parser rejects)
# ---------------------------------------------------------------------------
def check_workflow_dispatch_inputs(filename: str, doc: Any) -> list[str]:
    """Return per-violation error lines if `workflow_dispatch.inputs` is set."""
    errors: list[str] = []
    on = _get_on(doc)
    if not isinstance(on, dict):
        return errors
    wd = on.get("workflow_dispatch")
    if isinstance(wd, dict) and wd.get("inputs"):
        errors.append(
            f"::error file={filename}::Rule 1 (FATAL): "
            f"`on.workflow_dispatch.inputs:` block detected. Gitea 1.22.6 "
            f"silently rejects the entire workflow with `[W] ignore invalid "
            f"workflow: unknown on type: map[...]`. Drop the `inputs:` block "
            f"and derive parameters from tag name / env / external query. "
            f"Memory: feedback_gitea_workflow_dispatch_inputs_unsupported."
        )
    return errors
# ---------------------------------------------------------------------------
# Rule 2 — on: workflow_run (not supported on Gitea 1.22.6)
# ---------------------------------------------------------------------------
def check_workflow_run_event(filename: str, doc: Any) -> list[str]:
    """Return per-violation error lines if `on: workflow_run:` is used."""
    errors: list[str] = []
    on = _get_on(doc)
    if isinstance(on, dict) and "workflow_run" in on:
        errors.append(
            f"::error file={filename}::Rule 2 (FATAL): `on: workflow_run:` "
            f"event used. Gitea 1.22.6 does NOT support `workflow_run` "
            f"(verified via modules/actions/workflows.go enumeration; "
            f"task #81). Workflow will fire for zero events. Use a "
            f"`schedule:` cron OR a `push:` trigger with `paths:` filter "
            f"on the upstream workflow file as the cross-workflow gate."
        )
    elif isinstance(on, list) and "workflow_run" in on:
        errors.append(
            f"::error file={filename}::Rule 2 (FATAL): `on: workflow_run` "
            f"in event list. Not supported on Gitea 1.22.6 — task #81."
        )
    return errors
# ---------------------------------------------------------------------------
# Rule 3 — name: contains "/" (breaks status-context tokenization)
# ---------------------------------------------------------------------------
def check_name_with_slash(filename: str, doc: Any) -> list[str]:
    """Return per-violation error lines if workflow `name:` contains a slash."""
    errors: list[str] = []
    if not isinstance(doc, dict):
        return errors
    name = doc.get("name")
    if isinstance(name, str) and "/" in name:
        errors.append(
            f"::error file={filename}::Rule 3 (FATAL): workflow `name: "
            f"{name!r}` contains `/`. The commit-status context convention "
            f"is `<workflow> / <job> (<event>)`; embedding `/` in the "
            f"workflow name makes downstream parsers (sop-tier-check, "
            f"status-reaper) tokenize ambiguously. Rename to use `-` or "
            f"` ` instead."
        )
    return errors
# ---------------------------------------------------------------------------
# Rule 4 — cross-file name collision
# ---------------------------------------------------------------------------
def check_name_collision_across_files(
    docs_by_file: dict[str, Any],
) -> list[str]:
    """Return per-collision error lines if two files share the same `name:`."""
    errors: list[str] = []
    by_name: dict[str, list[str]] = collections.defaultdict(list)
    for filename, doc in docs_by_file.items():
        if isinstance(doc, dict):
            n = doc.get("name")
            if isinstance(n, str) and n:
                by_name[n].append(filename)
    for n, files in sorted(by_name.items()):
        if len(files) > 1:
            errors.append(
                f"::error::Rule 4 (FATAL): workflow `name: {n!r}` collision "
                f"across {len(files)} files: {files}. Gitea routes "
                f"commit-status updates by `name`; collision yields "
                f"undefined behavior. Give each workflow a unique `name:`."
            )
    return errors
# ---------------------------------------------------------------------------
# Rule 5 — cross-repo `uses: org/repo/path@ref`
# ---------------------------------------------------------------------------
# `uses: <foo>@<ref>` — match the value form Gitea/act actually parse.
# We need to distinguish:
# - `actions/checkout@<sha>` OK (bare org/repo@ref, no subpath)
# - `./.gitea/actions/foo` OK (local path)
# - `docker://image:tag` OK (docker-image form)
# - `molecule-ai/molecule-ci/.gitea/actions/audit-force-merge@main` BAD
USES_CROSS_REPO_RE = re.compile(
    r"""^
    (?P<owner>[A-Za-z0-9_.\-]+)
    /
    (?P<repo>[A-Za-z0-9_.\-]+)
    /    # mandatory subpath separator => cross-repo composite/reusable
    (?P<path>[^@\s]+)
    @
    (?P<ref>\S+)
    $""",
    re.VERBOSE,
)


def _iter_uses(doc: Any) -> Iterable[str]:
    """Yield every `uses:` string from job steps in a workflow document."""
    if not isinstance(doc, dict):
        return
    jobs = doc.get("jobs")
    if not isinstance(jobs, dict):
        return
    for job in jobs.values():
        if not isinstance(job, dict):
            continue
        # reusable workflow: `uses:` at the job level
        if isinstance(job.get("uses"), str):
            yield job["uses"]
        steps = job.get("steps")
        if not isinstance(steps, list):
            continue
        for step in steps:
            if isinstance(step, dict) and isinstance(step.get("uses"), str):
                yield step["uses"]


def check_cross_repo_uses(filename: str, doc: Any) -> list[str]:
    """Return per-violation error lines for cross-repo `uses:` references."""
    errors: list[str] = []
    for uses in _iter_uses(doc):
        # Skip docker:// and local ./
        if uses.startswith(("docker://", "./", "../")):
            continue
        m = USES_CROSS_REPO_RE.match(uses.strip())
        if m:
            errors.append(
                f"::error file={filename}::Rule 5 (FATAL): cross-repo "
                f"`uses: {uses}` detected. Gitea 1.22.6 with "
                f"`[actions].DEFAULT_ACTIONS_URL=github` resolves this to "
                f"github.com/{m.group('owner')}/{m.group('repo')} which "
                f"404s (org suspended 2026-05-06). Inline the shared bash "
                f"into `.gitea/scripts/` until task #109 (actions mirror) "
                f"ships. Memory: feedback_gitea_cross_repo_uses_blocked."
            )
    return errors
# ---------------------------------------------------------------------------
# Rule 6 — heuristic: github.com/api refs without workflow-level
# GITHUB_SERVER_URL (WARN-not-FAIL per halt-condition 3)
# ---------------------------------------------------------------------------
# Match `https://api.github.com/...` (API call) — that's the actionable
# pattern. We intentionally do NOT match `https://github.com/.../releases/
# download/...` (jq-release pin) nor `https://github.com/${{ github.repository
# }}` (OCI label) because those are documented benign references on current
# main and would 100% false-positive (3 hits, per Phase 1 audit).
GITHUB_API_REF_RE = re.compile(
r"https://api\.github\.com\b|https://github\.com/api/",
re.IGNORECASE,
)
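# Illustration (not part of the original script): a quick check of which
# references the Rule 6 pattern flags. The three snippets and their URLs
# are invented examples of the shapes described in the comment above.

```python
import re

GITHUB_API_REF_RE = re.compile(
    r"https://api\.github\.com\b|https://github\.com/api/",
    re.IGNORECASE,
)

# Made-up workflow lines: an API call, a release download, an OCI label.
api_call = "run: curl -s https://api.github.com/repos/example/example/releases/latest"
release_pin = "run: curl -L https://github.com/example/tool/releases/download/v1.0/tool"
oci_label = "org.opencontainers.image.source=https://github.com/${{ github.repository }}"

flagged = [bool(GITHUB_API_REF_RE.search(s)) for s in (api_call, release_pin, oci_label)]
print(flagged)
```

# Only the API call is flagged; the release download and the OCI label do
# not contain either alternative of the pattern.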
def _has_workflow_level_server_url(doc: Any) -> bool:
if not isinstance(doc, dict):
return False
env = doc.get("env")
if isinstance(env, dict) and "GITHUB_SERVER_URL" in env:
return True
return False
def check_github_server_url_missing(filename: str, doc: Any, raw: str) -> list[str]:
"""Return warn-lines (NOT errors) if api.github.com is referenced without
workflow-level GITHUB_SERVER_URL. Heuristic — false-positives possible.
"""
warns: list[str] = []
if not GITHUB_API_REF_RE.search(raw):
return warns
if _has_workflow_level_server_url(doc):
return warns
warns.append(
f"::warning file={filename}::Rule 6 (WARN, heuristic): file "
f"references `https://api.github.com` without a workflow-level "
f"`env.GITHUB_SERVER_URL: https://git.moleculesai.app`. The "
f"act_runner default for `${{{{ github.server_url }}}}` is "
f"github.com, which can break actions that auth-condition on "
f"server_url (e.g. actions/setup-go). If this curl is "
f"intentionally hitting GitHub (e.g. public release pin), ignore. "
f"Memory: feedback_act_runner_github_server_url."
)
return warns
# ---------------------------------------------------------------------------
# Driver
# ---------------------------------------------------------------------------
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser(
description="Lint Gitea Actions workflow YAML for 1.22.6-hostile shapes."
)
p.add_argument(
"--workflow-dir",
default=".gitea/workflows",
help="Directory of workflow *.yml files (default: .gitea/workflows).",
)
args = p.parse_args(argv)
wf_dir = Path(args.workflow_dir)
if not wf_dir.exists():
# Empty / missing dir = nothing to lint, not a failure.
print(f"::notice::No workflow directory at {wf_dir}; skipping.")
return 0
yml_paths = sorted(
glob.glob(str(wf_dir / "*.yml")) + glob.glob(str(wf_dir / "*.yaml"))
)
if not yml_paths:
print(f"::notice::No workflow files in {wf_dir}; nothing to lint.")
return 0
fatal_errors: list[str] = []
warnings: list[str] = []
docs_by_file: dict[str, Any] = {}
for path in yml_paths:
rel = os.path.relpath(path)
try:
raw = Path(path).read_text(encoding="utf-8")
doc = yaml.safe_load(raw)
except yaml.YAMLError as e:
fatal_errors.append(
f"::error file={rel}::YAML parse error: {e}. Cannot lint "
f"a file the parser rejects."
)
continue
docs_by_file[rel] = doc
# Per-file checks
fatal_errors.extend(check_workflow_dispatch_inputs(rel, doc))
fatal_errors.extend(check_workflow_run_event(rel, doc))
fatal_errors.extend(check_name_with_slash(rel, doc))
fatal_errors.extend(check_cross_repo_uses(rel, doc))
warnings.extend(check_github_server_url_missing(rel, doc, raw))
# Cross-file checks
fatal_errors.extend(check_name_collision_across_files(docs_by_file))
# Emit warnings first (non-blocking)
for w in warnings:
print(w)
if not fatal_errors:
n = len(yml_paths)
print(
f"::notice::lint-workflow-yaml: {n} workflow file(s) checked, "
f"no fatal Gitea-1.22.6-hostile shapes. "
f"({len(warnings)} heuristic warning(s) emitted.)"
)
return 0
# Emit fatal errors
print(
f"::error::lint-workflow-yaml: {len(fatal_errors)} fatal violation(s) "
f"across {len(yml_paths)} workflow file(s). See rule documentation "
f"in .gitea/scripts/lint-workflow-yaml.py docstring."
)
for e in fatal_errors:
print(e)
return 1
if __name__ == "__main__":
sys.exit(main())
@@ -1,509 +0,0 @@
#!/usr/bin/env python3
"""lint_bp_context_emit_match — Tier 2f per internal#350.
Rule
----
For a given protected branch, every context in
`branch_protections/<branch>.status_check_contexts` MUST be emitted
by at least one workflow in `.gitea/workflows/*.yml`. Two contexts
match when:
1. The workflow's `name:` equals the context's workflow-part (the
prefix before ` / `).
2. Some job in that workflow has a `name:` (or default-fallback
job-key) equal to the context's job-part (between ` / ` and
` (`).
3. The workflow's `on:` block includes the context's event-part
(in parens at the end), with Gitea's event-name mapping:
- `pull_request` and `pull_request_target` BOTH emit
`(pull_request)` contexts (verified empirically on
molecule-core/main).
- `push` emits `(push)`.
A BP context with no emitter blocks merges forever — Gitea treats
absent-as-`pending`, NOT absent-as-`skipped`-as-`success`. This is
the phantom-required-check class
(`feedback_phantom_required_check_after_gitea_migration`).
The inverse direction (emitter without BP context) is INFORMATIONAL
only — Tier 2g handles that direction at PR-time. Flagging it here
on a daily schedule would falsely surface every transitional state
during a BP rollout.
How the gate works
------------------
Daily scheduled run + workflow_dispatch:
1. GET `branch_protections/{BRANCH}` (needs DRIFT_BOT_TOKEN with
repo-admin scope; same persona as ci-required-drift.yml).
Graceful-degrade on 403/404 per Tier 2a contract.
2. Walk `.gitea/workflows/*.yml` via PyYAML AST. For each workflow,
enumerate its emitted contexts: `{workflow.name} / {job.name or
job-key} ({event})` for each event in `on:` that emits a status.
3. For each BP context, look for an emitter match. Aggregate
orphans.
4. If orphans exist:
- File or PATCH a `[ci-bp-drift]` issue (idempotency contract:
search for exact title prefix, edit existing if open).
- Apply labels `tier:high` + `ci-bp-drift` (lookup IDs per
repo; per `feedback_tier_label_ids_are_per_repo`).
- Exit 1.
5. If no orphans:
- Close any existing `[ci-bp-drift]` issue with a clean-state
comment.
- Exit 0.
Exit codes
----------
0 — clean OR API 403/404 (graceful-degrade, surfaces ::error::).
1 — at least one BP context has no emitter.
2 — env contract violation, workflows-dir missing, or YAML parse
error.
Env
---
GITEA_TOKEN — DRIFT_BOT_TOKEN (repo-admin for branch_protections)
GITEA_HOST — e.g. git.moleculesai.app
REPO — owner/name
BRANCH — defaults to `main`
WORKFLOWS_DIR — defaults to `.gitea/workflows`
DRIFT_LABEL — defaults to `ci-bp-drift`
Memory cross-links
------------------
- internal#350 (the RFC that specs this lint)
- feedback_phantom_required_check_after_gitea_migration
- feedback_tier_label_ids_are_per_repo
- reference_post_suspension_pipeline
"""
from __future__ import annotations
import json
import os
import re
import sys
import urllib.error
import urllib.parse
import urllib.request
from pathlib import Path
from typing import Any
try:
import yaml
except ImportError:
sys.stderr.write(
"::error::PyYAML is required. Install with: pip install PyYAML\n"
)
sys.exit(2)
# Status-check context regex (mirrors lint-required-no-paths.py).
_CONTEXT_RE = re.compile(
r"^(?P<workflow>.+?) / (?P<job>.+) \((?P<event>[^)]+)\)$"
)
# Map a workflow `on:` event-key to the context's event-part. Gitea's
# emitter convention (verified on molecule-core):
# - pull_request → `(pull_request)`
# - pull_request_target → `(pull_request)` (same surface)
# - push → `(push)`
# - schedule → no PR status; scheduled runs don't post
# commit-statuses unless the workflow itself does so explicitly.
# - workflow_dispatch → manually dispatched runs may or may not
# emit; safest to treat as "no PR status" (informational notice
# only).
_EVENT_MAP = {
"pull_request": "pull_request",
"pull_request_target": "pull_request",
"push": "push",
}
# ---------------------------------------------------------------------------
# Env
# ---------------------------------------------------------------------------
def _env(key: str, default: str | None = None) -> str:
v = os.environ.get(key, default)
return v if v is not None else ""
def _require_env(key: str) -> str:
v = os.environ.get(key)
if not v:
sys.stderr.write(f"::error::missing required env var: {key}\n")
sys.exit(2)
return v
# ---------------------------------------------------------------------------
# API helper. Mirrors lint-required-no-paths.py's contract: returns
# (status, payload) tuple with status ∈ {"ok", "not_found", "forbidden",
# "error"}.
# ---------------------------------------------------------------------------
def api(
method: str,
path: str,
*,
body: dict | None = None,
query: dict[str, str] | None = None,
) -> tuple[str, Any]:
host = _env("GITEA_HOST")
token = _env("GITEA_TOKEN")
url = f"https://{host}/api/v1{path}"
if query:
url = f"{url}?{urllib.parse.urlencode(query)}"
data = None
headers = {
"Authorization": f"token {token}",
"Accept": "application/json",
}
if body is not None:
data = json.dumps(body).encode("utf-8")
headers["Content-Type"] = "application/json"
req = urllib.request.Request(
url, method=method, data=data, headers=headers
)
try:
with urllib.request.urlopen(req, timeout=30) as resp:
raw = resp.read()
if not raw:
return ("ok", None)
return ("ok", json.loads(raw))
except urllib.error.HTTPError as e:
if e.code == 404:
return ("not_found", None)
if e.code in (401, 403):
return ("forbidden", None)
return ("error", None)
except (urllib.error.URLError, TimeoutError, json.JSONDecodeError):
return ("error", None)
# ---------------------------------------------------------------------------
# Helpers
# ---------------------------------------------------------------------------
def _get_on(d: Any) -> Any:
"""YAML 1.1 boolean quirk: bare `on:` may parse to True. Handle both."""
if not isinstance(d, dict):
return None
if "on" in d:
return d["on"]
if True in d:
return d[True]
return None
def _on_events(doc: Any) -> set[str]:
"""Return the set of event keys in a workflow's `on:` block.
Accepts all three shapes (string / list / mapping). String/list
shapes can't carry filters but they DO emit. Returns the
Gitea-mapped event names per `_EVENT_MAP`.
"""
on = _get_on(doc)
raw_events: set[str] = set()
if on is None:
return raw_events
if isinstance(on, str):
raw_events.add(on)
elif isinstance(on, list):
for e in on:
if isinstance(e, str):
raw_events.add(e)
elif isinstance(on, dict):
for k in on:
if isinstance(k, str):
raw_events.add(k)
return {_EVENT_MAP[e] for e in raw_events if e in _EVENT_MAP}
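# Illustration (not part of the original script): a sketch of the three
# `on:` shapes and the boolean-key quirk, using plain dicts in place of
# parsed YAML. The dict literals below are assumptions about what a
# YAML 1.1 loader produces for the quoted snippets; `on_events` is a
# condensed, hypothetical stand-in for `_get_on` plus `_on_events`.

```python
from typing import Any

EVENT_MAP = {
    "pull_request": "pull_request",
    "pull_request_target": "pull_request",
    "push": "push",
}


def on_events(doc: Any) -> set:
    """Mapped event names for a parsed workflow, tolerating `on` parsed as True."""
    if not isinstance(doc, dict):
        return set()
    # A YAML 1.1 loader may turn the bare key `on` into the boolean True.
    on = doc["on"] if "on" in doc else doc.get(True)
    if isinstance(on, str):
        raw = {on}
    elif isinstance(on, (list, dict)):
        raw = {e for e in on if isinstance(e, str)}
    else:
        raw = set()
    # Events without a status surface (schedule, workflow_dispatch) drop out.
    return {EVENT_MAP[e] for e in raw if e in EVENT_MAP}


string_shape = {True: "push"}                                    # on: push
list_shape = {True: ["pull_request", "schedule"]}                # on: [pull_request, schedule]
mapping_shape = {"on": {"pull_request_target": {}, "push": {}}}  # "on": with a quoted key

print(on_events(string_shape), on_events(list_shape), on_events(mapping_shape))
```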
def _job_display(jbody: dict, jkey: str) -> str:
"""Return job's `name:` if set, else fall back to the job-key.
Gitea formats status contexts with the job's `name:` when set;
when unset it uses the job key. Matches lint-required-no-paths
convention.
"""
n = jbody.get("name") if isinstance(jbody, dict) else None
if isinstance(n, str) and n:
return n
return jkey
def workflow_contexts(doc: Any) -> set[str]:
"""Return the set of contexts a workflow emits."""
contexts: set[str] = set()
if not isinstance(doc, dict):
return contexts
wf_name = doc.get("name")
if not isinstance(wf_name, str) or not wf_name:
return contexts # no name => no addressable context
events = _on_events(doc)
if not events:
return contexts
jobs = doc.get("jobs")
if not isinstance(jobs, dict):
return contexts
for jkey, jbody in jobs.items():
if jkey == "__lines__": # tolerate line-tracking annotations
continue
if not isinstance(jbody, dict):
continue
disp = _job_display(jbody, jkey)
for ev in events:
contexts.add(f"{wf_name} / {disp} ({ev})")
return contexts
def parse_context(ctx: str) -> tuple[str, str, str] | None:
m = _CONTEXT_RE.match(ctx)
if not m:
return None
return (m.group("workflow"), m.group("job"), m.group("event"))
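# Illustration (not part of the original script): a sketch of context
# parsing and the two set differences the lint is built on. The context
# strings are hypothetical; the regex is copied from `_CONTEXT_RE`.

```python
import re
from typing import Optional, Tuple

CONTEXT_RE = re.compile(r"^(?P<workflow>.+?) / (?P<job>.+) \((?P<event>[^)]+)\)$")


def parse_context(ctx: str) -> Optional[Tuple[str, str, str]]:
    """Split `workflow / job (event)` into its three parts, or return None."""
    m = CONTEXT_RE.match(ctx)
    return (m.group("workflow"), m.group("job"), m.group("event")) if m else None


# Hypothetical branch-protection contexts and workflow-emitted contexts.
bp_contexts = {
    "CI / Platform (Go) (pull_request)",
    "CI / all-required (pull_request)",
    "Secret scan / scan (pull_request)",
}
emitted = {
    "CI / Platform (Go) (pull_request)",
    "CI / all-required (pull_request)",
    "CI / all-required (push)",
}

orphans = sorted(bp_contexts - emitted)        # required, but nothing emits them
informational = sorted(emitted - bp_contexts)  # emitted, but not required
parsed = parse_context("CI / Platform (Go) (pull_request)")
print(orphans, informational, parsed)
```

# A job name that itself contains parentheses still parses, because only
# the final parenthesised group is taken as the event.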
def _iter_workflow_files(wf_dir: Path) -> list[Path]:
return sorted(list(wf_dir.glob("*.yml")) + list(wf_dir.glob("*.yaml")))
# ---------------------------------------------------------------------------
# Issue idempotency — search for an open issue with the canonical
# title prefix; PATCH if found, POST if not. Mirrors ci-required-drift.
# ---------------------------------------------------------------------------
def _canonical_title(repo: str, branch: str) -> str:
return f"[ci-bp-drift] {repo}/{branch}: BP→emitter mismatch"
def _ensure_labels(repo: str, names: list[str]) -> list[int]:
status, labels = api("GET", f"/repos/{repo}/labels", query={"limit": "50"})
if status != "ok" or not isinstance(labels, list):
return []
out: list[int] = []
by_name = {l["name"]: l["id"] for l in labels if isinstance(l, dict)}
for n in names:
if n in by_name:
out.append(by_name[n])
return out
def file_or_update_issue(
repo: str, branch: str, orphans: list[str], emitter_orphans: list[str]
) -> None:
title = _canonical_title(repo, branch)
body_lines = [
f"BP→emitter drift detected on `{branch}` at "
f"{os.environ.get('GITHUB_RUN_URL', '(run url unavailable)')}.",
"",
f"## Orphan BP contexts ({len(orphans)})",
"",
"These contexts are required by branch protection but NO workflow "
"emits them. PRs merging into this branch will wait forever for a "
"status that never arrives (Gitea treats absent-as-`pending`, NOT "
"absent-as-`skipped`). See "
"`feedback_phantom_required_check_after_gitea_migration`.",
"",
]
for o in orphans:
body_lines.append(f"- `{o}`")
if emitter_orphans:
body_lines += [
"",
f"## Workflows emitting contexts NOT in BP ({len(emitter_orphans)})",
"",
"Informational — Tier 2g handles this direction at PR-time. "
"Listed here for completeness.",
"",
]
for o in emitter_orphans:
body_lines.append(f"- `{o}`")
body_lines += [
"",
"Fix options:",
f" 1. PATCH `branch_protections/{branch}.status_check_contexts` "
" to remove the orphan.",
" 2. Restore the emitting workflow (if it was deleted/renamed).",
"",
"Linted by `.gitea/workflows/lint-bp-context-emit-match.yml` "
"(Tier 2f, internal#350).",
]
body = "\n".join(body_lines)
# Idempotency search — find an open issue with the canonical title.
status, hits = api(
"GET",
f"/repos/{repo}/issues",
query={
"type": "issues",
"state": "open",
"q": title,
},
)
existing = None
if status == "ok" and isinstance(hits, list):
for h in hits:
if (
isinstance(h, dict)
and h.get("state") == "open"
and isinstance(h.get("title"), str)
and h["title"].startswith(title)
):
existing = h
break
label_ids = _ensure_labels(repo, ["ci-bp-drift", "tier:high"])
if existing:
api(
"PATCH",
f"/repos/{repo}/issues/{existing['number']}",
body={"body": body, "labels": label_ids} if label_ids else {"body": body},
)
print(
f"::notice::Updated existing drift issue "
f"#{existing['number']}: {existing.get('html_url', '')}"
)
else:
status, posted = api(
"POST",
f"/repos/{repo}/issues",
body={"title": title, "body": body, "labels": label_ids},
)
if status == "ok" and isinstance(posted, dict):
print(
f"::notice::Filed new drift issue "
f"#{posted.get('number')}: {posted.get('html_url', '')}"
)
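# Illustration (not part of the original script): a sketch of the
# idempotency check in isolation, assuming the search endpoint returned a
# list of issue dicts. `pick_existing`, the repo name, and the issue
# numbers are hypothetical.

```python
from typing import Any, Optional


def pick_existing(hits: Any, title: str) -> Optional[dict]:
    """First open issue whose title starts with the canonical title, else None."""
    if not isinstance(hits, list):
        return None
    for h in hits:
        if (
            isinstance(h, dict)
            and h.get("state") == "open"
            and isinstance(h.get("title"), str)
            and h["title"].startswith(title)
        ):
            return h
    return None


title = "[ci-bp-drift] example/repo/main: BP→emitter mismatch"
hits = [
    {"number": 7, "state": "closed", "title": title},           # closed: ignored
    {"number": 9, "state": "open", "title": "unrelated issue"},  # fuzzy search hit: ignored
    {"number": 12, "state": "open", "title": title},             # reused via PATCH
]
existing = pick_existing(hits, title)
print(existing["number"] if existing else "would POST a new issue")
```

# Re-checking the title prefix guards against a search query that returns
# loosely matching issues.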
# ---------------------------------------------------------------------------
# Driver
# ---------------------------------------------------------------------------
def run() -> int:
_require_env("GITEA_TOKEN")
_require_env("GITEA_HOST")
repo = _require_env("REPO")
branch = _env("BRANCH", "main")
wf_dir = Path(_env("WORKFLOWS_DIR", ".gitea/workflows"))
if not wf_dir.is_dir():
sys.stderr.write(f"::error::workflows directory not found: {wf_dir}\n")
return 2
# 1. Pull BP.
status, bp = api("GET", f"/repos/{repo}/branch_protections/{branch}")
if status == "forbidden":
sys.stderr.write(
f"::error::GET branch_protections/{branch} returned HTTP 403 — "
f"DRIFT_BOT_TOKEN lacks repo-admin scope (Gitea 1.22.6 requires "
f"it for this endpoint). Skipping lint with exit 0 to avoid "
f"red-X on every run. Fix: grant repo-admin to mc-drift-bot. "
f"Per Tier 2a contract.\n"
)
return 0
if status == "not_found":
print(
f"::notice::branch '{branch}' has no protection configured; "
f"nothing to lint."
)
return 0
if status != "ok" or not isinstance(bp, dict):
sys.stderr.write(
f"::error::branch_protections/{branch} response unexpected; "
f"status={status}. Treating as transient; exit 0.\n"
)
return 0
bp_contexts: list[str] = list(bp.get("status_check_contexts") or [])
if not bp_contexts:
print(
f"::notice::branch_protections/{branch} has 0 required "
f"status_check_contexts; nothing to lint."
)
return 0
# 2. Enumerate emitter contexts from all workflows.
all_emitter: set[str] = set()
for path in _iter_workflow_files(wf_dir):
try:
doc = yaml.safe_load(path.read_text(encoding="utf-8"))
except yaml.YAMLError as e:
sys.stderr.write(
f"::error file={path}::YAML parse error: {e}; skipping.\n"
)
continue
all_emitter |= workflow_contexts(doc)
print(
f"::notice::Linting {len(bp_contexts)} BP context(s) for {branch} "
f"against {len(all_emitter)} workflow-emitted context(s)."
)
bp_set = set(bp_contexts)
# 3. Find orphans (BP-side: required but no emitter).
bp_orphans = sorted(bp_set - all_emitter)
# Informational: workflow emits but BP doesn't list. Tier 2g
# territory at PR-time. We list these as NOTICE only.
emitter_orphans = sorted(all_emitter - bp_set)
if bp_orphans:
print(
f"::error::Found {len(bp_orphans)} BP context(s) with no "
f"emitter — these would block merges forever (Gitea treats "
f"absent-as-pending, not skipped):"
)
for o in bp_orphans:
# Closest-match hint: list up to three emitted contexts from the
# workflow with the same name-part (same workflow, different job
# or event). Near-miss (typo) workflow names are not matched.
parsed = parse_context(o)
hint = ""
if parsed:
wf, _job, _ev = parsed
candidates = sorted(
{c for c in all_emitter if c.startswith(wf + " / ")}
)
if candidates:
hint = (
f" — closest emitter(s): {', '.join(candidates[:3])}"
)
print(f"::error:: - {o}{hint}")
if emitter_orphans:
print(
f"::notice::Also: {len(emitter_orphans)} workflow-emitted "
f"context(s) not in BP (informational; Tier 2g handles at "
f"PR-time):"
)
for o in emitter_orphans:
print(f"::notice:: - {o}")
# File / patch tracking issue.
try:
file_or_update_issue(repo, branch, bp_orphans, emitter_orphans)
except Exception as e:
sys.stderr.write(
f"::error::failed to file drift issue: {e}\n"
)
return 1
if emitter_orphans:
print(
f"::notice::{len(emitter_orphans)} workflow-emitted context(s) "
f"not in BP (informational; Tier 2g handles at PR-time):"
)
for o in emitter_orphans:
print(f"::notice:: - {o}")
print(
f"::notice::BP/emitter match clean: all {len(bp_contexts)} required "
f"context(s) have an emitter."
)
return 0
if __name__ == "__main__":
sys.exit(run())
@@ -1,438 +0,0 @@
#!/usr/bin/env python3
"""lint_continue_on_error_tracking — Tier 2e per internal#350.
Rule
----
Every `continue-on-error: true` directive in `.gitea/workflows/*.yml`
must be accompanied by a tracker reference comment within 2 lines
(above OR below the directive's line). The reference is one of:
* `# mc#NNNN` — molecule-core issue
* `# internal#NNNN` — molecule-ai/internal issue
The referenced issue must satisfy ALL of:
1. Exists (HTTP 200 on `/repos/{owner}/{name}/issues/{num}`)
2. `state == "open"`
3. `created_at` is ≤ MAX_AGE_DAYS days ago (default 14)
A passing reference establishes an audit trail and a forced renewal
cadence — after 14 days the issue must either be CLOSED (the masked
defect was fixed) or the comment must point at a NEW tracker
(deliberate decision to keep masking, requires a paper-trail).
The class this prevents
-----------------------
Phase-3-masked failures. `continue-on-error: true` on `platform-build`
had been hiding mc#664-class regressions for ~3 weeks before #656
surfaced them on 2026-05-12. A 14-day cap forces a tracker review
cycle and surfaces mask-drift within at most 14 days of the original
defect.
Behaviour-based gate
--------------------
We parse via PyYAML AST (per `feedback_behavior_based_ast_gates`) to
detect `continue-on-error: <truthy>` at job-key level, then map each
location back to its source line via PyYAML's line-tracking loader.
Comments are scanned from the raw text within a 2-line window of
that source line. Reformatting (block-scalar vs flow-style) does not
break the rule because the source-line anchor is the directive's
own line.
Exit codes
----------
0 — every `continue-on-error: true` has a passing tracker, OR
the issue-API endpoint returned 403/404 (token-scope; graceful
degrade per Tier 2a contract — surface via ::error:: on stderr
but don't red-X every PR over auth).
1 — at least one violation (missing/closed/too-old/non-existent
tracker).
2 — env contract violation, YAML parse error, or workflows-dir
missing.
Env
---
GITEA_TOKEN — read scope on the configured repos.
Auto-injected `GITHUB_TOKEN` works for same-repo
issue reads; for `internal#NNN` we need a token
with `molecule-ai/internal` read scope. Use
DRIFT_BOT_TOKEN (same persona as other Tier 2
lints).
GITEA_HOST — e.g. git.moleculesai.app
REPO — `owner/name` for `mc#NNNN` lookups
INTERNAL_REPO — `owner/name` for `internal#NNNN` lookups
(defaults to derived `molecule-ai/internal`)
WORKFLOWS_DIR — defaults to `.gitea/workflows`
MAX_AGE_DAYS — defaults to 14
Memory cross-links
------------------
- internal#350 (the RFC that specs this lint)
- mc#664 (the masked-3-weeks empirical case)
- feedback_chained_defects_in_never_tested_workflows
- feedback_behavior_based_ast_gates
- feedback_strict_root_only_after_class_a
"""
from __future__ import annotations
import json
import os
import re
import sys
import urllib.error
import urllib.parse
import urllib.request
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import Any
try:
import yaml
except ImportError:
sys.stderr.write(
"::error::PyYAML is required. Install with: pip install PyYAML\n"
)
sys.exit(2)
# ---------------------------------------------------------------------------
# Tracker comment regex.
# Matches: `# mc#1234`, `# internal#42`, `# mc#1234 - description`
# Also matches trackers embedded mid-sentence: `# see mc#1234 for details`
# Does NOT match: `# mc1234` (missing inner #) or `# MC#1234`
# (case-sensitive). The pattern is not anchored to a leading comment
# `#`: it is searched across the whole line, so a bare `mc#1234`
# outside a comment also matches. This fixes the false-negative when
# the tracker appears mid-sentence (e.g. `internal#350` after prose).
TRACKER_RE = re.compile(
r"(?P<slug>mc|internal)#(?P<num>\d+)\b"
)
# Truthy continue-on-error values we treat as "true". PyYAML decodes
# `continue-on-error: true` to Python `True`. `continue-on-error: "true"`
# decodes to the string "true" — Gitea's evaluator coerces strings,
# so we treat string-`"true"` (case-insensitive) as truthy too.
def _is_truthy_coe(v: Any) -> bool:
if v is True:
return True
if isinstance(v, str) and v.strip().lower() == "true":
return True
return False
# ---------------------------------------------------------------------------
# Env contract
# ---------------------------------------------------------------------------
def _env(key: str, default: str | None = None) -> str:
v = os.environ.get(key, default)
return v if v is not None else ""
def _require_env(key: str) -> str:
v = os.environ.get(key)
if not v:
sys.stderr.write(f"::error::missing required env var: {key}\n")
sys.exit(2)
return v
# ---------------------------------------------------------------------------
# PyYAML line-tracking loader. yaml.SafeLoader nodes carry
# `start_mark.line` (0-based); using construct_mapping with `deep=True`
# preserves that on every node. We need the line of each
# `continue-on-error` key so we can scan the source for comments
# near it.
# ---------------------------------------------------------------------------
class _LineLoader(yaml.SafeLoader):
"""SafeLoader that annotates every dict with `__lines__: {key: line}`."""
def _construct_mapping(loader: yaml.SafeLoader, node: yaml.MappingNode) -> dict:
mapping = loader.construct_mapping(node, deep=True)
# Annotate per-key source lines so we can locate `continue-on-error`.
lines: dict[str, int] = {}
for k_node, _v_node in node.value:
try:
key = loader.construct_object(k_node, deep=True)
except Exception:
continue
if isinstance(key, (str, int, bool)):
lines[str(key)] = k_node.start_mark.line + 1 # 1-based
if isinstance(mapping, dict):
mapping["__lines__"] = lines
return mapping
_LineLoader.add_constructor(
yaml.resolver.BaseResolver.DEFAULT_MAPPING_TAG, _construct_mapping
)
# ---------------------------------------------------------------------------
# Issue lookup
# ---------------------------------------------------------------------------
def fetch_issue(slug_kind: str, num: int) -> tuple[str, dict | None]:
"""Return `(status, payload_or_none)`.
status ∈ {"ok", "not_found", "forbidden", "error"}.
"""
repo = (
_env("REPO") if slug_kind == "mc" else _env("INTERNAL_REPO")
)
if not repo:
# Fall through gracefully — caller treats as 403 (token-scope).
return ("forbidden", None)
host = _env("GITEA_HOST")
token = _env("GITEA_TOKEN")
url = f"https://{host}/api/v1/repos/{repo}/issues/{num}"
req = urllib.request.Request(
url,
headers={
"Authorization": f"token {token}",
"Accept": "application/json",
},
)
try:
with urllib.request.urlopen(req, timeout=20) as resp:
return ("ok", json.loads(resp.read()))
except urllib.error.HTTPError as e:
if e.code == 404:
return ("not_found", None)
if e.code in (401, 403):
return ("forbidden", None)
return ("error", None)
except (urllib.error.URLError, TimeoutError, json.JSONDecodeError):
return ("error", None)
# ---------------------------------------------------------------------------
# Locate every continue-on-error: <truthy> in a workflow doc, with line.
# ---------------------------------------------------------------------------
def find_coe_truthies(
doc: Any, raw_lines: list[str]
) -> list[tuple[str, int]]:
"""Return list of (job_key, source_line_1based).
`doc` is the LineLoader-parsed mapping. We descend `jobs.<key>` and
return only those whose value is truthy per `_is_truthy_coe`.
Job-step continue-on-error is intentionally NOT considered: it
suppresses step-level failure rollup only, not job-level. The
masking class this lint targets is the job-level rollup.
"""
out: list[tuple[str, int]] = []
if not isinstance(doc, dict):
return out
jobs = doc.get("jobs")
if not isinstance(jobs, dict):
return out
for jkey, jbody in jobs.items():
if jkey == "__lines__":
continue
if not isinstance(jbody, dict):
continue
if "continue-on-error" not in jbody:
continue
v = jbody["continue-on-error"]
if not _is_truthy_coe(v):
continue
line = jbody.get("__lines__", {}).get("continue-on-error")
if not line:
# PyYAML line-tracking shouldn't miss but guard for safety.
# Fall back to grepping the raw text.
line = _grep_first_coe_line(raw_lines, jkey) or 1
out.append((str(jkey), int(line)))
return out
def _grep_first_coe_line(raw_lines: list[str], jkey: str) -> int | None:
"""Fallback: find the first `continue-on-error:` line after a `jkey:` line."""
saw_job = False
for i, line in enumerate(raw_lines, start=1):
if re.match(rf"^\s*{re.escape(jkey)}\s*:", line):
saw_job = True
continue
if saw_job and "continue-on-error" in line:
return i
return None
# ---------------------------------------------------------------------------
# Scan window for tracker comment
# ---------------------------------------------------------------------------
WINDOW = 2 # lines above OR below the directive's line (inclusive)
def find_tracker_in_window(
raw_lines: list[str], line_1based: int
) -> tuple[str, int] | None:
"""Return (slug, num) if a `# mc#NNN`/`# internal#NNN` appears
in raw_lines within ±WINDOW lines of `line_1based`. None otherwise.
We scan the directive's own line (it may carry an inline comment
like `continue-on-error: true # mc#3`) plus ±WINDOW.
"""
lo = max(1, line_1based - WINDOW)
hi = min(len(raw_lines), line_1based + WINDOW)
for i in range(lo, hi + 1):
line = raw_lines[i - 1]
# TRACKER_RE is searched across the whole line, not only the part
# after `#`, so trailing-inline comments on the directive line match.
m = TRACKER_RE.search(line)
if m:
return (m.group("slug"), int(m.group("num")))
return None
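# Illustration (not part of the original script): a sketch of the window
# scan on an invented workflow fragment. `find_tracker` mirrors
# `find_tracker_in_window`; the job names and issue numbers are made up.

```python
import re
from typing import List, Optional, Tuple

TRACKER_RE = re.compile(r"(?P<slug>mc|internal)#(?P<num>\d+)\b")
WINDOW = 2


def find_tracker(raw_lines: List[str], line_1based: int) -> Optional[Tuple[str, int]]:
    """First tracker reference within WINDOW lines of the directive (inclusive)."""
    lo = max(1, line_1based - WINDOW)
    hi = min(len(raw_lines), line_1based + WINDOW)
    for i in range(lo, hi + 1):
        m = TRACKER_RE.search(raw_lines[i - 1])
        if m:
            return (m.group("slug"), int(m.group("num")))
    return None


# Hypothetical workflow text; directives sit on lines 5, 10 and 15.
workflow = """\
jobs:
  platform-build:
    # masked while the flake is investigated, see mc#1234
    runs-on: ubuntu-latest
    continue-on-error: true
  docs:
    runs-on: ubuntu-latest
    steps: []
    # internal#42
    continue-on-error: true
  far-away:
    # mc#9 sits three lines above its directive
    runs-on: ubuntu-latest
    timeout-minutes: 5
    continue-on-error: true
""".splitlines()

found = [find_tracker(workflow, n) for n in (5, 10, 15)]
print(found)
```

# The third directive has no tracker inside its window, so it would be
# reported as a violation even though a reference exists further up.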
# ---------------------------------------------------------------------------
# Tracker validation
# ---------------------------------------------------------------------------
def validate_tracker(
slug: str, num: int, max_age_days: int
) -> tuple[bool, str]:
"""Return (ok?, reason). On 403, ok=True is returned with reason
explaining graceful-degrade — caller treats 403 as a non-fatal
skip (same as Tier 2a contract).
"""
status, payload = fetch_issue(slug, num)
if status == "forbidden":
sys.stderr.write(
f"::error::issue {slug}#{num} unreadable (HTTP 403 — token "
f"scope). Cannot validate; skipping this check to avoid "
f"red-X on every PR. Fix the token, not the lint.\n"
)
return (True, "forbidden — skipped")
if status == "not_found":
return (False, f"{slug}#{num} does not exist (404)")
if status == "error":
sys.stderr.write(
f"::error::issue {slug}#{num} fetch errored — treating as "
f"unverified, skipping this check.\n"
)
return (True, "fetch-error — skipped")
assert payload is not None
state = payload.get("state", "")
if state != "open":
return (False, f"{slug}#{num} state={state!r} (must be open)")
created = payload.get("created_at", "")
try:
# Gitea returns ISO-8601 with timezone. `fromisoformat` accepts
# a `Z` suffix only from Python 3.11; the explicit replace keeps
# older runtimes working.
created_dt = datetime.fromisoformat(created.replace("Z", "+00:00"))
except ValueError:
return (False, f"{slug}#{num} created_at unparseable: {created!r}")
age = datetime.now(timezone.utc) - created_dt
# Inclusive boundary at MAX_AGE_DAYS: `age.days` truncates to a
# whole-day floor, so an issue created 14d 0h 5m ago has
# `age.days == 14` and passes; one created 15d 0h 0m ago has
# `age.days == 15` and fails. This is the convention specified
# in internal#350 ("≤14 days old").
if age.days > max_age_days:
return (
False,
f"{slug}#{num} is {age.days} days old (>{max_age_days}d cap). "
f"Close-or-renew the tracker.",
)
return (True, f"{slug}#{num} open, {age.days}d old, ≤{max_age_days}d")
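# Illustration (not part of the original script): a sketch of the inclusive
# age boundary with a fixed clock. `is_within_cap` is a hypothetical helper
# and every timestamp is invented; only the `age.days > max_age_days`
# comparison comes from `validate_tracker`.

```python
from datetime import datetime, timedelta, timezone

MAX_AGE_DAYS = 14


def is_within_cap(created_at: str, now: datetime, max_age_days: int = MAX_AGE_DAYS) -> bool:
    """True when the whole-day age of `created_at` is at most `max_age_days`."""
    # Replace a trailing `Z` so runtimes older than Python 3.11 can parse it.
    created = datetime.fromisoformat(created_at.replace("Z", "+00:00"))
    return (now - created).days <= max_age_days


# Fixed "now" keeps the example deterministic.
now = datetime(2026, 5, 20, 12, 0, tzinfo=timezone.utc)
fresh = (now - timedelta(days=14, minutes=5)).strftime("%Y-%m-%dT%H:%M:%SZ")
stale = (now - timedelta(days=15)).strftime("%Y-%m-%dT%H:%M:%SZ")

print(fresh, is_within_cap(fresh, now))
print(stale, is_within_cap(stale, now))
```

# `timedelta.days` floors to whole days, so 14 days and 5 minutes still
# counts as 14 and passes, while exactly 15 days fails.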
# ---------------------------------------------------------------------------
# Driver
# ---------------------------------------------------------------------------
def _iter_workflow_files(wf_dir: Path) -> list[Path]:
return sorted(list(wf_dir.glob("*.yml")) + list(wf_dir.glob("*.yaml")))
def run() -> int:
wf_dir = Path(_env("WORKFLOWS_DIR", ".gitea/workflows"))
max_age = int(_env("MAX_AGE_DAYS", "14"))
# Defaults for INTERNAL_REPO when unset (best-effort guess based on
# the convention `mc#` = same repo, `internal#` = molecule-ai/internal).
if not os.environ.get("INTERNAL_REPO"):
os.environ["INTERNAL_REPO"] = "molecule-ai/internal"
if not wf_dir.is_dir():
sys.stderr.write(
f"::error::workflows directory not found: {wf_dir}\n"
)
return 2
yml_files = _iter_workflow_files(wf_dir)
if not yml_files:
print(f"::notice::no workflow files under {wf_dir}; nothing to lint.")
return 0
violations: list[str] = []
notices: list[str] = []
total_coe_true = 0
for path in yml_files:
raw = path.read_text(encoding="utf-8")
raw_lines = raw.splitlines()
try:
doc = yaml.load(raw, Loader=_LineLoader)
except yaml.YAMLError as e:
sys.stderr.write(
f"::error file={path}::YAML parse error: {e}. Skipping "
f"this file (lint-workflow-yaml will catch separately).\n"
)
continue
coe_locs = find_coe_truthies(doc, raw_lines)
for jkey, line in coe_locs:
total_coe_true += 1
tracker = find_tracker_in_window(raw_lines, line)
if tracker is None:
violations.append(
f"::error file={path},line={line}::lint-continue-on-error-"
f"tracking (Tier 2e): job '{jkey}' has "
f"`continue-on-error: true` at line {line} with no "
f"`# mc#NNNN` or `# internal#NNNN` tracker comment "
f"within {WINDOW} lines. Add a tracker reference so "
f"this mask has a forced {max_age}-day renewal cycle. "
f"Memory: feedback_chained_defects_in_never_tested_workflows."
)
continue
slug, num = tracker
ok, reason = validate_tracker(slug, num, max_age)
if ok:
notices.append(
f"::notice::{path.name} job '{jkey}' (line {line}): "
f"{reason}"
)
else:
violations.append(
f"::error file={path},line={line}::lint-continue-on-error-"
f"tracking (Tier 2e): job '{jkey}' "
f"`continue-on-error: true` references {slug}#{num}, "
f"but {reason}. FIX: close/fix the underlying defect "
f"and flip continue-on-error: false, OR file a fresh "
f"tracker and update the comment."
)
for n in notices:
print(n)
if violations:
print(
f"::error::lint-continue-on-error-tracking: "
f"{len(violations)} violation(s) across {len(yml_files)} "
f"workflow file(s) (of {total_coe_true} `continue-on-error: "
f"true` directives in total)."
)
for v in violations:
print(v)
return 1
print(
f"::notice::lint-continue-on-error-tracking: "
f"all {total_coe_true} `continue-on-error: true` directive(s) "
f"have valid trackers (open, ≤{max_age}d old)."
)
return 0
if __name__ == "__main__":
sys.exit(run())
@@ -1,361 +0,0 @@
#!/usr/bin/env python3
"""lint_mask_pr_atomicity — Tier 2d structural enforcement per internal#350.
Rule
----
A PR whose diff touches `.gitea/workflows/ci.yml` AND modifies EITHER:
- any `continue-on-error:` value, OR
- the `all-required` sentinel job's `needs:` block
must EITHER:
- Touch BOTH atomically in the same PR (preferred), OR
- Cross-link the paired PR via a literal `Paired: #NNN` reference in
the PR body OR in any commit message between BASE_SHA and HEAD_SHA.
The class this prevents
-----------------------
PR#665 (interim `continue-on-error: true` on `platform-build`) and
PR#668 (sentinel-`needs` demotion of the same job) were designed as a
pair but merged solo — #665 landed at 04:47Z 2026-05-12, #668 was still
open at 05:07Z when the main-red watchdog (#674) fired. Result: ~20
minutes of `main` red and a cascade of false-positives on unrelated PRs.
The lint operates on the YAML AST (PyYAML), not grep, per
`feedback_behavior_based_ast_gates`: a refactor that moves `continue-on-error`
between job keys, or renames the `all-required` job, would still be
detected because we walk the parsed structure.
Why this works on Gitea 1.22.6
------------------------------
We don't use any 1.22.6-missing endpoints (no `/actions/runs/*`, no
`branch_protections/*` — Tier 2f/g need those; Tier 2d does not). All
required inputs come from the workflow `pull_request` event payload
(BASE_SHA, HEAD_SHA, PR_BODY) and from local git via `git show`/`git log`.
The auto-injected `GITHUB_TOKEN` is enough; we don't need
DRIFT_BOT_TOKEN.
Exit codes
----------
0 — ci.yml not in diff, OR diff is no-op for the rule predicates,
OR atomicity satisfied (both touched), OR a valid `Paired: #NNN`
reference is present.
1 — exactly ONE of {coe, sentinel-needs} touched AND no valid
`Paired: #NNN` reference. The split-pair regression class.
2 — env contract violation (BASE_SHA / HEAD_SHA missing) or YAML
parse error on either side.
Env
---
BASE_SHA — PR base (pull_request.base.sha)
HEAD_SHA — PR head (pull_request.head.sha)
PR_BODY — pull_request.body (may be empty)
CI_WORKFLOW_PATH — defaults to `.gitea/workflows/ci.yml`
SENTINEL_JOB_KEY — defaults to `all-required`
Memory cross-links
------------------
- internal#350 (the RFC that specs this lint)
- PR#665 / PR#668 (the empirical split-pair)
- mc#664 (the main-red incident)
- feedback_strict_root_only_after_class_a
- feedback_behavior_based_ast_gates
"""
from __future__ import annotations
import os
import re
import subprocess
import sys
from typing import Any
try:
import yaml
except ImportError:
sys.stderr.write(
"::error::PyYAML is required. Install with: pip install PyYAML\n"
)
sys.exit(2)
# ---------------------------------------------------------------------------
# YAML quirk: bare `on:` at the top level becomes Python `True` because
# `on` is a YAML 1.1 boolean. Not used here but documented for future
# editors who copy from this module.
# ---------------------------------------------------------------------------
# `Paired: #NNN` reference. `#` is mandatory, NNN must be digits. Any
# surrounding markdown/whitespace is fine. The match is case-sensitive
# on `Paired:` because lower-case `paired:` collides with conversational
# prose ("paired: see comment above") and the convention is the exact
# capitalisation.
PAIRED_RE = re.compile(r"\bPaired:\s*#(?P<num>\d+)\b")
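A quick way to see what the pattern above accepts is to run it over a few candidate strings. This is a sketch: the regex is copied verbatim so the snippet runs standalone, and the sample strings are invented for illustration.

```python
import re

# Same pattern as PAIRED_RE above, copied so the snippet is self-contained.
PAIRED_RE = re.compile(r"\bPaired:\s*#(?P<num>\d+)\b")

# Invented sample inputs; none of these come from a real PR.
samples = [
    "Paired: #668",       # canonical form
    "- Paired:#668",      # list bullet, no space before '#'
    "paired: #668",       # lower-case 'p' is rejected
    "Paired: 668",        # '#' is mandatory
    "**Paired:** #668",   # markdown between ':' and '#' breaks the match
]

for text in samples:
    m = PAIRED_RE.search(text)
    print(repr(text), "->", m.group("num") if m else None)
```

Note that markdown around the whole reference is tolerated, but emphasis markers placed between `Paired:` and `#NNN` are not, because only whitespace is allowed there.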
# ---------------------------------------------------------------------------
# Env contract
# ---------------------------------------------------------------------------
def _env(key: str, default: str | None = None) -> str:
v = os.environ.get(key, default)
return v if v is not None else ""
def _require_env(key: str) -> str:
v = os.environ.get(key)
if not v:
sys.stderr.write(f"::error::missing required env var: {key}\n")
sys.exit(2)
return v
# ---------------------------------------------------------------------------
# git-show helper. Returns None when the path doesn't exist on that side
# (new file, deleted file, or rename — git returns exit 128 with "fatal:
# path not in tree"). We treat None as "no rule predicate triggered on
# that side".
# ---------------------------------------------------------------------------
def git_show(sha: str, path: str) -> str | None:
r = subprocess.run(
["git", "show", f"{sha}:{path}"],
capture_output=True,
text=True,
)
if r.returncode != 0:
return None
return r.stdout
def git_log_messages(base_sha: str, head_sha: str) -> str:
r = subprocess.run(
["git", "log", "--format=%B", f"{base_sha}..{head_sha}"],
capture_output=True,
text=True,
)
if r.returncode != 0:
return ""
return r.stdout
def git_diff_paths(base_sha: str, head_sha: str) -> list[str]:
r = subprocess.run(
["git", "diff", "--name-only", f"{base_sha}..{head_sha}"],
capture_output=True,
text=True,
)
if r.returncode != 0:
return []
return [p for p in r.stdout.splitlines() if p.strip()]
# ---------------------------------------------------------------------------
# Predicate 1 — any `continue-on-error` value changed between base and head
# ---------------------------------------------------------------------------
def _collect_coe(doc: Any) -> dict[str, Any]:
"""Walk every job in `jobs.*` and collect its continue-on-error value.
Returns a dict {job_key: coe_value}. Missing keys are absent from
the dict (NOT `False` — distinguishes "added the key" from
"unchanged absent"). Job-step `continue-on-error` is NOT considered
— only job-level, because that's the value that masks job status
rollup, which is the class this lint targets.
"""
out: dict[str, Any] = {}
if not isinstance(doc, dict):
return out
jobs = doc.get("jobs")
if not isinstance(jobs, dict):
return out
for k, j in jobs.items():
if not isinstance(j, dict):
continue
if "continue-on-error" in j:
out[k] = j["continue-on-error"]
return out
def coe_changed(base_doc: Any, head_doc: Any) -> tuple[bool, list[str]]:
"""Return (changed?, [reasons]) describing per-job coe diffs."""
base = _collect_coe(base_doc)
head = _collect_coe(head_doc)
reasons: list[str] = []
all_keys = set(base) | set(head)
for k in sorted(all_keys):
b = base.get(k, "<absent>")
h = head.get(k, "<absent>")
if b != h:
reasons.append(f"job '{k}' continue-on-error: {b!r} → {h!r}")
return (bool(reasons), reasons)
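The absent-versus-`False` distinction matters when a PR adds an explicit `continue-on-error: false` to a job that had no key before. A minimal sketch of that comparison, assuming the per-job maps have already been collected; `diff_coe` is a hypothetical stand-in for `coe_changed()` and the job names are examples only.

```python
def diff_coe(base: dict, head: dict) -> list:
    """Hypothetical stand-in for coe_changed(), working on collected maps."""
    reasons = []
    for key in sorted(set(base) | set(head)):
        # A missing key is reported as "<absent>", not as False.
        b = base.get(key, "<absent>")
        h = head.get(key, "<absent>")
        if b != h:
            reasons.append(f"job '{key}' continue-on-error: {b!r} -> {h!r}")
    return reasons


base = {"platform-build": True}
head = {"platform-build": True, "canvas": False}  # key newly added on head
print(diff_coe(base, head))
```

Because the sentinel string differs from `False`, adding the key counts as a change even though the effective runtime value is the same.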
# ---------------------------------------------------------------------------
# Predicate 2 — sentinel job's `needs:` changed
# ---------------------------------------------------------------------------
def _collect_needs(doc: Any, sentinel_key: str) -> list[str] | None:
"""Return the sentinel job's needs list (sorted) or None if absent."""
if not isinstance(doc, dict):
return None
jobs = doc.get("jobs")
if not isinstance(jobs, dict):
return None
j = jobs.get(sentinel_key)
if not isinstance(j, dict):
return None
needs = j.get("needs")
if needs is None:
return []
if isinstance(needs, str):
return [needs]
if isinstance(needs, list):
# Sort because `needs:` is order-insensitive at the engine
# level; a reorder is not a semantic change and shouldn't
# trip the lint.
return sorted(str(x) for x in needs)
return None
def sentinel_needs_changed(
base_doc: Any, head_doc: Any, sentinel_key: str
) -> tuple[bool, str]:
"""Return (changed?, reason)."""
base = _collect_needs(base_doc, sentinel_key)
head = _collect_needs(head_doc, sentinel_key)
if base == head:
return (False, "")
return (
True,
f"sentinel '{sentinel_key}'.needs: {base!r} → {head!r}",
)
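The normalisation that makes a `needs:` reorder a non-change can be sketched on its own. `normalize_needs` below is a hypothetical reduction of `_collect_needs()` to the value-handling step; it assumes the sentinel job has already been located.

```python
from typing import List, Optional


def normalize_needs(needs: object) -> Optional[List[str]]:
    """Hypothetical reduction of _collect_needs() to its value handling."""
    if needs is None:
        return []            # key absent: no dependencies
    if isinstance(needs, str):
        return [needs]       # scalar form: `needs: build`
    if isinstance(needs, list):
        return sorted(str(x) for x in needs)  # order-insensitive
    return None              # any other shape is treated as unknown


print(normalize_needs(["lint", "build"]) == normalize_needs(["build", "lint"]))
print(normalize_needs("build"))
print(normalize_needs(None))
```

A reorder compares equal after sorting, while removing an entry (the sentinel demotion case) still produces a different list.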
# ---------------------------------------------------------------------------
# Predicate 3 — `Paired: #NNN` present in body or any commit message
# ---------------------------------------------------------------------------
def find_paired_refs(pr_body: str, commit_log: str) -> list[str]:
"""Return list of `#NNN` strings found (deduped, sorted)."""
found: set[str] = set()
for src in (pr_body, commit_log):
for m in PAIRED_RE.finditer(src or ""):
found.add(m.group("num"))
return sorted(found)
# ---------------------------------------------------------------------------
# Driver
# ---------------------------------------------------------------------------
def _parse(content: str | None, label: str) -> Any:
if content is None:
return None
try:
return yaml.safe_load(content)
except yaml.YAMLError as e:
sys.stderr.write(f"::error::YAML parse error on {label}: {e}\n")
sys.exit(2)
def run() -> int:
base_sha = _require_env("BASE_SHA")
head_sha = _require_env("HEAD_SHA")
pr_body = _env("PR_BODY", "")
ci_path = _env("CI_WORKFLOW_PATH", ".gitea/workflows/ci.yml")
sentinel_key = _env("SENTINEL_JOB_KEY", "all-required")
# Step 0 — is ci.yml even in the diff? If not, the lint doesn't apply.
changed_paths = git_diff_paths(base_sha, head_sha)
if ci_path not in changed_paths:
print(
f"::notice::{ci_path} not in PR diff; lint-mask-pr-atomicity "
f"skipped (no atomicity risk)."
)
return 0
base_yml = git_show(base_sha, ci_path)
head_yml = git_show(head_sha, ci_path)
base_doc = _parse(base_yml, f"{ci_path}@{base_sha}")
head_doc = _parse(head_yml, f"{ci_path}@{head_sha}")
# If the file is newly added (no base), no flip is possible — every
# value is "newly introduced", not "changed". Tier 2e covers the
# tracking-issue check for new continue-on-error: true. Exit 0.
if base_doc is None:
print(
f"::notice::{ci_path} newly added in this PR; no flip to "
f"analyse — lint-mask-pr-atomicity skipped."
)
return 0
# If the file is deleted on head, ditto — no atomicity question.
if head_doc is None:
print(
f"::notice::{ci_path} deleted in this PR; "
f"lint-mask-pr-atomicity skipped."
)
return 0
coe_yes, coe_reasons = coe_changed(base_doc, head_doc)
needs_yes, needs_reason = sentinel_needs_changed(
base_doc, head_doc, sentinel_key
)
if not coe_yes and not needs_yes:
print(
f"::notice::{ci_path} touched but neither continue-on-error "
f"nor sentinel '{sentinel_key}'.needs changed — no atomicity "
f"risk. OK."
)
return 0
if coe_yes and needs_yes:
print(
f"::notice::Atomic change detected: both continue-on-error "
f"AND sentinel '{sentinel_key}'.needs touched in same PR. OK."
)
for r in coe_reasons:
print(f" - {r}")
print(f" - {needs_reason}")
return 0
# Exactly one side touched — require Paired: #NNN reference.
commit_log = git_log_messages(base_sha, head_sha)
paired = find_paired_refs(pr_body, commit_log)
one_side = "continue-on-error" if coe_yes else f"sentinel '{sentinel_key}'.needs"
other_side = (
f"sentinel '{sentinel_key}'.needs" if coe_yes else "continue-on-error"
)
if paired:
print(
f"::notice::Split-pair detected ({one_side} changed without "
f"{other_side}), but Paired reference(s) present: "
f"{', '.join('#' + n for n in paired)}. OK."
)
for r in coe_reasons:
print(f" - {r}")
if needs_reason:
print(f" - {needs_reason}")
return 0
# The failure mode this lint exists to prevent.
print(
f"::error file={ci_path}::lint-mask-pr-atomicity (Tier 2d): "
f"PR touches {one_side} in {ci_path} but NOT {other_side}, "
f"and no `Paired: #NNN` reference was found in the PR body or "
f"in commit messages between {base_sha[:8]}..{head_sha[:8]}. "
f"This is the PR#665+#668 split-pair regression class "
f"(see internal#350, mc#664). FIX: either (a) include the "
f"matching {other_side} change in the same PR (preferred), or "
f"(b) add `Paired: #NNN` (literal, capital P, with `#`) to the "
f"PR body or a commit message referencing the paired PR."
)
for r in coe_reasons:
print(f" - {r}")
if needs_reason:
print(f" - {needs_reason}")
return 1
if __name__ == "__main__":
sys.exit(run())
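The decision made by `run()` above reduces to a small truth table over the two predicates and the paired references. A sketch of that table, assuming the env contract and YAML parsing have already succeeded (so exit code 2 is out of scope); `atomicity_exit_code` is a hypothetical name.

```python
def atomicity_exit_code(coe_changed: bool, needs_changed: bool, paired: list) -> int:
    """Hypothetical condensation of the exit-code logic in run()."""
    if not coe_changed and not needs_changed:
        return 0  # ci.yml touched, but neither predicate fired
    if coe_changed and needs_changed:
        return 0  # atomic change: both sides in the same PR
    # Exactly one side changed: only a `Paired: #NNN` reference saves it.
    return 0 if paired else 1


for coe, needs, refs in [
    (False, False, []),
    (True, True, []),
    (True, False, ["668"]),
    (True, False, []),
    (False, True, []),
]:
    print(coe, needs, refs, "->", atomicity_exit_code(coe, needs, refs))
```

Only the last two rows, a one-sided change with no paired reference, produce exit code 1.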
@@ -1,681 +0,0 @@
#!/usr/bin/env python3
"""lint-pre-flip-continue-on-error — block a PR that flips a job from
``continue-on-error: true`` to ``continue-on-error: false`` (or removes
the key while the base had it ``true``) without proof that the job's
recent runs on the target branch are actually green.
Empirical class — PR #656 / mc#664:
PR #656 (RFC internal#219 Phase 4) flipped 5 ``platform-build``-class
jobs ``continue-on-error: true → false`` on the basis of a
"verified green on main via combined-status check". But that "green"
was the LIE produced by the prior ``continue-on-error: true``:
Gitea Quirk #10 (internal#342 + dup #287) — when a step inside a
job marked ``continue-on-error: true`` fails, the job-level status
is still rolled up as ``success``. So the precondition the PR
claimed to verify was structurally fooled by the bug being
flipped.
mc#664 then captured the surfaced defects (2 unrelated, mutually-
masked regressions):
Class 1: sqlmock helper drift since 2f36bb9a (24 days old)
Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old)
Codified 04:35Z as hongming-pc2 charter §SOP-N rule (e)
"run-log-grep-before-flip": pull the actual run log + grep for
``--- FAIL`` / ``FAIL\\s`` BEFORE flipping; don't trust the masked
combined-status.
This script structurally enforces that rule at PR time.
How it works (one PR tick):
1. Parse the diff: compare ``.gitea/workflows/*.yml`` at PR base
vs PR head. For each file present in both, parse the YAML AST
and walk ``jobs.<key>.continue-on-error`` on each side. A
"flip" is base ∈ {true} AND head ∈ {false, None/absent}. We
coerce truthy/falsy per YAML semantics (PyYAML normalizes
``true``/``True``/``yes`` to ``True``).
2. For each flipped job, derive its commit-status context name as
``"{workflow.name} / {job.name or job.key} (push)"`` — that's
how Gitea Actions emits the context for runs on
``main``/``staging`` (push event, see also expected_context()
in ci-required-drift.py).
3. Pull the last N commits of the target branch (PR base), fetch
combined commit-status per commit, scan ``statuses[]`` for
contexts matching ANY of the flipped jobs. For each match,
fetch the actual run log via the web-UI route
``{server_url}/{repo}/actions/runs/{run_id}/jobs/{job_idx}/logs``
(per memory ``reference_gitea_actions_log_fetch`` — Gitea 1.22.6
lacks REST ``/actions/runs/*`` endpoints; the web-UI route is the
only working path; see ``reference_gitea_1_22_6_lacks_rest_rerun_endpoints``).
4. Grep each log for the Go-test failure markers ``--- FAIL`` /
``FAIL\\s+<package>`` AND the bash-step error sentinel
``::error::``. If ANY recent log shows any of these AND the
status itself reads ``success``, the job was masked. Emit an
``::error::`` annotation on the flip with the offending test name,
the offending run URL, and the regression commit (HEAD of the run).
5. Exit 1 if any flips have at least one masked run; exit 0
otherwise.
Halt-on-noise contract:
- If a recent log fetch 404s (already-pruned-via-act_runner-gc,
transient gitea-web outage): emit ``::warning::`` and treat the
run as "log unavailable" — does NOT block the flip; logged so
a curious reviewer can re-run.
- If a flipped job has ZERO recent runs on the target branch (newly
added workflow): emit ``::warning::`` "no run history to verify"
and allow the flip. This is the only way a NEW workflow can ever
ship with ``continue-on-error: false``; otherwise we'd have a
chicken-and-egg.
Behavior-based AST gate per ``feedback_behavior_based_ast_gates``:
- YAML parsed via PyYAML safe_load on BOTH sides of the diff
- No grep-by-line — formatting changes (comment churn, key order)
don't false-positive a flip
- Job-key match — so a rename ``platform-build → core-be-build``
appears as a DELETE + an ADD, not a flip (the delete side has no
new value to compare against; the add side has no base side).
Run locally (works against this repo, requires PyYAML + Gitea token
that can read combined-commit-status):
GITEA_TOKEN=... GITEA_HOST=git.moleculesai.app \\
REPO=molecule-ai/molecule-core BASE_REF=main \\
BASE_SHA=$(git rev-parse origin/main) \\
HEAD_SHA=$(git rev-parse HEAD) \\
python3 .gitea/scripts/lint_pre_flip_continue_on_error.py \\
--dry-run
Cross-links: PR#656, mc#664, PR#665 (the interim re-mask),
Quirk #10 (internal#342 + dup #287), hongming-pc2 charter §SOP-N
rule (e), feedback_strict_root_only_after_class_a,
feedback_no_shared_persona_token_use.
"""
from __future__ import annotations
import argparse
import json
import os
import subprocess
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Any
import yaml # PyYAML 6.0.2 — installed by the workflow before this runs.
# --------------------------------------------------------------------------
# Environment (read at module-import; runtime contract enforced in main())
# --------------------------------------------------------------------------
def _env(key: str, *, default: str = "") -> str:
return os.environ.get(key, default)
GITEA_TOKEN = _env("GITEA_TOKEN")
GITEA_HOST = _env("GITEA_HOST")
REPO = _env("REPO")
BASE_REF = _env("BASE_REF", default="main")
BASE_SHA = _env("BASE_SHA")
HEAD_SHA = _env("HEAD_SHA")
# How many recent commits to scan on the target branch. 5 by default;
# enough to catch a job that only fails intermittently, not so many
# that the script paginates needlessly. Per spec.
RECENT_COMMITS_N = int(_env("RECENT_COMMITS_N", default="5"))
OWNER, NAME = (REPO.split("/", 1) + [""])[:2] if REPO else ("", "")
API = f"https://{GITEA_HOST}/api/v1" if GITEA_HOST else ""
WEB = f"https://{GITEA_HOST}" if GITEA_HOST else ""
# Failure markers we grep for in the run log.
# --- FAIL — Go test failure marker
# FAIL\s — `FAIL github.com/x/y` package-level rollup
# ::error:: — bash-step `::error::` lines (the lint-curl-status-capture
# pattern: a `python3 <<PY` block writing `::error::` then
# sys.exit(1); also any shell `echo "::error::..."` from
# jobs that wrap pytest/eslint/etc. and convert
# non-zero exits into masked-by-CoE status)
FAIL_PATTERNS = (
"--- FAIL",
"FAIL\t",
"FAIL ",
"::error::",
)
def _require_runtime_env() -> None:
for key in ("GITEA_TOKEN", "GITEA_HOST", "REPO", "BASE_REF", "BASE_SHA", "HEAD_SHA"):
if not os.environ.get(key):
sys.stderr.write(f"::error::missing required env var: {key}\n")
sys.exit(2)
# --------------------------------------------------------------------------
# Tiny HTTP helper (no requests dependency)
# Mirrors the api()/ApiError contract in ci-required-drift.py +
# main-red-watchdog.py per feedback_api_helper_must_raise_not_return_dict.
# --------------------------------------------------------------------------
class ApiError(RuntimeError):
"""Raised when a Gitea API/web call cannot be trusted to have succeeded.
Soft-failure on non-2xx is the duplicate-write bug factory in
find-or-create flows (PR #112 Five-Axis). Here it would mean a
transient gitea-web 502 silently allows a flip whose recent runs
we couldn't actually verify — exactly the regression class this
lint exists to close.
"""
def http(
method: str,
url: str,
*,
body: dict | None = None,
headers: dict[str, str] | None = None,
expect_json: bool = True,
timeout: int = 30,
) -> tuple[int, Any, bytes]:
"""Tiny HTTP helper around urllib.
Returns (status, parsed_or_None, raw_bytes). Raises ApiError on any
non-2xx response. ``expect_json=False`` returns raw bytes in the
parsed slot (for log-fetch from the web-UI which returns text/plain).
"""
final_headers = {
"Authorization": f"token {GITEA_TOKEN}",
"Accept": "application/json" if expect_json else "text/plain",
}
if headers:
final_headers.update(headers)
data = None
if body is not None:
data = json.dumps(body).encode("utf-8")
final_headers["Content-Type"] = "application/json"
req = urllib.request.Request(url, method=method, data=data, headers=final_headers)
try:
with urllib.request.urlopen(req, timeout=timeout) as resp:
raw = resp.read()
status = resp.status
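except urllib.error.HTTPError as e:
raw = e.read() or b""
status = e.code
except OSError as e:
# URLError, timeouts and connection resets carry no HTTP status.
# Raise ApiError so callers such as fetch_log() can handle them.
raise ApiError(f"{method} {url} → network error: {e}") from e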
if not (200 <= status < 300):
snippet = raw[:500].decode("utf-8", errors="replace") if raw else ""
raise ApiError(f"{method} {url} → HTTP {status}: {snippet}")
if not expect_json:
return status, raw, raw
if not raw:
return status, None, raw
try:
return status, json.loads(raw), raw
except json.JSONDecodeError as e:
raise ApiError(f"{method} {url} → HTTP {status} but body is not JSON: {e}") from e
def api(method: str, path: str, *, body: dict | None = None, query: dict[str, str] | None = None) -> tuple[int, Any]:
"""Read-shaped Gitea REST helper. Path is API-relative (``/repos/...``)."""
url = f"{API}{path}"
if query:
url = f"{url}?{urllib.parse.urlencode(query)}"
status, parsed, _ = http(method, url, body=body, expect_json=True)
return status, parsed
# --------------------------------------------------------------------------
# YAML parsing — coerce truthy/falsy for continue-on-error
# --------------------------------------------------------------------------
def _coerce_coe(val: Any) -> bool:
"""Coerce a continue-on-error YAML value to bool.
PyYAML safe_load normalizes ``true``/``True``/``yes``/``on`` to
Python ``True`` and ``false``/``False``/``no``/``off`` to ``False``.
An absent key (``None``) is treated as ``False`` here too, matching
the GitHub Actions default.
Edge cases:
- String ``"true"`` (quoted in YAML): kept as the string ``"true"``.
Plain ``bool()`` would treat every non-empty string as truthy,
including ``"false"``, so normalize string forms
case-insensitively to bool. That keeps the diff consistent with
the runtime behavior of Gitea Actions, which YAML-parses the
same way.
"""
if isinstance(val, bool):
return val
if val is None:
return False
if isinstance(val, str):
return val.strip().lower() in ("true", "yes", "on", "1")
return bool(val)
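The reason for the explicit string branch is easiest to see by comparing it with plain `bool()`. A minimal sketch: `coerce` mirrors `_coerce_coe()` above and is repeated only so the snippet runs standalone.

```python
def coerce(val):
    """Mirror of _coerce_coe() above, repeated so the snippet is standalone."""
    if isinstance(val, bool):
        return val
    if val is None:
        return False
    if isinstance(val, str):
        return val.strip().lower() in ("true", "yes", "on", "1")
    return bool(val)


# A quoted YAML value arrives as a str, and every non-empty str is truthy.
for raw in ("true", "false", " True ", "", None, True):
    naive = bool(raw)
    print(repr(raw), "bool():", naive, "coerce():", coerce(raw))
```

With plain `bool()`, a quoted `"false"` would read as truthy, so a `"true"` to `"false"` edit would not register as a flip.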
def jobs_coe_map(workflow_doc: dict) -> dict[str, bool]:
"""Return ``{job_key: continue_on_error_bool}`` for every job in
the workflow. Job-level ``continue-on-error`` only — does NOT
descend into per-step ``continue-on-error`` (step-level CoE
masking is a separate class and is handled by the test suite
+ reviewer, not by this gate — see Future Work in the workflow
YAML).
"""
out: dict[str, bool] = {}
jobs = workflow_doc.get("jobs")
if not isinstance(jobs, dict):
return out
for key, job in jobs.items():
if not isinstance(job, dict):
continue
out[key] = _coerce_coe(job.get("continue-on-error"))
return out
def workflow_name(workflow_doc: dict, *, fallback: str = "") -> str:
"""Top-level ``name:`` of the workflow. Falls back to the filename
(without extension) per Gitea Actions semantics."""
n = workflow_doc.get("name")
if isinstance(n, str) and n.strip():
return n.strip()
return fallback
def job_display_name(workflow_doc: dict, job_key: str) -> str:
"""``jobs.<key>.name`` if present, else the key. Mirrors
expected_context() in ci-required-drift.py."""
job = workflow_doc.get("jobs", {}).get(job_key)
if isinstance(job, dict):
n = job.get("name")
if isinstance(n, str) and n.strip():
return n.strip()
return job_key
def context_name(workflow_name_str: str, job_name_str: str, event: str = "push") -> str:
"""Render the commit-status context the way Gitea Actions emits it.
Default ``event="push"`` because recent-runs-on-main are push events;
callers can override to ``"pull_request"`` for PR-context lookups."""
return f"{workflow_name_str} / {job_name_str} ({event})"
# --------------------------------------------------------------------------
# Diff detection — flips, not arbitrary changes
# --------------------------------------------------------------------------
def detect_flips(
base_workflows: dict[str, str],
head_workflows: dict[str, str],
) -> list[dict]:
"""Compare per-file CoE maps; return a list of flip records.
Inputs are ``{path: yaml_text}`` for both sides. Output records
have the shape::
{
"workflow_path": ".gitea/workflows/ci.yml",
"workflow_name": "CI",
"job_key": "platform-build",
"job_name": "Platform (Go)",
"context": "CI / Platform (Go) (push)",
}
A flip is base[CoE] ∈ {True} AND head[CoE] ∈ {False}. Files
only present on one side are skipped — adding a new workflow
with ``CoE: false`` is fine (no history to mask), and removing
a workflow can't possibly flip anything.
"""
flips: list[dict] = []
for path, base_text in base_workflows.items():
if path not in head_workflows:
continue
try:
base_doc = yaml.safe_load(base_text) or {}
head_doc = yaml.safe_load(head_workflows[path]) or {}
except yaml.YAMLError as e:
# Don't block on a parse error — the YAML lint workflows
# catch invalid YAML separately. Just warn so the failing
# file is visible.
sys.stderr.write(f"::warning file={path}::YAML parse error: {e}\n")
continue
if not isinstance(base_doc, dict) or not isinstance(head_doc, dict):
continue
base_map = jobs_coe_map(base_doc)
head_map = jobs_coe_map(head_doc)
wf_name = workflow_name(head_doc, fallback=os.path.basename(path).rsplit(".", 1)[0])
for job_key, base_val in base_map.items():
if job_key not in head_map:
continue # job removed — not a flip
if base_val is True and head_map[job_key] is False:
flips.append({
"workflow_path": path,
"workflow_name": wf_name,
"job_key": job_key,
"job_name": job_display_name(head_doc, job_key),
"context": context_name(wf_name, job_display_name(head_doc, job_key), "push"),
})
return flips
# --------------------------------------------------------------------------
# Git: snapshot every .gitea/workflows/*.yml at a SHA (no checkout)
# --------------------------------------------------------------------------
def _git(*args: str, cwd: str | None = None) -> str:
"""Run ``git`` and return stdout (text)."""
result = subprocess.run(
["git", *args],
capture_output=True,
text=True,
check=False,
cwd=cwd,
)
if result.returncode != 0:
raise RuntimeError(f"git {args!r} failed: {result.stderr.strip()}")
return result.stdout
def workflows_at_sha(sha: str, *, repo_dir: str | None = None) -> dict[str, str]:
"""Read every ``.gitea/workflows/*.yml`` blob at ``sha``.
Uses ``git ls-tree`` + ``git show`` so we never need to check out
the SHA (the workflow runs on the PR head; the base SHA is
fetched, not checked out).
"""
out: dict[str, str] = {}
listing = _git("ls-tree", "-r", "--name-only", sha, ".gitea/workflows/", cwd=repo_dir)
for line in listing.splitlines():
line = line.strip()
if not line.endswith((".yml", ".yaml")):
continue
try:
blob = _git("show", f"{sha}:{line}", cwd=repo_dir)
except RuntimeError:
# Symlink or other non-blob; skip.
continue
out[line] = blob
return out
# --------------------------------------------------------------------------
# Gitea: recent commits + per-commit combined status + log fetch
# --------------------------------------------------------------------------
def recent_commits_on_branch(branch: str, n: int) -> list[str]:
"""Last `n` commit SHAs on ``branch`` (oldest→newest is fine; we
treat them as a set). Uses the REST ``/commits`` endpoint with
``sha=branch&limit=n``."""
_, body = api(
"GET",
f"/repos/{OWNER}/{NAME}/commits",
query={"sha": branch, "limit": str(n)},
)
if not isinstance(body, list):
raise ApiError(f"/commits for {branch} returned non-list: {type(body).__name__}")
out: list[str] = []
for c in body:
if isinstance(c, dict):
sha = c.get("sha") or (c.get("commit", {}) or {}).get("id")
if isinstance(sha, str) and len(sha) >= 7:
out.append(sha)
return out
def combined_status(sha: str) -> dict:
"""Combined commit status for a SHA. Same shape as
``main-red-watchdog.get_combined_status``."""
_, body = api("GET", f"/repos/{OWNER}/{NAME}/commits/{sha}/status")
if not isinstance(body, dict):
raise ApiError(f"combined-status for {sha} not a dict")
return body
def _entry_state(s: dict) -> str:
"""Per-entry state — Gitea 1.22.6 schema asymmetry: top-level
uses ``state``, per-entry uses ``status``. Defensive fallback per
main-red-watchdog.py line 233."""
return s.get("status") or s.get("state") or ""
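Putting the context rendering and the per-entry state fallback together, a lookup against a combined-status payload might look like the sketch below. The payload is invented for illustration: only the `statuses`, `context`, `status`/`state`, and `target_url` fields referenced in this script are assumed, and the helper names are local copies, not imports.

```python
def context_name(workflow: str, job: str, event: str = "push") -> str:
    # Same format string as context_name() above.
    return f"{workflow} / {job} ({event})"


def entry_state(entry: dict) -> str:
    # Per-entry payloads use "status"; fall back to "state" defensively.
    return entry.get("status") or entry.get("state") or ""


# Hypothetical combined-status payload; values are made up.
status_doc = {
    "state": "success",
    "statuses": [
        {"context": "CI / Canvas (Next.js) (push)", "status": "success"},
        {
            "context": "CI / Platform (Go) (push)",
            "status": "success",
            "target_url": "/molecule-ai/molecule-core/actions/runs/13494/jobs/0",
        },
    ],
}

target = context_name("CI", "Platform (Go)")
match = next((s for s in status_doc["statuses"] if s.get("context") == target), None)
print(target)
print(entry_state(match) if match else "no run found")
```

The match is an exact string comparison, so a renamed workflow or job display name on the target branch would yield no run history for the flip.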
def fetch_log(target_url: str) -> str | None:
"""Fetch a job log given its web-UI ``target_url`` (e.g.
``/molecule-ai/molecule-core/actions/runs/13494/jobs/0``).
Per ``reference_gitea_actions_log_fetch``: append ``/logs`` to the
job route. Per ``reference_gitea_1_22_6_lacks_rest_rerun_endpoints``:
Gitea 1.22.6 lacks the REST ``/api/v1/.../actions/runs/*`` path; the
web-UI route is the only working endpoint until 1.24+.
Returns the log text on success, ``None`` on 404 / log-pruned /
network error (caller treats None as "log unavailable, warn-not-fail").
"""
if not target_url:
return None
# Normalize: target_url may be relative ("/owner/repo/...") or
# absolute. Both need ``/logs`` appended to the job sub-path.
if target_url.startswith("/"):
url = f"{WEB}{target_url}"
else:
url = target_url
if not url.endswith("/logs"):
url = f"{url}/logs"
try:
_, body, _ = http("GET", url, expect_json=False, timeout=60)
except ApiError as e:
sys.stderr.write(f"::warning::log fetch failed for {url}: {e}\n")
return None
if isinstance(body, bytes):
return body.decode("utf-8", errors="replace")
return None
def grep_fail_markers(log_text: str) -> list[str]:
"""Return up to 5 sample matching lines for any FAIL_PATTERNS hit.
Empty list = clean log."""
matches: list[str] = []
for line in log_text.splitlines():
for pat in FAIL_PATTERNS:
if pat in line:
# Truncate to keep error output bounded.
matches.append(line.strip()[:240])
break
if len(matches) >= 5:
break
return matches
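To illustrate what the substring scan picks up, here is a sketch run against a short, invented Go test log. `FAIL_PATTERNS` is copied from above; `grep_markers` is a compact local rewrite of `grep_fail_markers()` so the snippet is standalone.

```python
FAIL_PATTERNS = ("--- FAIL", "FAIL\t", "FAIL ", "::error::")


def grep_markers(log_text: str, limit: int = 5) -> list:
    """Compact rewrite of grep_fail_markers() for a standalone example."""
    matches = []
    for line in log_text.splitlines():
        if any(pat in line for pat in FAIL_PATTERNS):
            matches.append(line.strip()[:240])  # bound the output size
        if len(matches) >= limit:
            break
    return matches


# Invented log excerpt; package paths are placeholders.
log = "\n".join([
    "=== RUN   TestHandler",
    "--- FAIL: TestHandler (0.01s)",
    "FAIL\tgithub.com/example/pkg\t0.012s",
    "ok  \tgithub.com/example/other\t0.004s",
    "PASS",
])

for hit in grep_markers(log):
    print(hit)
```

Because the patterns are plain substrings, any line that happens to contain `FAIL ` also matches, which errs on the side of blocking a flip.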
# --------------------------------------------------------------------------
# Verification: for one flip, scan recent runs on BASE_REF
# --------------------------------------------------------------------------
def verify_flip(flip: dict, branch: str, n: int) -> dict:
"""Scan the last ``n`` commits on ``branch``. For each commit whose
combined status contains a context matching ``flip["context"]``,
fetch the run log and grep for FAIL markers.
Returns::
{
"flip": flip,
"checked_commits": int, # how many commits had a matching context
"masked_runs": [ # runs where log shows FAIL despite status==success
{"sha": "...", "status": "success", "target_url": "...", "samples": [...]},
...
],
"fail_runs": [ # runs where status itself is failure/error
{"sha": "...", "status": "failure", "target_url": "...", "samples": [...]},
...
],
"warnings": [str], # log-unavailable warnings (not blocking)
}
Blocking condition: ``masked_runs`` OR ``fail_runs`` non-empty.
A ``success`` status with a clean log is the only verified "OK to
flip" outcome (per hongming-pc2 §SOP-N rule (e)); unavailable logs
and missing run history are allowed through with a warning.
"""
target_context = flip["context"]
result = {
"flip": flip,
"checked_commits": 0,
"masked_runs": [],
"fail_runs": [],
"warnings": [],
}
shas = recent_commits_on_branch(branch, n)
if not shas:
result["warnings"].append(
f"no recent commits on {branch} (cannot verify flip)"
)
return result
for sha in shas:
try:
status_doc = combined_status(sha)
except ApiError as e:
result["warnings"].append(f"combined-status for {sha}: {e}")
continue
statuses = status_doc.get("statuses") or []
# First entry matching the context name. Newest SHAs come
# first; one entry per context per SHA is the usual shape.
for s in statuses:
if not isinstance(s, dict):
continue
if s.get("context") != target_context:
continue
result["checked_commits"] += 1
state = _entry_state(s)
target_url = s.get("target_url") or ""
log_text = fetch_log(target_url)
if log_text is None:
result["warnings"].append(
f"log unavailable for {sha} {target_context}"
)
# Still record the status itself if it's red — that's
# a hard signal that doesn't need log access.
if state in ("failure", "error"):
result["fail_runs"].append({
"sha": sha,
"status": state,
"target_url": target_url,
"samples": ["[log unavailable; status itself is " + state + "]"],
})
break
samples = grep_fail_markers(log_text)
if state in ("failure", "error"):
result["fail_runs"].append({
"sha": sha,
"status": state,
"target_url": target_url,
"samples": samples or ["[no FAIL markers found but status is " + state + "]"],
})
elif samples and state == "success":
# The bug class: status==success while log shows FAIL.
# That's exactly Quirk #10 (continue-on-error masking).
result["masked_runs"].append({
"sha": sha,
"status": state,
"target_url": target_url,
"samples": samples,
})
# Either way, we matched one context entry for this SHA;
# don't keep looping `statuses[]`.
break
if result["checked_commits"] == 0:
result["warnings"].append(
f"no runs of {target_context!r} found in the last {n} commits on "
f"{branch} — cannot verify; allowing flip with warning"
)
return result
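The per-run decision inside `verify_flip` can be sketched as a small pure function. `classify_run` is a hypothetical helper name, not part of the script; it only mirrors the branch order above (red status first, then masked success, else not blocking).

```python
def classify_run(state: str, samples: list[str]) -> str:
    """Classify one run from its status state and the FAIL samples in its log.

    Returns "fail", "masked", or "ok" (hypothetical labels for this sketch).
    """
    if state in ("failure", "error"):
        # A red status blocks regardless of what the log shows.
        return "fail"
    if state == "success" and samples:
        # Status is green but the log contains FAIL markers: the
        # continue-on-error masking case.
        return "masked"
    return "ok"


verdicts = [
    classify_run("failure", []),
    classify_run("success", ["--- FAIL: TestFoo"]),
    classify_run("success", []),
]
```

Only "fail" and "masked" correspond to entries in `fail_runs` and `masked_runs`, which are the two lists that block a flip.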
# --------------------------------------------------------------------------
# Report rendering
# --------------------------------------------------------------------------
def render_flip_report(verdict: dict) -> str:
flip = verdict["flip"]
lines = [
f"job: {flip['job_key']} ({flip['context']})",
f" workflow: {flip['workflow_path']}",
f" checked_commits: {verdict['checked_commits']}",
]
for run in verdict["fail_runs"]:
url = run["target_url"]
# target_url may be relative; render the absolute form for
# click-through.
if url.startswith("/"):
url = f"{WEB}{url}"
lines.append(f" fail run {run['sha'][:10]} (status={run['status']}): {url}")
for sample in run["samples"]:
lines.append(f" | {sample}")
for run in verdict["masked_runs"]:
url = run["target_url"]
if url.startswith("/"):
url = f"{WEB}{url}"
lines.append(
f" MASKED run {run['sha'][:10]} (status=success, log shows FAIL): {url}"
)
for sample in run["samples"]:
lines.append(f" | {sample}")
for w in verdict["warnings"]:
lines.append(f" warning: {w}")
return "\n".join(lines)
# --------------------------------------------------------------------------
# Main
# --------------------------------------------------------------------------
def _parse_args(argv: list[str] | None = None) -> argparse.Namespace:
p = argparse.ArgumentParser(
prog="lint-pre-flip-continue-on-error",
description="Block a PR that flips continue-on-error true→false "
"without proof recent runs are actually green.",
)
p.add_argument(
"--dry-run",
action="store_true",
help="Detect + print findings to stdout; never exit non-zero. "
"Useful for local testing.",
)
return p.parse_args(argv)
def main(argv: list[str] | None = None) -> int:
args = _parse_args(argv)
_require_runtime_env()
base_workflows = workflows_at_sha(BASE_SHA)
head_workflows = workflows_at_sha(HEAD_SHA)
flips = detect_flips(base_workflows, head_workflows)
if not flips:
print("::notice::no continue-on-error true→false flips in this PR")
return 0
print(f"::notice::detected {len(flips)} continue-on-error true→false flip(s); verifying recent runs on {BASE_REF}")
bad_flips: list[dict] = []
for flip in flips:
verdict = verify_flip(flip, BASE_REF, RECENT_COMMITS_N)
report = render_flip_report(verdict)
if verdict["fail_runs"] or verdict["masked_runs"]:
print(f"::error file={flip['workflow_path']}::flip of {flip['job_key']} "
f"({flip['context']}) blocked — recent runs on {BASE_REF} show "
f"FAIL markers OR are red. Pull each run log below + grep "
f"`--- FAIL` / `FAIL ` / `::error::` — DON'T trust the masked "
f"combined-status. See hongming-pc2 charter §SOP-N rule (e). "
f"PR#656 / mc#664 reference class.")
bad_flips.append(verdict)
else:
print(f"::notice::flip of {flip['job_key']} ({flip['context']}) is safe — "
f"{verdict['checked_commits']} recent run(s), no FAIL markers")
# Always print the per-flip detail block so the human-readable
# report is in the run log for both safe and unsafe flips.
print(f"::group::flip detail: {flip['job_key']}")
print(report)
print("::endgroup::")
if bad_flips and not args.dry_run:
print(f"::error::{len(bad_flips)}/{len(flips)} flip(s) failed pre-flip verification")
return 1
if bad_flips and args.dry_run:
print(f"::warning::[dry-run] {len(bad_flips)}/{len(flips)} flip(s) WOULD fail; exit 0 forced")
return 0
if __name__ == "__main__":
sys.exit(main())
@@ -1,526 +0,0 @@
#!/usr/bin/env python3
"""lint_required_context_exists_in_bp — Tier 2g per internal#350.
Rule
----
When a PR adds a NEW commit-status emission (a context that didn't
exist on the base side), the workflow file must carry one of three
directive comments adjacent to the new job:
(a) `# bp-required: yes`
The new context MUST already be in
`branch_protections/<branch>.status_check_contexts`. Verified
via Gitea API at PR time.
(b) `# bp-required: pending #NNN`
Acknowledged asymmetry; references an OPEN tracking issue that
will follow up with the BP PATCH.
(c) `# bp-exempt: <free-text reason>`
Informational job, not intended to be a required gate.
No directive on a new emitter → FAIL with a 3-option fix-hint.
The class this prevents
-----------------------
PR#656 added `CI / all-required (pull_request)` as a sentinel context
that workflows emit, but BP did NOT list it. When `platform-build`
failed, `all-required` failed, but BP let the PR merge anyway →
cascade to mc#664. With this lint, PR#656 would have been blocked
until either the BP PATCH ran alongside OR the author added a
`bp-required: pending` directive.
Why directives MUST live in the workflow YAML
---------------------------------------------
The directive comment lives with the emitter so a scheduled
audit (Tier 2f, daily) can read the same source. PR-body-only
directives evaporate on merge, so the asymmetry would go
undetected again. PR-body claims are advisory; workflow-file
comments are the contract.
How "new emission" is detected
------------------------------
Diff base..head over `.gitea/workflows/*.yml`. For each YAML file
that's added or modified:
- Parse both the base-side and head-side text with PyYAML (`yaml.safe_load`).
- Enumerate emitted contexts on each side using the same rules as
Tier 2f (workflow.name + job.name|key + event-mapping).
- `new_contexts = head_contexts - base_contexts`.
If `new_contexts` is empty after de-dup, no rule applies → pass.
Per `feedback_behavior_based_ast_gates`: comment scanning uses raw
text in a small window around the job-key line, NOT regex over the
full file. This avoids matching `bp-required:` mentioned in a
comment unrelated to the new job.
Exit codes
----------
0 — no new emissions, all new emissions have valid directives,
or BP read errored (graceful-degrade per Tier 2a contract).
1 — at least one new emission lacks a directive, or has
`bp-required: yes` but the context is missing from BP.
2 — env contract violation or YAML parse error.
Env
---
BASE_SHA — PR base SHA
HEAD_SHA — PR head SHA
GITEA_TOKEN — DRIFT_BOT_TOKEN (repo-admin for BP read)
GITEA_HOST — e.g. git.moleculesai.app
REPO — owner/name
BRANCH — defaults to `main`
WORKFLOWS_DIR — defaults to `.gitea/workflows`
Memory cross-links
------------------
- internal#350 (the RFC that specs this lint)
- PR#656 (the empirical case that prompted Tier 2g)
- mc#664 (the surfaced cascade)
- feedback_phantom_required_check_after_gitea_migration (Tier 2f cousin)
- feedback_behavior_based_ast_gates
"""
from __future__ import annotations
import json
import os
import re
import subprocess
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Any
try:
import yaml
except ImportError:
sys.stderr.write(
"::error::PyYAML is required. Install with: pip install PyYAML\n"
)
sys.exit(2)
# Directive comment patterns. We match `# bp-required:` OR `# bp-exempt:`,
# both with optional whitespace after `#` and after the colon. Matching is
# case-insensitive (re.IGNORECASE); lowercase `bp-` is the convention.
BP_REQUIRED_YES_RE = re.compile(
r"#\s*bp-required:\s*yes\b", re.IGNORECASE
)
BP_REQUIRED_PENDING_RE = re.compile(
r"#\s*bp-required:\s*pending\s*#(?P<num>\d+)\b", re.IGNORECASE
)
BP_EXEMPT_RE = re.compile(
r"#\s*bp-exempt:\s*\S", re.IGNORECASE
)
# Gitea event-mapping (same as Tier 2f).
_EVENT_MAP = {
"pull_request": "pull_request",
"pull_request_target": "pull_request",
"push": "push",
}
# ---------------------------------------------------------------------------
# Env
# ---------------------------------------------------------------------------
def _env(key: str, default: str | None = None) -> str:
v = os.environ.get(key, default)
return v if v is not None else ""
def _require_env(key: str) -> str:
v = os.environ.get(key)
if not v:
sys.stderr.write(f"::error::missing required env var: {key}\n")
sys.exit(2)
return v
# ---------------------------------------------------------------------------
# API helper (same contract as Tier 2f).
# ---------------------------------------------------------------------------
def api(
method: str,
path: str,
*,
body: dict | None = None,
query: dict[str, str] | None = None,
) -> tuple[str, Any]:
host = _env("GITEA_HOST")
token = _env("GITEA_TOKEN")
url = f"https://{host}/api/v1{path}"
if query:
url = f"{url}?{urllib.parse.urlencode(query)}"
data = None
headers = {
"Authorization": f"token {token}",
"Accept": "application/json",
}
if body is not None:
data = json.dumps(body).encode("utf-8")
headers["Content-Type"] = "application/json"
req = urllib.request.Request(url, method=method, data=data, headers=headers)
try:
with urllib.request.urlopen(req, timeout=30) as resp:
raw = resp.read()
if not raw:
return ("ok", None)
return ("ok", json.loads(raw))
except urllib.error.HTTPError as e:
if e.code == 404:
return ("not_found", None)
if e.code in (401, 403):
return ("forbidden", None)
return ("error", None)
except (urllib.error.URLError, TimeoutError, json.JSONDecodeError):
return ("error", None)
# ---------------------------------------------------------------------------
# git helpers
# ---------------------------------------------------------------------------
def git_show(sha: str, path: str) -> str | None:
r = subprocess.run(
["git", "show", f"{sha}:{path}"], capture_output=True, text=True
)
if r.returncode != 0:
return None
return r.stdout
def git_diff_paths(base: str, head: str) -> list[str]:
r = subprocess.run(
["git", "diff", "--name-only", f"{base}..{head}"],
capture_output=True,
text=True,
)
if r.returncode != 0:
return []
return [p for p in r.stdout.splitlines() if p.strip()]
# ---------------------------------------------------------------------------
# Workflow context enumeration (mirror Tier 2f).
# ---------------------------------------------------------------------------
def _get_on(d: Any) -> Any:
if not isinstance(d, dict):
return None
if "on" in d:
return d["on"]
if True in d:
return d[True]
return None
def _on_events(doc: Any) -> set[str]:
on = _get_on(doc)
raw: set[str] = set()
if on is None:
return raw
if isinstance(on, str):
raw.add(on)
elif isinstance(on, list):
for e in on:
if isinstance(e, str):
raw.add(e)
elif isinstance(on, dict):
for k in on:
if isinstance(k, str):
raw.add(k)
return {_EVENT_MAP[e] for e in raw if e in _EVENT_MAP}
def _job_display(jbody: dict, jkey: str) -> str:
n = jbody.get("name") if isinstance(jbody, dict) else None
if isinstance(n, str) and n:
return n
return jkey
def workflow_contexts(doc: Any) -> set[str]:
if not isinstance(doc, dict):
return set()
wf_name = doc.get("name")
if not isinstance(wf_name, str) or not wf_name:
return set()
events = _on_events(doc)
if not events:
return set()
jobs = doc.get("jobs")
if not isinstance(jobs, dict):
return set()
out: set[str] = set()
for jkey, jbody in jobs.items():
if jkey == "__lines__":
continue
if not isinstance(jbody, dict):
continue
disp = _job_display(jbody, jkey)
for ev in events:
out.add(f"{wf_name} / {disp} ({ev})")
return out
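A simplified sketch of the context enumeration, working on already-parsed dicts so it needs no PyYAML. `contexts` is a hypothetical condensed stand-in for `workflow_contexts`, and the sample workflow documents are invented for illustration.

```python
EVENT_MAP = {
    "pull_request": "pull_request",
    "pull_request_target": "pull_request",
    "push": "push",
}


def contexts(doc: dict) -> set[str]:
    """Enumerate `<workflow> / <job> (<event>)` contexts for a parsed workflow."""
    name = doc.get("name")
    if not isinstance(name, str) or not name:
        return set()
    # YAML 1.1 loaders may parse the bare key `on` as boolean True.
    on = doc.get("on", doc.get(True))
    if isinstance(on, str):
        raw = {on}
    elif isinstance(on, (list, dict)):
        raw = {e for e in on if isinstance(e, str)}
    else:
        raw = set()
    events = {EVENT_MAP[e] for e in raw if e in EVENT_MAP}
    out: set[str] = set()
    for key, body in (doc.get("jobs") or {}).items():
        if not isinstance(body, dict):
            continue
        display = body.get("name") or key
        for ev in events:
            out.add(f"{name} / {display} ({ev})")
    return out


base = {"name": "CI", "on": ["pull_request", "push"],
        "jobs": {"build": {"name": "Platform (Go)"}}}
head = {"name": "CI", "on": ["pull_request", "push"],
        "jobs": {"build": {"name": "Platform (Go)"}, "all-required": {}}}
new_contexts = contexts(head) - contexts(base)
```

The set difference is the "new emission" list that the rest of the lint validates.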
# ---------------------------------------------------------------------------
# Find the source line of a job-key in a workflow YAML's raw text.
# Used to scan for nearby directive comments.
# ---------------------------------------------------------------------------
def _find_job_key_line(raw_lines: list[str], jkey: str) -> int | None:
"""Return 1-based line of `<jkey>:` under jobs:."""
in_jobs = False
jobs_indent = -1
for i, line in enumerate(raw_lines, start=1):
stripped = line.lstrip()
if stripped.startswith("jobs:"):
in_jobs = True
jobs_indent = len(line) - len(stripped)
continue
if in_jobs:
# Job key is the next indent level under `jobs:`.
indent = len(line) - len(stripped)
if stripped and indent <= jobs_indent:
# Left the jobs: block
in_jobs = False
continue
if re.match(rf"^\s*{re.escape(jkey)}\s*:", line):
return i
return None
_DIRECTIVE_WINDOW = 3 # number of lines scanned above the job-key line (the job-key line itself is excluded)
def find_directive_for_job(
raw_text: str, jkey: str
) -> tuple[str, str | None] | None:
"""Return (kind, value) tuple for the first directive in a small
window above the job-key line.
kind ∈ {"required-yes", "required-pending", "exempt"}.
value is the pending-issue number for required-pending, else None.
Returns None if no directive found.
We scan ABOVE the line only (the convention is the directive
precedes the job — matches how `# mc#NNN` comments are placed
above `continue-on-error: true`). We don't scan inside the job
body because steps can produce false positives.
"""
lines = raw_text.splitlines()
line_no = _find_job_key_line(lines, jkey)
if line_no is None:
return None
lo = max(1, line_no - _DIRECTIVE_WINDOW)
for i in range(lo, line_no):
line = lines[i - 1]
m = BP_REQUIRED_PENDING_RE.search(line)
if m:
return ("required-pending", m.group("num"))
if BP_REQUIRED_YES_RE.search(line):
return ("required-yes", None)
if BP_EXEMPT_RE.search(line):
return ("exempt", None)
return None
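A small sketch of the directive scan against an invented workflow snippet. The three regexes are copied from above; `directive_above` is a hypothetical simplification that takes the 1-based job-key line directly instead of locating it.

```python
import re

PENDING = re.compile(r"#\s*bp-required:\s*pending\s*#(?P<num>\d+)\b", re.IGNORECASE)
YES = re.compile(r"#\s*bp-required:\s*yes\b", re.IGNORECASE)
EXEMPT = re.compile(r"#\s*bp-exempt:\s*\S", re.IGNORECASE)


def directive_above(lines: list[str], job_line: int, window: int = 3):
    """Scan the `window` lines above the 1-based job_line for a directive."""
    lo = max(1, job_line - window)
    for i in range(lo, job_line):
        line = lines[i - 1]
        m = PENDING.search(line)
        if m:
            return ("required-pending", m.group("num"))
        if YES.search(line):
            return ("required-yes", None)
        if EXEMPT.search(line):
            return ("exempt", None)
    return None


workflow = """\
name: CI
jobs:
  # bp-required: pending #123
  all-required:
    runs-on: ubuntu-latest
    steps:
      - run: echo ok
  lint:
    runs-on: ubuntu-latest
"""
lines = workflow.splitlines()
tagged = directive_above(lines, 4)    # `all-required:` is line 4
untagged = directive_above(lines, 8)  # `lint:` is line 8
```

Because the window looks only upward, a job that sits within the window of a previous job's directive would pick that directive up; keeping the comment directly above its own job avoids ambiguity.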
# ---------------------------------------------------------------------------
# Map a context back to its emitting (workflow_path, job_key) pair so
# we know WHERE to look for the directive comment.
# ---------------------------------------------------------------------------
def _resolve_emitter(
ctx: str, head_workflows: dict[str, tuple[str, Any]]
) -> tuple[str, str] | None:
"""Return (file_path, job_key) emitting ctx, or None."""
m = re.match(r"^(?P<wf>.+?) / (?P<job>.+) \((?P<event>[^)]+)\)$", ctx)
if not m:
return None
target_wf = m.group("wf")
target_job_disp = m.group("job")
for path, (_raw, doc) in head_workflows.items():
if not isinstance(doc, dict):
continue
if doc.get("name") != target_wf:
continue
jobs = doc.get("jobs") or {}
if not isinstance(jobs, dict):
continue
for jkey, jbody in jobs.items():
if jkey == "__lines__":
continue
if not isinstance(jbody, dict):
continue
disp = _job_display(jbody, jkey)
if disp == target_job_disp:
return (path, jkey)
return None
# ---------------------------------------------------------------------------
# Driver
# ---------------------------------------------------------------------------
def run() -> int:
base_sha = _require_env("BASE_SHA")
head_sha = _require_env("HEAD_SHA")
_require_env("GITEA_TOKEN")
_require_env("GITEA_HOST")
repo = _require_env("REPO")
branch = _env("BRANCH", "main")
wf_dir = _env("WORKFLOWS_DIR", ".gitea/workflows")
# Step 1 — find workflow files changed in the PR.
changed = git_diff_paths(base_sha, head_sha)
changed_workflows = [
p
for p in changed
if p.startswith(wf_dir + "/")
and (p.endswith(".yml") or p.endswith(".yaml"))
]
if not changed_workflows:
print(
"::notice::no workflow file changes in this PR; "
"lint-required-context-exists-in-bp skipped."
)
return 0
# Step 2 — load base+head + compute new contexts.
head_workflows: dict[str, tuple[str, Any]] = {}
new_contexts: set[str] = set()
for path in changed_workflows:
base_raw = git_show(base_sha, path)
head_raw = git_show(head_sha, path)
if head_raw is None:
# File deleted on head — no new emission contribution.
continue
try:
head_doc = yaml.safe_load(head_raw)
except yaml.YAMLError as e:
sys.stderr.write(
f"::error file={path}::YAML parse error on head: {e}\n"
)
return 2
head_workflows[path] = (head_raw, head_doc)
head_ctx = workflow_contexts(head_doc)
base_ctx: set[str] = set()
if base_raw is not None:
try:
base_doc = yaml.safe_load(base_raw)
except yaml.YAMLError:
base_doc = None
if base_doc is not None:
base_ctx = workflow_contexts(base_doc)
new_contexts |= (head_ctx - base_ctx)
if not new_contexts:
print(
"::notice::no new context emissions detected in this PR; "
"lint-required-context-exists-in-bp skipped."
)
return 0
# Step 3 — fetch BP context list.
status, bp = api("GET", f"/repos/{repo}/branch_protections/{branch}")
bp_contexts: set[str] = set()
if status == "forbidden":
sys.stderr.write(
f"::error::GET branch_protections/{branch} returned HTTP 401/403: "
f"DRIFT_BOT_TOKEN is invalid or lacks repo-admin scope. Cannot verify "
f"bp-required directives; skipping lint with exit 0 per "
f"Tier 2a contract. Fix the token, not the lint.\n"
)
return 0
elif status == "not_found":
# Branch has no protection — nothing to verify against; the
# bp-required: yes directive can't be satisfied. Treat as
# graceful-skip rather than red-X.
print(
f"::notice::branch '{branch}' has no protection; cannot verify "
f"bp-required directives. Skipping (exit 0)."
)
return 0
elif status == "ok" and isinstance(bp, dict):
bp_contexts = set(bp.get("status_check_contexts") or [])
else:
sys.stderr.write(
f"::error::branch_protections/{branch} response unexpected; "
f"status={status}. Treating as transient; exit 0.\n"
)
return 0
# Step 4 — validate each new emission's directive.
violations: list[str] = []
for ctx in sorted(new_contexts):
emitter = _resolve_emitter(ctx, head_workflows)
if emitter is None:
# Shouldn't happen — we just derived ctx from head_workflows.
# Belt-and-suspenders fallback.
violations.append(
f"::error::new emission '{ctx}' (could not resolve emitter "
f"file/job — bug in lint?)"
)
continue
file_path, jkey = emitter
raw_text, _ = head_workflows[file_path]
directive = find_directive_for_job(raw_text, jkey)
if directive is None:
violations.append(
f"::error file={file_path}::lint-required-context-exists-in-bp "
f"(Tier 2g): NEW emission `{ctx}` (job '{jkey}') has no "
f"directive comment. Add ONE of these comments in the "
f"{_DIRECTIVE_WINDOW} lines directly above `{jkey}:`:\n"
f" - `# bp-required: yes` — and ensure the context is "
f"already in branch_protections/{branch}.status_check_contexts.\n"
f" - `# bp-required: pending #NNN` — acknowledged asymmetry, "
f"references the tracking issue for the BP PATCH.\n"
f" - `# bp-exempt: <reason>` — informational job, not a gate.\n"
f"Memory: internal#350 (PR#656 + mc#664 empirical case)."
)
continue
kind, value = directive
if kind == "exempt":
print(f"::notice::{ctx}: bp-exempt directive present, OK.")
continue
if kind == "required-pending":
print(
f"::notice::{ctx}: bp-required: pending #{value}, "
f"acknowledged asymmetry, OK."
)
continue
if kind == "required-yes":
if ctx in bp_contexts:
print(
f"::notice::{ctx}: bp-required: yes, and context is in "
f"BP, OK."
)
else:
violations.append(
f"::error file={file_path}::lint-required-context-exists-in-bp "
f"(Tier 2g): job '{jkey}' has `bp-required: yes` "
f"directive but its emitted context `{ctx}` is NOT in "
f"`branch_protections/{branch}.status_check_contexts`. "
f"FIX: either (a) add `{ctx}` to BP (Owners-tier PATCH), "
f"or (b) downgrade the directive to "
f"`# bp-required: pending #NNN` referencing the tracker "
f"for the pending BP PATCH."
)
if violations:
print(
f"::error::lint-required-context-exists-in-bp: "
f"{len(violations)} violation(s) across "
f"{len(changed_workflows)} changed workflow file(s)."
)
for v in violations:
print(v)
return 1
print(
f"::notice::lint-required-context-exists-in-bp: "
f"{len(new_contexts)} new emission(s) all directive-validated."
)
return 0
if __name__ == "__main__":
sys.exit(run())
@@ -1,823 +0,0 @@
#!/usr/bin/env python3
# sop-checklist-gate — evaluate whether a PR has peer-acked each
# SOP-checklist item. Posts a commit-status that branch protection
# can require.
#
# RFC#351 Step 2 of 6 (implementation MVP).
#
# Invoked by .gitea/workflows/sop-checklist-gate.yml on:
# - pull_request_target: [opened, edited, synchronize, reopened]
# - issue_comment: [created, edited, deleted]
#
# Flow:
# 1. Load .gitea/sop-checklist-config.yaml (from BASE ref — trusted).
# 2. GET /repos/{R}/pulls/{N} — author, head.sha, tier label
# 3. GET /repos/{R}/issues/{N}/comments — extract /sop-ack and /sop-revoke
# 4. For each checklist item:
# a. Is the section marker present in PR body? (author answered)
# b. Is there ≥1 unrevoked /sop-ack from a non-author whose
# team-membership matches required_teams?
# 5. POST /repos/{R}/statuses/{sha} — context
# `sop-checklist / all-items-acked (pull_request)`,
# state=success | failure | pending, description=`acked: N/M …`.
#
# Trust boundary (mirrors RFC#324 §A4):
# This script is loaded from the BASE branch. The workflow's
# actions/checkout step pins ref=base.sha. PR-HEAD code is never
# executed. We only HTTP-call the Gitea API.
#
# Token scope:
# - read:repository / read:organization to enumerate PR + comments
# + team membership (Gitea 1.22.6 quirk: team-membership endpoint
# returns 403 if token owner is not in the team; see review-check.sh
# for the same gotcha — we surface the same fail-closed message).
# - write:repository for `POST /repos/{R}/statuses/{sha}`. Unlike
# RFC#324's pattern (which uses the JOB's own pass/fail as the
# status), we POST the status explicitly because the gate posts
# a single multi-item status with a richer description than a
# bare success/failure context can carry.
#
# Slug normalization rules (canonical form: kebab-case):
# - Lowercase
# - Whitespace + underscores → single dash
# - Strip non [a-z0-9-] characters
# - Collapse adjacent dashes
# - Strip leading/trailing dashes
# - If the result is a digit string (e.g. "1"), look up via
# config.items[*].numeric_alias to get the kebab-case slug.
#
# Examples:
# "Comprehensive_Testing" → "comprehensive-testing"
# "comprehensive testing" → "comprehensive-testing"
# "1" → "comprehensive-testing"
# "Five-Axis-Review" → "five-axis-review"
#
# Revoke semantics:
# /sop-revoke <slug> [reason] — most-recent comment per (slug, user)
# wins. So if Alice posts /sop-ack X then later /sop-revoke X, her ack
# for X is invalidated. Bob's prior /sop-ack X is unaffected. If Alice
# posts /sop-revoke X then later /sop-ack X again, the ack is restored.
from __future__ import annotations
import argparse
import json
import os
import re
import sys
import urllib.error
import urllib.parse
import urllib.request
from typing import Any
# ---------------------------------------------------------------------------
# Slug normalization
# ---------------------------------------------------------------------------
_NORMALIZE_REPLACE_RE = re.compile(r"[\s_]+")
_NORMALIZE_STRIP_RE = re.compile(r"[^a-z0-9-]")
_NORMALIZE_DASH_RE = re.compile(r"-+")
def normalize_slug(raw: str, numeric_aliases: dict[int, str] | None = None) -> str:
"""Normalize a user-supplied slug to canonical kebab-case form.
See module header for the rules.
If the input is a pure digit string AND numeric_aliases is provided,
the alias mapping is consulted. Unknown digits return "" so the caller
can flag the comment as unparseable.
"""
if raw is None:
return ""
s = raw.strip().lower()
s = _NORMALIZE_REPLACE_RE.sub("-", s)
s = _NORMALIZE_STRIP_RE.sub("", s)
s = _NORMALIZE_DASH_RE.sub("-", s)
s = s.strip("-")
if s.isdigit() and numeric_aliases is not None:
return numeric_aliases.get(int(s), "")
return s
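The header examples can be checked with a self-contained copy of the normalizer. This is a sketch of the same rules; the alias table `{1: "comprehensive-testing"}` is an assumed sample, since the real mapping comes from the config's `numeric_alias` entries.

```python
from __future__ import annotations

import re

_REPLACE_RE = re.compile(r"[\s_]+")
_STRIP_RE = re.compile(r"[^a-z0-9-]")
_DASH_RE = re.compile(r"-+")


def normalize_slug(raw: str | None, numeric_aliases: dict[int, str] | None = None) -> str:
    """Normalize a user-supplied slug to kebab-case; digits go through aliases."""
    if raw is None:
        return ""
    s = raw.strip().lower()
    s = _REPLACE_RE.sub("-", s)   # whitespace + underscores -> dash
    s = _STRIP_RE.sub("", s)      # drop anything outside [a-z0-9-]
    s = _DASH_RE.sub("-", s)      # collapse adjacent dashes
    s = s.strip("-")
    if s.isdigit() and numeric_aliases is not None:
        return numeric_aliases.get(int(s), "")
    return s


aliases = {1: "comprehensive-testing"}  # assumed sample alias table
results = [
    normalize_slug("Comprehensive_Testing", aliases),
    normalize_slug("comprehensive testing", aliases),
    normalize_slug("1", aliases),
    normalize_slug("Five-Axis-Review", aliases),
    normalize_slug("9", aliases),
]
```

An unknown digit normalizes to the empty string, which the caller treats as an unparseable directive.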
# ---------------------------------------------------------------------------
# Comment parsing — /sop-ack and /sop-revoke
# ---------------------------------------------------------------------------
# A directive must be on its own line. Permits leading whitespace.
# An optional trailing note may follow the slug for /sop-ack, and an
# optional reason for /sop-revoke (RFC#351 open question 4: the reason is
# captured but not yet validated; a future iteration may require a
# minimum length).
_DIRECTIVE_RE = re.compile(
r"^[ \t]*/(sop-ack|sop-revoke)[ \t]+([A-Za-z0-9_\- ]+?)(?:[ \t]+(.*))?[ \t]*$",
re.MULTILINE,
)
def parse_directives(
    comment_body: str,
    numeric_aliases: dict[int, str],
) -> list[tuple[str, str, str]]:
    """Extract /sop-ack and /sop-revoke directives from a comment body.

    Returns a list of (kind, canonical_slug, note) tuples where:
      kind is "sop-ack" or "sop-revoke"
      canonical_slug is the normalized form (or "" if unparseable)
      note is the trailing free-text (may be "")
    """
    out: list[tuple[str, str, str]] = []
    if not comment_body:
        return out
    for m in _DIRECTIVE_RE.finditer(comment_body):
        kind = m.group(1)
        raw_slug = (m.group(2) or "").strip()
        parts = raw_slug.split()
        if not parts:
            continue
        # group(2) is lazy and is followed by an optional
        # whitespace-delimited note group, so it normally captures only
        # the first token; any further words land in group(3) as the note.
        # The multi-word branch is a defensive fallback: if group(2) ever
        # does contain several words, normalize the WHOLE captured string
        # so "comprehensive testing" becomes "comprehensive-testing".
        if len(parts) > 1:
            canonical = normalize_slug(raw_slug, numeric_aliases)
        else:
            canonical = normalize_slug(parts[0], numeric_aliases)
        note = (m.group(3) or "").strip()
        out.append((kind, canonical, note))
    return out
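To see what the directive regex actually captures, here is a short sketch run against an invented comment body. The pattern is copied from `_DIRECTIVE_RE` above; the comment text is made up.

```python
import re

DIRECTIVE_RE = re.compile(
    r"^[ \t]*/(sop-ack|sop-revoke)[ \t]+([A-Za-z0-9_\- ]+?)(?:[ \t]+(.*))?[ \t]*$",
    re.MULTILINE,
)

comment = "\n".join([
    "LGTM",
    "/sop-ack comprehensive-testing ran the suite locally",
    "  /sop-revoke 1",
    "not a /sop-ack directive",
])

# (kind, raw slug, note) for every line that is a directive on its own line.
found = [
    (m.group(1), m.group(2), m.group(3) or "")
    for m in DIRECTIVE_RE.finditer(comment)
]
```

The lazy slug group stops at the first whitespace whenever more text follows, so a space-separated slug such as `comprehensive testing` is read as slug `comprehensive` plus note `testing`; writing the kebab-case or underscore form keeps the slug intact.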
# ---------------------------------------------------------------------------
# PR body section detection
# ---------------------------------------------------------------------------
def section_marker_present(body: str, marker: str) -> bool:
"""Return True if `marker` appears in `body` case-insensitively
on a non-empty line (i.e. the author actually filled it in).
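We require the marker substring AND non-whitespace content on the
same line OR on the next line. This is meant to stop trivially-empty
checklists like:
## SOP-Checklist
- [ ] **Comprehensive testing performed**:
- [ ] **Local-postgres E2E run**:
from auto-passing the section-present check. Caveat: the next-line
fallback does not distinguish an answer from the following checklist
item, so an empty item followed directly by another item still counts
as present; only an empty item with an empty or missing next line is
rejected. The peer-ack is still required either way; an empty answer
is captured as a soft finding via the section-present test alone.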
"""
if not body or not marker:
return False
body_lower = body.lower()
marker_lower = marker.lower()
idx = body_lower.find(marker_lower)
if idx < 0:
return False
# Walk to end of line.
line_end = body.find("\n", idx)
if line_end < 0:
line_end = len(body)
line = body[idx + len(marker):line_end]
# Strip the colon + checkbox tail patterns; require at least one
# non-whitespace, non-punctuation char.
stripped = re.sub(r"[\s\*:\-\[\]]+", "", line)
if stripped:
return True
# Fall through: check the NEXT line (multi-line answers).
next_line_end = body.find("\n", line_end + 1)
if next_line_end < 0:
next_line_end = len(body)
next_line = body[line_end + 1:next_line_end]
stripped_next = re.sub(r"[\s\*:\-\[\]]+", "", next_line)
return bool(stripped_next)
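A compact, runnable sketch of the section-present check with two invented PR bodies. The logic follows the function above; `_FILLER` is a hypothetical name for the inline punctuation-stripping pattern.

```python
import re

# Characters that do not count as an answer: whitespace, `*`, `:`, `-`, `[`, `]`.
_FILLER = re.compile(r"[\s\*:\-\[\]]+")


def section_marker_present(body: str, marker: str) -> bool:
    """True if the marker is followed by real content on its line or the next."""
    if not body or not marker:
        return False
    idx = body.lower().find(marker.lower())
    if idx < 0:
        return False
    line_end = body.find("\n", idx)
    if line_end < 0:
        line_end = len(body)
    if _FILLER.sub("", body[idx + len(marker):line_end]):
        return True
    # Fall through: look at the next line for a multi-line answer.
    next_end = body.find("\n", line_end + 1)
    if next_end < 0:
        next_end = len(body)
    return bool(_FILLER.sub("", body[line_end + 1:next_end]))


filled = "- [ ] **Comprehensive testing performed**: ran unit + e2e suites\n"
empty_last = "## SOP-Checklist\n- [ ] **Local-postgres E2E run**:\n"

filled_ok = section_marker_present(filled, "Comprehensive testing performed")
empty_ok = section_marker_present(empty_last, "Local-postgres E2E run")
```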
# ---------------------------------------------------------------------------
# Ack-state computation
# ---------------------------------------------------------------------------
def compute_ack_state(
comments: list[dict[str, Any]],
pr_author: str,
items_by_slug: dict[str, dict[str, Any]],
numeric_aliases: dict[int, str],
team_membership_probe: "callable[[str, list[str]], list[str]]",
) -> dict[str, dict[str, Any]]:
"""Compute per-item ack state.
Each comment is processed in chronological order. The most-recent
directive per (commenter, slug) wins.
Returns a dict keyed by canonical slug:
{
"comprehensive-testing": {
"ackers": ["bob"], # non-author, team-verified
"rejected": { # debugging info
"self_ack": ["alice"],
"not_in_team": ["eve"],
}
},
...
}
"""
# Step 1: collapse directives per (commenter, slug) — most recent wins.
# comments are expected to come in chronological order from the
# API (Gitea returns oldest-first by default for issues/{N}/comments).
latest_directive: dict[tuple[str, str], str] = {} # (user, slug) → kind
unparseable_per_user: dict[str, int] = {}
for c in comments:
body = c.get("body", "") or ""
user = (c.get("user") or {}).get("login", "")
if not user:
continue
for kind, slug, _note in parse_directives(body, numeric_aliases):
if not slug:
unparseable_per_user[user] = unparseable_per_user.get(user, 0) + 1
continue
latest_directive[(user, slug)] = kind
# Step 2: build candidate ackers per slug.
# Filter out self-acks and unknown slugs.
ackers_per_slug: dict[str, list[str]] = {s: [] for s in items_by_slug}
rejected_self: dict[str, list[str]] = {s: [] for s in items_by_slug}
rejected_unknown: dict[str, list[str]] = {s: [] for s in items_by_slug}
pending_team_check: dict[str, list[str]] = {s: [] for s in items_by_slug}
for (user, slug), kind in latest_directive.items():
if kind != "sop-ack":
continue # revokes leave the (user,slug) state as "no ack"
if slug not in items_by_slug:
# Slug normalized to something not in our config. It is not
# attributed to any item and is currently dropped without being
# recorded for diagnostics.
continue
if user == pr_author:
rejected_self[slug].append(user)
continue
pending_team_check[slug].append(user)
# Step 3: team membership probe per slug (batched per slug to keep
# API call count down — same user may ack multiple items but the
# required_teams differ per item, so we MUST probe per (user, item)).
rejected_not_in_team: dict[str, list[str]] = {s: [] for s in items_by_slug}
for slug, candidates in pending_team_check.items():
if not candidates:
continue
required = items_by_slug[slug]["required_teams"]
approved = team_membership_probe(slug, candidates) # returns subset
rejected_not_in_team[slug] = [u for u in candidates if u not in approved]
ackers_per_slug[slug] = approved
# Stash required teams for description rendering.
items_by_slug[slug]["_required_resolved"] = required
return {
slug: {
"ackers": ackers_per_slug[slug],
"rejected": {
"self_ack": rejected_self[slug],
"not_in_team": rejected_not_in_team[slug],
},
}
for slug in items_by_slug
}
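The "most recent directive per (commenter, slug) wins" rule can be sketched in isolation. This is a simplified illustration under stated assumptions: `ackers_by_slug` is a hypothetical helper, the events are invented, and the team-membership probe is left out entirely.

```python
def ackers_by_slug(events: list[tuple[str, str, str]], pr_author: str) -> dict[str, list[str]]:
    """events are chronological (user, slug, kind) tuples.

    Later directives overwrite earlier ones for the same (user, slug);
    self-acks by the PR author are dropped.
    """
    latest: dict[tuple[str, str], str] = {}
    for user, slug, kind in events:
        latest[(user, slug)] = kind
    out: dict[str, list[str]] = {}
    for (user, slug), kind in latest.items():
        if kind == "sop-ack" and user != pr_author:
            out.setdefault(slug, []).append(user)
    return out


events = [
    ("alice", "comprehensive-testing", "sop-ack"),
    ("bob", "comprehensive-testing", "sop-ack"),
    ("alice", "comprehensive-testing", "sop-revoke"),  # cancels alice's ack only
    ("carol", "five-axis-review", "sop-ack"),
    ("dave", "five-axis-review", "sop-ack"),           # dave is the PR author
]
acks = ackers_by_slug(events, pr_author="dave")
```

In the real function the surviving candidates still have to pass the per-item team check before they count as ackers.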
# ---------------------------------------------------------------------------
# Gitea API client
# ---------------------------------------------------------------------------
class GiteaClient:
    def __init__(self, host: str, token: str):
        self.base = f"https://{host}/api/v1"
        self.token = token
        # Cache team-name → team-id resolutions per org.
        self._team_id_cache: dict[tuple[str, str], int | None] = {}

    def _req(
        self,
        method: str,
        path: str,
        body: dict[str, Any] | None = None,
        ok_codes: tuple[int, ...] = (200, 201, 204),
    ) -> tuple[int, Any]:
        # NOTE: ok_codes is accepted but never consulted here; callers
        # inspect the returned status code themselves.
        url = self.base + path
        data = None
        headers = {
            "Authorization": f"token {self.token}",
            "Accept": "application/json",
        }
        if body is not None:
            data = json.dumps(body).encode("utf-8")
            headers["Content-Type"] = "application/json"
        req = urllib.request.Request(url, method=method, data=data, headers=headers)
        try:
            with urllib.request.urlopen(req, timeout=20) as r:
                raw = r.read()
                code = r.getcode()
        except urllib.error.HTTPError as e:
            # Non-2xx responses still carry a body worth reporting.
            code = e.code
            raw = e.read()
        try:
            parsed = json.loads(raw.decode("utf-8")) if raw else None
        except json.JSONDecodeError:
            # Fall back to the raw text when the body is not JSON.
            parsed = raw.decode("utf-8", errors="replace") if raw else None
        return code, parsed

    def get_pr(self, owner: str, repo: str, pr: int) -> dict[str, Any]:
        code, data = self._req("GET", f"/repos/{owner}/{repo}/pulls/{pr}")
        if code != 200:
            raise RuntimeError(f"GET pulls/{pr} → HTTP {code}: {data!r}")
        return data

    def get_issue_comments(
        self, owner: str, repo: str, issue: int
    ) -> list[dict[str, Any]]:
        # Paginate. Gitea default page size 50.
        out: list[dict[str, Any]] = []
        page = 1
        while True:
            code, data = self._req(
                "GET",
                f"/repos/{owner}/{repo}/issues/{issue}/comments?limit=50&page={page}",
            )
            if code != 200:
                raise RuntimeError(
                    f"GET issues/{issue}/comments page={page} → HTTP {code}: {data!r}"
                )
            if not data:
                break
            out.extend(data)
            # A short page means this was the last one.
            if len(data) < 50:
                break
            page += 1
        return out

    def resolve_team_id(self, org: str, team_name: str) -> int | None:
        key = (org, team_name)
        if key in self._team_id_cache:
            return self._team_id_cache[key]
        code, data = self._req("GET", f"/orgs/{org}/teams/search?q={urllib.parse.quote(team_name)}")
        team_id = None
        # The search endpoint may wrap results in {"data": [...]} ...
        if code == 200 and isinstance(data, dict):
            for t in data.get("data", []):
                if t.get("name") == team_name:
                    team_id = t.get("id")
                    break
        # ... or return a bare list; accept both shapes.
        if team_id is None and code == 200 and isinstance(data, list):
            for t in data:
                if t.get("name") == team_name:
                    team_id = t.get("id")
                    break
        self._team_id_cache[key] = team_id
        return team_id

    def is_team_member(self, team_id: int, login: str) -> bool | None:
        """Return True / False / None (unknown: any other status, typically 403)."""
        code, _ = self._req(
            "GET", f"/teams/{team_id}/members/{urllib.parse.quote(login)}"
        )
        if code in (200, 204):
            return True
        if code == 404:
            return False
        # 403 means the token owner isn't in this team, so the API
        # refuses to confirm membership. Fail-closed at the caller.
        return None

    def post_status(
        self,
        owner: str,
        repo: str,
        sha: str,
        state: str,
        context: str,
        description: str,
        target_url: str = "",
    ) -> None:
        body = {
            "state": state,
            "context": context,
            "description": description[:140],  # Gitea truncates to 255 but be safe
            "target_url": target_url or "",
        }
        code, data = self._req(
            "POST",
            f"/repos/{owner}/{repo}/statuses/{sha}",
            body=body,
            ok_codes=(201,),
        )
        if code not in (200, 201):
            raise RuntimeError(
                f"POST statuses/{sha} → HTTP {code}: {data!r}"
            )
# ---------------------------------------------------------------------------
# Config loader (PyYAML optional; config file is intentionally tiny + flat)
# ---------------------------------------------------------------------------
def load_config(path: str) -> dict[str, Any]:
    """Load .gitea/sop-checklist-config.yaml.

    Uses PyYAML if available, otherwise falls back to a built-in
    minimal parser sufficient for our flat config shape. Bundling
    PyYAML on the runner is one apt install away but we avoid the
    dep by keeping the config shape constrained.
    """
    try:
        import yaml  # type: ignore[import-not-found]
        with open(path) as f:
            return yaml.safe_load(f)
    except ImportError:
        return _load_config_minimal(path)


def _load_config_minimal(path: str) -> dict[str, Any]:
    """Minimal YAML subset parser for our config shape.

    Supports: top-level scalar:value, top-level map-of-map (e.g.
    tier_failure_mode), top-level list of maps (items:), and within an
    item map: scalars + lists of scalars. Does NOT support nested lists,
    YAML anchors, multi-doc, or flow style.
    """
    with open(path) as f:
        lines = f.readlines()
    return _parse_minimal_yaml(lines)
def _parse_minimal_yaml(lines: list[str]) -> dict[str, Any]:  # noqa: C901
    """Hand-rolled subset parser. See _load_config_minimal docstring."""
    # Strip comments + blank lines but preserve indentation.
    cleaned: list[tuple[int, str]] = []
    for raw in lines:
        # NOTE: quoting is not tracked here. A "#" that starts the line
        # or follows whitespace is treated as a comment even inside a
        # quoted value; only a "#" glued to preceding text survives.
        body = raw.rstrip("\n")
        # Remove trailing comment.
        idx = body.find("#")
        if idx >= 0 and (idx == 0 or body[idx - 1] in " \t"):
            body = body[:idx].rstrip()
        if not body.strip():
            continue
        indent = len(body) - len(body.lstrip(" "))
        cleaned.append((indent, body.strip()))

    root: dict[str, Any] = {}
    i = 0
    n = len(cleaned)

    def parse_scalar(s: str) -> Any:
        s = s.strip()
        if s.startswith('"') and s.endswith('"'):
            return s[1:-1]
        if s.startswith("'") and s.endswith("'"):
            return s[1:-1]
        if s.lower() in ("true", "yes"):
            return True
        if s.lower() in ("false", "no"):
            return False
        try:
            return int(s)
        except ValueError:
            pass
        return s

    def parse_inline_list(s: str) -> list[Any]:
        s = s.strip()
        if not (s.startswith("[") and s.endswith("]")):
            return [parse_scalar(s)]
        inner = s[1:-1]
        if not inner.strip():
            return []
        return [parse_scalar(x.strip()) for x in inner.split(",")]

    while i < n:
        indent, line = cleaned[i]
        # Only top-level keys start a new entry; stray lines are skipped.
        if indent != 0:
            i += 1
            continue
        if ":" not in line:
            i += 1
            continue
        key, _, rest = line.partition(":")
        key = key.strip()
        rest = rest.strip()
        if rest == "":
            # Block: could be a map or a list.
            i += 1
            # Look ahead for first child.
            if i < n and cleaned[i][1].startswith("- "):
                # List of items.
                items: list[Any] = []
                while i < n and cleaned[i][0] > indent and cleaned[i][1].startswith("- "):
                    item_indent = cleaned[i][0]
                    first_kv = cleaned[i][1][2:].strip()  # strip "- "
                    item: dict[str, Any] = {}
                    if ":" in first_kv:
                        k, _, v = first_kv.partition(":")
                        k = k.strip()
                        v = v.strip()
                        if v == "":
                            item[k] = ""
                        elif v.startswith(">"):  # covers both ">" and ">-"
                            # Folded scalar continues on subsequent indented lines
                            collected: list[str] = []
                            i += 1
                            while i < n and cleaned[i][0] > item_indent:
                                collected.append(cleaned[i][1])
                                i += 1
                            item[k] = " ".join(collected)
                            items.append(item)
                            continue
                        elif v.startswith("["):
                            item[k] = parse_inline_list(v)
                        else:
                            item[k] = parse_scalar(v)
                    i += 1
                    # Subsequent k:v lines at deeper indent belong to this item.
                    while i < n and cleaned[i][0] > item_indent and not cleaned[i][1].startswith("- "):
                        sub_indent, sub_line = cleaned[i]
                        if ":" in sub_line:
                            k, _, v = sub_line.partition(":")
                            k = k.strip()
                            v = v.strip()
                            if v == "":
                                item[k] = ""
                                i += 1
                            elif v.startswith(">"):  # covers both ">" and ">-"
                                collected = []
                                i += 1
                                while i < n and cleaned[i][0] > sub_indent:
                                    collected.append(cleaned[i][1])
                                    i += 1
                                item[k] = " ".join(collected)
                            elif v.startswith("["):
                                item[k] = parse_inline_list(v)
                                i += 1
                            else:
                                item[k] = parse_scalar(v)
                                i += 1
                        else:
                            i += 1
                    items.append(item)
                root[key] = items
            else:
                # Sub-map.
                submap: dict[str, Any] = {}
                while i < n and cleaned[i][0] > indent:
                    sub_indent, sub_line = cleaned[i]
                    if ":" in sub_line:
                        k, _, v = sub_line.partition(":")
                        k = k.strip().strip('"').strip("'")
                        v = v.strip()
                        if v.startswith("[") and v.endswith("]"):
                            submap[k] = parse_inline_list(v)
                        else:
                            submap[k] = parse_scalar(v)
                    i += 1
                root[key] = submap
        else:
            # Inline scalar or list.
            if rest.startswith("[") and rest.endswith("]"):
                root[key] = parse_inline_list(rest)
            else:
                root[key] = parse_scalar(rest)
            i += 1
    return root
# ---------------------------------------------------------------------------
# Main entry point
# ---------------------------------------------------------------------------
def render_status(
    items: list[dict[str, Any]],
    ack_state: dict[str, dict[str, Any]],
    body_state: dict[str, bool],
) -> tuple[str, str]:
    """Return (state, description) for the commit-status post.

    state is "success" if every item has at least one valid ack
    (body section presence is informational only; peer-ack is the
    real gate). "pending" is reserved for the soft-fail path
    (tier:low) and is set by the caller.
    """
    n = len(items)
    fully_acked = [
        it["slug"] for it in items if ack_state[it["slug"]]["ackers"]
    ]
    missing = [
        it["slug"] for it in items if not ack_state[it["slug"]]["ackers"]
    ]
    missing_body = [it["slug"] for it in items if not body_state.get(it["slug"], False)]
    desc_parts = [f"acked: {len(fully_acked)}/{n}"]
    if missing:
        # Show up to 3 missing slugs to stay inside the 140-char budget.
        shown = ", ".join(missing[:3])
        if len(missing) > 3:
            shown += f", +{len(missing) - 3}"
        desc_parts.append(f"missing: {shown}")
    if missing_body:
        desc_parts.append(f"body-unfilled: {len(missing_body)}")
    state = "success" if not missing else "failure"
    # Join with a visible separator; an empty separator would fuse the
    # parts together (e.g. "acked: 1/3missing: ...").
    return state, "; ".join(desc_parts)
def get_tier_mode(pr: dict[str, Any], cfg: dict[str, Any]) -> str:
    """Read tier label, return 'hard' or 'soft' per cfg.tier_failure_mode."""
    labels = pr.get("labels") or []
    tier_labels = [
        lbl.get("name", "")
        for lbl in labels
        if (lbl.get("name", "") or "").startswith("tier:")
    ]
    mode_map = cfg.get("tier_failure_mode") or {}
    default_mode = cfg.get("default_mode", "hard")
    # First tier label with a configured mode wins.
    for tl in tier_labels:
        if tl in mode_map:
            return mode_map[tl]
    return default_mode
def main(argv: list[str] | None = None) -> int:
p = argparse.ArgumentParser()
p.add_argument("--owner", required=True)
p.add_argument("--repo", required=True)
p.add_argument("--pr", type=int, required=True)
p.add_argument("--config", default=".gitea/sop-checklist-config.yaml")
p.add_argument("--gitea-host", default="git.moleculesai.app")
p.add_argument(
"--dry-run",
action="store_true",
help="Compute state but do not POST the status.",
)
p.add_argument(
"--status-context",
default="sop-checklist / all-items-acked (pull_request)",
)
p.add_argument(
"--exit-on-state",
action="store_true",
help=(
"If set, exit non-zero when state=failure. Default OFF so the "
"job-level conclusion is independent of ack-state; the only "
"thing branch protection (BP) sees is the POSTed status. "
"Useful for local debugging."
),
)
args = p.parse_args(argv)
token = os.environ.get("GITEA_TOKEN", "")
if not token and not args.dry_run:
print("::error::GITEA_TOKEN env required", file=sys.stderr)
return 2
cfg = load_config(args.config)
items: list[dict[str, Any]] = cfg["items"]
items_by_slug = {it["slug"]: it for it in items}
numeric_aliases = {
int(it["numeric_alias"]): it["slug"] for it in items if it.get("numeric_alias")
}
client = GiteaClient(args.gitea_host, token) if token else None
if not client:
print("::error::No client (dry-run without token has nothing to do)", file=sys.stderr)
return 2
pr = client.get_pr(args.owner, args.repo, args.pr)
if pr.get("state") != "open":
print(f"::notice::PR #{args.pr} is {pr.get('state')} — gate is a no-op")
return 0
author = (pr.get("user") or {}).get("login", "")
head_sha = (pr.get("head") or {}).get("sha", "")
body = pr.get("body", "") or ""
if not author or not head_sha:
print("::error::PR payload missing user.login or head.sha", file=sys.stderr)
return 1
comments = client.get_issue_comments(args.owner, args.repo, args.pr)
# Build team-membership probe closure that caches results per
# (user, team-id) so a user acking multiple items only triggers
# one membership lookup per team.
team_member_cache: dict[tuple[str, int], bool | None] = {}
def probe(slug: str, users: list[str]) -> list[str]:
item = items_by_slug[slug]
team_names: list[str] = item["required_teams"]
# Resolve names → ids. NOTE: orgs/{org}/teams/search may not be
# available — fall back to the list endpoint.
team_ids: list[int] = []
for tn in team_names:
tid = client.resolve_team_id(args.owner, tn)
if tid is None:
# Try the list endpoint as a fallback.
code, data = client._req( # noqa: SLF001
"GET", f"/orgs/{args.owner}/teams"
)
if code == 200 and isinstance(data, list):
for t in data:
if t.get("name") == tn:
tid = t.get("id")
client._team_id_cache[(args.owner, tn)] = tid # noqa: SLF001
break
if tid is not None:
team_ids.append(tid)
else:
print(
f"::warning::could not resolve team-id for '{tn}' "
f"in org '{args.owner}' — item '{slug}' will fail closed",
file=sys.stderr,
)
approved: list[str] = []
for u in users:
for tid in team_ids:
cache_key = (u, tid)
if cache_key not in team_member_cache:
team_member_cache[cache_key] = client.is_team_member(tid, u)
result = team_member_cache[cache_key]
if result is True:
approved.append(u)
break
if result is None:
print(
f"::warning::team-probe for {u} in team-id {tid} returned 403 "
"(token owner not in that team — fail-closed per RFC#324)",
file=sys.stderr,
)
# Treat as not-in-team for this user/team pair; loop
# may still find membership in another team.
return approved
ack_state = compute_ack_state(comments, author, items_by_slug, numeric_aliases, probe)
body_state = {it["slug"]: section_marker_present(body, it["pr_section_marker"]) for it in items}
state, description = render_status(items, ack_state, body_state)
mode = get_tier_mode(pr, cfg)
if state == "failure" and mode == "soft":
state = "pending"
description = f"[soft-fail tier:low] {description}"
# Diagnostics to job log.
print(f"::notice::PR #{args.pr} author={author} head={head_sha[:7]} mode={mode}")
for it in items:
slug = it["slug"]
ackers = ack_state[slug]["ackers"]
if ackers:
print(f"::notice:: [PASS] {slug} — acked by {','.join(ackers)}")
else:
r = ack_state[slug]["rejected"]
extras: list[str] = []
if r["self_ack"]:
extras.append(f"self-acks-rejected:{','.join(r['self_ack'])}")
if r["not_in_team"]:
extras.append(f"not-in-team:{','.join(r['not_in_team'])}")
extra = " (" + "; ".join(extras) + ")" if extras else ""
print(f"::notice:: [WAIT] {slug} — no valid peer-ack yet{extra}")
print(f"::notice::posting status: state={state} desc={description!r}")
if args.dry_run:
print("::notice::--dry-run: not posting status")
if args.exit_on_state:
return 0 if state in ("success", "pending") else 1
return 0
target_url = f"https://{args.gitea_host}/{args.owner}/{args.repo}/pulls/{args.pr}"
client.post_status(
args.owner, args.repo, head_sha,
state=state, context=args.status_context,
description=description, target_url=target_url,
)
print(f"::notice::status posted: {args.status_context}: {state}")
# By default exit 0 — the POSTed status IS the gate, NOT the job
# conclusion. If the job exits 1 BP will see TWO failure signals
# (one from the job's auto-status, one from our POST), making the
# description less actionable. --exit-on-state restores the old
# behavior for local debugging.
if args.exit_on_state:
return 0 if state in ("success", "pending") else 1
return 0
if __name__ == "__main__":
sys.exit(main())
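
The comment-stripping step in `_parse_minimal_yaml` above is easy to misread, so here is a minimal standalone sketch of the same rule. `strip_trailing_comment` and the sample config lines are hypothetical names made up for illustration; only the `find("#")` plus preceding-whitespace check is taken from the script. It suggests that the fallback parser does not track quoting, so a `#` after a space is cut even inside a quoted value.

```python
def strip_trailing_comment(line: str) -> str:
    """Drop a '#' comment when the first '#' starts the line or follows whitespace.

    Hypothetical helper: mirrors the rule used by the fallback parser,
    which only ever inspects the FIRST '#' on the line.
    """
    body = line.rstrip("\n")
    idx = body.find("#")
    if idx >= 0 and (idx == 0 or body[idx - 1] in " \t"):
        body = body[:idx].rstrip()
    return body


# Illustrative config lines (not taken from the real config file).
samples = [
    "slug: security-review  # trailing comment",
    "marker: section#anchor",
    'note: "see issue #12"',
]

for sample in samples:
    print(repr(strip_trailing_comment(sample)))
```

Under this rule the first line loses its comment, the second is kept whole because the `#` is glued to the preceding text, and the third is truncated inside the quotes. Values that need a literal ` #` are therefore safer parsed with PyYAML than with the fallback.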
@@ -1,505 +0,0 @@
"""Unit tests for .gitea/scripts/lint_pre_flip_continue_on_error.py.
These tests pin the pure-logic surface (flip detection + per-flip
verdict aggregation) without making real HTTP calls. The end-to-end
git ls-tree + Gitea API path is exercised by running the workflow
against real PRs.
Run locally::
python3 -m unittest .gitea/scripts/tests/test_lint_pre_flip_continue_on_error.py -v
Mirrors the pattern in scripts/ops/test_check_migration_collisions.py
+ scripts/test_build_runtime_package.py.
"""
from __future__ import annotations
import importlib.util
import os
import sys
import unittest
from pathlib import Path
from unittest import mock
# Load the script as a module without invoking main(). Tests must NOT
# depend on the full runtime env contract (GITEA_TOKEN etc.), so we
# import individual functions and stub the network surface explicitly.
SCRIPT_PATH = Path(__file__).resolve().parent.parent / "lint_pre_flip_continue_on_error.py"
spec = importlib.util.spec_from_file_location("lpfc", SCRIPT_PATH)
lpfc = importlib.util.module_from_spec(spec)
spec.loader.exec_module(lpfc)
# --------------------------------------------------------------------------
# Fixtures: minimal valid workflow YAML on each side of a "diff"
# --------------------------------------------------------------------------
CI_YML_BASE = """\
name: CI
on:
  push:
    branches: [main]
jobs:
  platform-build:
    name: Platform (Go)
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - run: echo platform
  canvas-build:
    name: Canvas (Next.js)
    runs-on: ubuntu-latest
    continue-on-error: true
    steps:
      - run: echo canvas
  all-required:
    runs-on: ubuntu-latest
    continue-on-error: true
    needs: [platform-build, canvas-build]
    steps:
      - run: echo ok
"""
CI_YML_HEAD_FLIPPED = """\
name: CI
on:
  push:
    branches: [main]
jobs:
  platform-build:
    name: Platform (Go)
    runs-on: ubuntu-latest
    continue-on-error: false
    steps:
      - run: echo platform
  canvas-build:
    name: Canvas (Next.js)
    runs-on: ubuntu-latest
    continue-on-error: false
    steps:
      - run: echo canvas
  all-required:
    runs-on: ubuntu-latest
    continue-on-error: true
    needs: [platform-build, canvas-build]
    steps:
      - run: echo ok
"""
CI_YML_HEAD_NO_DIFF = CI_YML_BASE # identical to base, no flip
# --------------------------------------------------------------------------
# 1. CoE coercion (truthy/falsy/quoted/absent)
# --------------------------------------------------------------------------
class TestCoerceCoE(unittest.TestCase):
def test_python_bool_true(self):
self.assertTrue(lpfc._coerce_coe(True))
def test_python_bool_false(self):
self.assertFalse(lpfc._coerce_coe(False))
def test_none_is_false(self):
# GitHub Actions default: absent == false.
self.assertFalse(lpfc._coerce_coe(None))
def test_string_true_lowercase(self):
# Quoted "true" in YAML — Gitea Actions normalizes to True.
self.assertTrue(lpfc._coerce_coe("true"))
def test_string_True_titlecase(self):
self.assertTrue(lpfc._coerce_coe("True"))
def test_string_yes(self):
# YAML 1.1 truthy form.
self.assertTrue(lpfc._coerce_coe("yes"))
def test_string_false(self):
self.assertFalse(lpfc._coerce_coe("false"))
def test_string_random_falsy(self):
# An unrecognized string is treated as falsy — safer than
# silently coercing "maybe" to True and false-positiving a
# flip.
self.assertFalse(lpfc._coerce_coe("maybe"))
# --------------------------------------------------------------------------
# 2. Diff detection — flips, not arbitrary changes
# --------------------------------------------------------------------------
class TestDetectFlips(unittest.TestCase):
def test_no_flip_in_diff_passes(self):
# Acceptance test #1: PR doesn't flip continue-on-error → 0 flips.
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": CI_YML_HEAD_NO_DIFF},
)
self.assertEqual(flips, [])
def test_flip_detected_in_one_file(self):
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": CI_YML_HEAD_FLIPPED},
)
# Two jobs flipped: platform-build, canvas-build. all-required
# is still true on both sides.
self.assertEqual(len(flips), 2)
keys = sorted(f["job_key"] for f in flips)
self.assertEqual(keys, ["canvas-build", "platform-build"])
def test_context_name_render(self):
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": CI_YML_HEAD_FLIPPED},
)
platform = next(f for f in flips if f["job_key"] == "platform-build")
self.assertEqual(platform["context"], "CI / Platform (Go) (push)")
self.assertEqual(platform["workflow_name"], "CI")
def test_context_falls_back_to_job_key_when_no_name(self):
base = "name: WF\njobs:\n  foo:\n    continue-on-error: true\n    runs-on: x\n    steps: []\n"
head = "name: WF\njobs:\n  foo:\n    continue-on-error: false\n    runs-on: x\n    steps: []\n"
flips = lpfc.detect_flips({"a.yml": base}, {"a.yml": head})
self.assertEqual(len(flips), 1)
self.assertEqual(flips[0]["context"], "WF / foo (push)")
def test_no_flip_when_only_one_side_has_file(self):
# Newly added workflow file — head has CoE:false, base has no
# file. Adding a new workflow with CoE:false is fine; there's
# nothing to mask.
flips = lpfc.detect_flips(
{}, # base has no workflow files
{".gitea/workflows/new.yml": CI_YML_HEAD_FLIPPED},
)
self.assertEqual(flips, [])
def test_no_flip_when_job_removed(self):
# Job exists on base, not on head — a removal, not a flip.
head = """\
name: CI
jobs:
  canvas-build:
    name: Canvas (Next.js)
    continue-on-error: true
    runs-on: ubuntu-latest
    steps: []
"""
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": head},
)
self.assertEqual(flips, [])
def test_no_flip_when_job_added_with_false(self):
# New job on head with CoE:false — no base side; not a flip.
head_with_new = CI_YML_BASE.replace(
"  all-required:",
"  newjob:\n    name: New Job\n    continue-on-error: false\n"
"    runs-on: x\n    steps: []\n"
"  all-required:",
)
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": head_with_new},
)
self.assertEqual(flips, [])
def test_yaml_parse_error_warns_not_raises(self):
# Malformed YAML on head — should warn (stderr) and skip,
# not raise.
bad_head = "name: CI\njobs:\n :::\n"
# Capture stderr so the test isn't noisy.
with mock.patch.object(sys, "stderr"):
flips = lpfc.detect_flips(
{".gitea/workflows/ci.yml": CI_YML_BASE},
{".gitea/workflows/ci.yml": bad_head},
)
self.assertEqual(flips, [])
# --------------------------------------------------------------------------
# 3. grep_fail_markers — the regex / substring matcher
# --------------------------------------------------------------------------
class TestGrepFailMarkers(unittest.TestCase):
def test_clean_log_returns_empty(self):
log = "===== test run starting =====\nPASS\nok example.com/foo 1.234s\n"
self.assertEqual(lpfc.grep_fail_markers(log), [])
def test_go_minus_minus_minus_fail_caught(self):
log = "ok example.com/foo 1.234s\n--- FAIL: TestBar (0.01s)\n bar_test.go:42:\n"
matches = lpfc.grep_fail_markers(log)
self.assertEqual(len(matches), 1)
self.assertIn("FAIL: TestBar", matches[0])
def test_go_package_fail_caught(self):
log = "FAIL\texample.com/baz\t1.234s\n"
matches = lpfc.grep_fail_markers(log)
self.assertEqual(len(matches), 1)
self.assertIn("FAIL", matches[0])
def test_bash_error_directive_caught(self):
# `lint-curl-status-capture` pattern: a python heredoc inside a
# bash step that prints `::error::` then sys.exit(1). With
# continue-on-error:true the job rolls up as success despite
# this line. THAT's the masking we're trying to catch.
log = "Running scan...\n::error::Found 3 curl-status-capture pollution site(s):\n"
matches = lpfc.grep_fail_markers(log)
self.assertEqual(len(matches), 1)
self.assertIn("::error::", matches[0])
def test_caps_matches_at_max_5(self):
log = "\n".join(["--- FAIL: T%d" % i for i in range(20)])
matches = lpfc.grep_fail_markers(log)
self.assertEqual(len(matches), 5)
# --------------------------------------------------------------------------
# 4. verify_flip — single-flip verdict assembly (network surface stubbed)
# --------------------------------------------------------------------------
def _stub_status(context: str, state: str, target_url: str = "/owner/repo/actions/runs/1/jobs/0") -> dict:
"""Build a single-context combined-status response."""
return {
"state": state,
"statuses": [
{"context": context, "status": state, "target_url": target_url, "description": ""}
],
}
FLIP_FIXTURE = {
"workflow_path": ".gitea/workflows/ci.yml",
"workflow_name": "CI",
"job_key": "platform-build",
"job_name": "Platform (Go)",
"context": "CI / Platform (Go) (push)",
}
class TestVerifyFlip(unittest.TestCase):
def test_flip_with_clean_history_passes(self):
# Acceptance test #2: flip detected, last 5 runs clean → exit 0.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2", "sha3"]):
with mock.patch.object(
lpfc, "combined_status",
side_effect=[_stub_status(FLIP_FIXTURE["context"], "success") for _ in range(3)],
):
with mock.patch.object(lpfc, "fetch_log", return_value="ok example.com/foo 1s\nPASS\n"):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(verdict["fail_runs"], [])
self.assertEqual(verdict["masked_runs"], [])
self.assertEqual(verdict["checked_commits"], 3)
self.assertEqual(verdict["warnings"], [])
def test_flip_with_recent_fail_blocks(self):
# Acceptance test #3: flip detected, recent run has --- FAIL → exit 1.
# Setup: 3 commits, the most recent run's log shows --- FAIL
# but the STATUS is success (Quirk #10 mask). That's the
# masked_runs case.
log_with_fail = "ok example.com/foo 1s\n--- FAIL: TestSqlmock (0.01s)\n sqlmock_test.go:42:\n"
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2", "sha3"]):
with mock.patch.object(
lpfc, "combined_status",
side_effect=[_stub_status(FLIP_FIXTURE["context"], "success") for _ in range(3)],
):
with mock.patch.object(lpfc, "fetch_log", side_effect=[log_with_fail, "PASS\n", "PASS\n"]):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(len(verdict["masked_runs"]), 1)
self.assertEqual(verdict["masked_runs"][0]["sha"], "sha1")
self.assertTrue(any("TestSqlmock" in s for s in verdict["masked_runs"][0]["samples"]))
self.assertEqual(verdict["fail_runs"], [])
def test_red_status_alone_blocks(self):
# Status itself is `failure` — block without needing log
# markers. (Belt-and-braces: even with a clean log, a `failure`
# status means the job's exit code was non-zero.)
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]):
with mock.patch.object(
lpfc, "combined_status",
return_value=_stub_status(FLIP_FIXTURE["context"], "failure"),
):
with mock.patch.object(lpfc, "fetch_log", return_value="some unrelated text\n"):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(len(verdict["fail_runs"]), 1)
self.assertEqual(verdict["fail_runs"][0]["status"], "failure")
def test_unreadable_log_warns_not_blocks(self):
# Acceptance test #5: log fetch 404 (None) → warn, not block.
# Status is `success`, log is None — we can't tell, so we warn
# and allow.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]):
with mock.patch.object(
lpfc, "combined_status",
return_value=_stub_status(FLIP_FIXTURE["context"], "success"),
):
with mock.patch.object(lpfc, "fetch_log", return_value=None):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(verdict["fail_runs"], [])
self.assertEqual(verdict["masked_runs"], [])
self.assertTrue(any("log unavailable" in w for w in verdict["warnings"]))
def test_unreadable_log_with_failure_status_still_blocks(self):
# Edge case: log fetch fails BUT the status itself is `failure`.
# We can still block — the status alone is sufficient signal,
# we don't need the log to confirm.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1"]):
with mock.patch.object(
lpfc, "combined_status",
return_value=_stub_status(FLIP_FIXTURE["context"], "failure"),
):
with mock.patch.object(lpfc, "fetch_log", return_value=None):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(len(verdict["fail_runs"]), 1)
self.assertIn("log unavailable", verdict["fail_runs"][0]["samples"][0])
def test_zero_runs_history_warns_allows(self):
# No commits with a matching context — newly added workflow.
# Allow with warning.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=["sha1", "sha2"]):
with mock.patch.object(
lpfc, "combined_status",
return_value={"state": "success", "statuses": []}, # no matching context
):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(verdict["checked_commits"], 0)
self.assertEqual(verdict["fail_runs"], [])
self.assertEqual(verdict["masked_runs"], [])
self.assertTrue(any("no runs of" in w for w in verdict["warnings"]))
def test_zero_commits_warns_allows(self):
# Empty branch (newly created repo, e.g.). Allow with warning.
with mock.patch.object(lpfc, "recent_commits_on_branch", return_value=[]):
verdict = lpfc.verify_flip(FLIP_FIXTURE, "main", 5)
self.assertEqual(verdict["checked_commits"], 0)
self.assertEqual(verdict["fail_runs"], [])
self.assertEqual(verdict["masked_runs"], [])
self.assertTrue(any("no recent commits" in w for w in verdict["warnings"]))
# --------------------------------------------------------------------------
# 5. Multiple-flip aggregation in main()
# --------------------------------------------------------------------------
class TestMainAggregation(unittest.TestCase):
"""Tests that `main()` aggregates multiple flips and exits 1 when
ANY one of them has a masked or red recent run. Acceptance test #4.
We stub at the verify_flip + workflows_at_sha + _require_runtime_env
boundary so we don't need real git or HTTP.
"""
def setUp(self):
# The actual env values are irrelevant — _require_runtime_env
# is stubbed out — but the module reads OWNER/NAME at import
# time. Patch the runtime env contract to a no-op for the
# duration of each test.
self._patches = [
mock.patch.object(lpfc, "_require_runtime_env", return_value=None),
mock.patch.object(lpfc, "BASE_REF", "main"),
mock.patch.object(lpfc, "BASE_SHA", "deadbeefcafe"),
mock.patch.object(lpfc, "HEAD_SHA", "feedfaceabad"),
mock.patch.object(lpfc, "RECENT_COMMITS_N", 5),
]
for p in self._patches:
p.start()
self.addCleanup(lambda: [p.stop() for p in self._patches])
def test_multiple_flips_aggregated_one_bad_blocks(self):
# PR flips 3 jobs; 1 has a recent fail → exit 1, naming that job.
flips = [
{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
"job_key": "platform-build", "job_name": "Platform (Go)",
"context": "CI / Platform (Go) (push)"},
{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
"job_key": "canvas-build", "job_name": "Canvas (Next.js)",
"context": "CI / Canvas (Next.js) (push)"},
{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
"job_key": "python-lint", "job_name": "Python Lint & Test",
"context": "CI / Python Lint & Test (push)"},
]
clean = {"flip": flips[0], "checked_commits": 5, "masked_runs": [],
"fail_runs": [], "warnings": []}
bad = {"flip": flips[1], "checked_commits": 5,
"masked_runs": [{"sha": "abc1234567", "status": "success",
"target_url": "/x/y/actions/runs/1/jobs/0",
"samples": ["--- FAIL: TestSqlmock"]}],
"fail_runs": [], "warnings": []}
also_clean = {"flip": flips[2], "checked_commits": 5, "masked_runs": [],
"fail_runs": [], "warnings": []}
with mock.patch.object(lpfc, "workflows_at_sha", return_value={}):
with mock.patch.object(lpfc, "detect_flips", return_value=flips):
with mock.patch.object(lpfc, "verify_flip",
side_effect=[clean, bad, also_clean]):
# Capture stdout to assert on naming.
captured = []
with mock.patch("builtins.print", side_effect=lambda *a, **k: captured.append(" ".join(str(x) for x in a))):
rc = lpfc.main([])
self.assertEqual(rc, 1)
# The blocking error message must name the failing job.
joined = "\n".join(captured)
self.assertIn("canvas-build", joined)
# And it must mention the empirical class so a reviewer can
# cross-link the right RFC.
self.assertTrue("mc#664" in joined or "PR#656" in joined)
def test_no_flips_in_diff_exits_zero(self):
# Acceptance test #1 at main() level: empty flips → exit 0.
with mock.patch.object(lpfc, "workflows_at_sha", return_value={}):
with mock.patch.object(lpfc, "detect_flips", return_value=[]):
rc = lpfc.main([])
self.assertEqual(rc, 0)
    def test_all_flips_clean_exits_zero(self):
        flips = [{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
                  "job_key": "platform-build", "job_name": "Platform (Go)",
                  "context": "CI / Platform (Go) (push)"}]
        clean = {"flip": flips[0], "checked_commits": 5, "masked_runs": [],
                 "fail_runs": [], "warnings": []}
        with mock.patch.object(lpfc, "workflows_at_sha", return_value={}):
            with mock.patch.object(lpfc, "detect_flips", return_value=flips):
                with mock.patch.object(lpfc, "verify_flip", return_value=clean):
                    rc = lpfc.main([])
        self.assertEqual(rc, 0)

    def test_dry_run_forces_exit_zero_even_with_bad_flip(self):
        # --dry-run never fails, even when verification finds masked runs.
        flips = [{"workflow_path": ".gitea/workflows/ci.yml", "workflow_name": "CI",
                  "job_key": "platform-build", "job_name": "Platform (Go)",
                  "context": "CI / Platform (Go) (push)"}]
        bad = {"flip": flips[0], "checked_commits": 5,
               "masked_runs": [{"sha": "abc1234567", "status": "success",
                                "target_url": "/x/y/actions/runs/1/jobs/0",
                                "samples": ["--- FAIL: TestSqlmock"]}],
               "fail_runs": [], "warnings": []}
        with mock.patch.object(lpfc, "workflows_at_sha", return_value={}):
            with mock.patch.object(lpfc, "detect_flips", return_value=flips):
                with mock.patch.object(lpfc, "verify_flip", return_value=bad):
                    rc = lpfc.main(["--dry-run"])
        self.assertEqual(rc, 0)


# --------------------------------------------------------------------------
# 6. Context-name rendering (the format Gitea Actions actually emits)
# --------------------------------------------------------------------------
class TestContextName(unittest.TestCase):
    def test_push_event(self):
        self.assertEqual(
            lpfc.context_name("CI", "Platform (Go)", "push"),
            "CI / Platform (Go) (push)",
        )

    def test_pull_request_event(self):
        self.assertEqual(
            lpfc.context_name("CI", "Platform (Go)", "pull_request"),
            "CI / Platform (Go) (pull_request)",
        )

    def test_workflow_name_falls_back_to_filename(self):
        # No top-level `name:` → falls back to filename minus extension.
        doc = {"jobs": {"foo": {"continue-on-error": True}}}
        self.assertEqual(
            lpfc.workflow_name(doc, fallback="my-workflow"),
            "my-workflow",
        )


if __name__ == "__main__":
    unittest.main()
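The `TestContextName` cases above pin down the status-context format `<workflow> / <job> (<event>)` and the filename fallback for unnamed workflows. The implementation in `lpfc` is not part of this diff, so the following is only a minimal sketch that would satisfy those three assertions; the function bodies are assumptions, not the module's actual code.

```python
def context_name(workflow_name: str, job_name: str, event: str) -> str:
    """Render a commit-status context as '<workflow> / <job> (<event>)'."""
    return f"{workflow_name} / {job_name} ({event})"


def workflow_name(doc: dict, fallback: str) -> str:
    """Use the top-level `name:` when it is a non-empty string, else the fallback."""
    name = doc.get("name")
    if isinstance(name, str) and name.strip():
        return name.strip()
    return fallback


if __name__ == "__main__":
    print(context_name("CI", "Platform (Go)", "pull_request"))
    print(workflow_name({"jobs": {"foo": {}}}, fallback="my-workflow"))
```

Keeping the rendering in one small pure function makes it easy to compare the generated string against the contexts listed in branch-protection settings.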
@@ -1,524 +0,0 @@
#!/usr/bin/env python3
# Unit tests for sop-checklist-gate.py
#
# Run: python3 .gitea/scripts/tests/test_sop_checklist_gate.py
# or: pytest .gitea/scripts/tests/test_sop_checklist_gate.py
#
# RFC#351 Step 2 of 6 — implementation MVP. Tests cover:
# - slug normalization (the 4 example variants in the script header)
# - parse_directives (ack, revoke, with/without note, mid-comment, etc.)
# - section_marker_present (empty answer rejected, filled answer ok)
# - compute_ack_state (self-ack rejected, team probe applied, revoke
# invalidates own prior ack, peer's ack survives unrevoked)
# - render_status (state + description format)
# - get_tier_mode (label-driven, default fallback)
# - load_config (default config parses cleanly with both PyYAML and
# the bundled minimal parser)
#
# All tests run WITHOUT touching the Gitea API — the team-probe
# callable is dependency-injected.
from __future__ import annotations
import os
import sys
import tempfile
import unittest
# Resolve sibling script regardless of where pytest is invoked from.
HERE = os.path.dirname(os.path.abspath(__file__))
PARENT = os.path.dirname(HERE) # .gitea/scripts
sys.path.insert(0, PARENT)
import importlib.util # noqa: E402
_spec = importlib.util.spec_from_file_location(
    "sop_checklist_gate", os.path.join(PARENT, "sop-checklist-gate.py")
)
sop = importlib.util.module_from_spec(_spec)
_spec.loader.exec_module(sop)  # type: ignore[union-attr]

# ---------------------------------------------------------------------------
# Test fixtures
# ---------------------------------------------------------------------------
CONFIG_PATH = os.path.join(PARENT, "..", "sop-checklist-config.yaml")


def _items() -> list[dict]:
    cfg = sop.load_config(CONFIG_PATH)
    return cfg["items"]


def _items_by_slug() -> dict[str, dict]:
    return {it["slug"]: it for it in _items()}


def _numeric_aliases() -> dict[int, str]:
    return {
        int(it["numeric_alias"]): it["slug"]
        for it in _items()
        if it.get("numeric_alias")
    }


def _comment(user: str, body: str) -> dict:
    return {"user": {"login": user}, "body": body}

# ---------------------------------------------------------------------------
# normalize_slug
# ---------------------------------------------------------------------------
class TestNormalizeSlug(unittest.TestCase):
    def test_kebab_already(self):
        self.assertEqual(sop.normalize_slug("comprehensive-testing"), "comprehensive-testing")

    def test_underscore_to_dash(self):
        self.assertEqual(sop.normalize_slug("comprehensive_testing"), "comprehensive-testing")

    def test_space_to_dash(self):
        self.assertEqual(sop.normalize_slug("comprehensive testing"), "comprehensive-testing")

    def test_uppercase_to_lower(self):
        self.assertEqual(sop.normalize_slug("Comprehensive-Testing"), "comprehensive-testing")

    def test_mixed_separators(self):
        self.assertEqual(sop.normalize_slug("Comprehensive_Testing"), "comprehensive-testing")
        self.assertEqual(sop.normalize_slug("FIVE_axis review"), "five-axis-review")

    def test_collapse_repeated_dashes(self):
        self.assertEqual(sop.normalize_slug("comprehensive--testing"), "comprehensive-testing")
        self.assertEqual(sop.normalize_slug("comprehensive testing"), "comprehensive-testing")

    def test_strip_trailing_punctuation(self):
        self.assertEqual(sop.normalize_slug("comprehensive-testing."), "comprehensive-testing")
        self.assertEqual(sop.normalize_slug("comprehensive-testing!"), "comprehensive-testing")

    def test_numeric_shorthand_known(self):
        self.assertEqual(
            sop.normalize_slug("1", _numeric_aliases()),
            "comprehensive-testing",
        )
        self.assertEqual(
            sop.normalize_slug("3", _numeric_aliases()),
            "staging-smoke",
        )
        self.assertEqual(
            sop.normalize_slug("7", _numeric_aliases()),
            "memory-consulted",
        )

    def test_numeric_shorthand_unknown_returns_empty(self):
        # "8" is out of range → empty so caller can flag as unparseable.
        self.assertEqual(sop.normalize_slug("8", _numeric_aliases()), "")

    def test_numeric_without_alias_table_keeps_digits(self):
        # No alias table → return the digits as-is.
        self.assertEqual(sop.normalize_slug("1"), "1")

    def test_empty_input(self):
        self.assertEqual(sop.normalize_slug(""), "")
        self.assertEqual(sop.normalize_slug(" "), "")
        self.assertEqual(sop.normalize_slug(None), "")

# ---------------------------------------------------------------------------
# parse_directives
# ---------------------------------------------------------------------------
class TestParseDirectives(unittest.TestCase):
    def setUp(self):
        self.aliases = _numeric_aliases()

    def test_simple_ack(self):
        d = sop.parse_directives("/sop-ack comprehensive-testing", self.aliases)
        self.assertEqual(d, [("sop-ack", "comprehensive-testing", "")])

    def test_simple_revoke(self):
        d = sop.parse_directives("/sop-revoke staging-smoke", self.aliases)
        self.assertEqual(d, [("sop-revoke", "staging-smoke", "")])

    def test_ack_with_note(self):
        d = sop.parse_directives(
            "/sop-ack comprehensive-testing LGTM the test covers all edge cases",
            self.aliases,
        )
        self.assertEqual(len(d), 1)
        self.assertEqual(d[0][0], "sop-ack")
        self.assertEqual(d[0][1], "comprehensive-testing")
        self.assertIn("LGTM", d[0][2])

    def test_numeric_shorthand(self):
        d = sop.parse_directives("/sop-ack 1", self.aliases)
        self.assertEqual(d, [("sop-ack", "comprehensive-testing", "")])

    def test_revoke_with_reason(self):
        d = sop.parse_directives(
            "/sop-revoke comprehensive-testing realized the e2e was mocking the DB",
            self.aliases,
        )
        self.assertEqual(d[0][0], "sop-revoke")
        self.assertEqual(d[0][1], "comprehensive-testing")
        self.assertIn("mocking", d[0][2])

    def test_directive_in_middle_of_comment(self):
        body = (
            "Reviewed the PR, looks good overall.\n"
            "/sop-ack comprehensive-testing\n"
            "Will follow up on the doc nit separately."
        )
        d = sop.parse_directives(body, self.aliases)
        self.assertEqual(len(d), 1)
        self.assertEqual(d[0][1], "comprehensive-testing")

    def test_multiple_directives_in_one_comment(self):
        body = (
            "/sop-ack comprehensive-testing\n"
            "/sop-ack local-postgres-e2e\n"
        )
        d = sop.parse_directives(body, self.aliases)
        self.assertEqual(len(d), 2)
        slugs = {x[1] for x in d}
        self.assertEqual(slugs, {"comprehensive-testing", "local-postgres-e2e"})

    def test_must_be_at_line_start(self):
        # A directive embedded mid-line is not honored (prevents review
        # comments like "to /sop-ack you need..." from acting as acks).
        body = "If you want to /sop-ack comprehensive-testing reply in this thread"
        d = sop.parse_directives(body, self.aliases)
        self.assertEqual(d, [])

    def test_leading_whitespace_allowed(self):
        body = " /sop-ack comprehensive-testing"
        d = sop.parse_directives(body, self.aliases)
        self.assertEqual(len(d), 1)

    def test_empty_body(self):
        self.assertEqual(sop.parse_directives("", self.aliases), [])
        self.assertEqual(sop.parse_directives(None, self.aliases), [])

    def test_normalization_applied(self):
        # /sop-ack Comprehensive_Testing → canonical comprehensive-testing
        d = sop.parse_directives("/sop-ack Comprehensive_Testing", self.aliases)
        self.assertEqual(d[0][1], "comprehensive-testing")

# ---------------------------------------------------------------------------
# section_marker_present
# ---------------------------------------------------------------------------
class TestSectionMarkerPresent(unittest.TestCase):
    def test_marker_with_inline_answer(self):
        body = "- [ ] **Comprehensive testing performed**: Added 12 new tests covering null/empty/giant inputs."
        self.assertTrue(sop.section_marker_present(body, "Comprehensive testing performed"))

    def test_marker_with_empty_answer(self):
        body = "- [ ] **Comprehensive testing performed**:"
        self.assertFalse(sop.section_marker_present(body, "Comprehensive testing performed"))

    def test_marker_with_only_whitespace_answer(self):
        body = "- [ ] **Comprehensive testing performed**: \n"
        self.assertFalse(sop.section_marker_present(body, "Comprehensive testing performed"))

    def test_marker_with_next_line_answer(self):
        body = (
            "- [ ] **Comprehensive testing performed**:\n"
            " Yes — see attached log + 12 new unit tests in foo_test.py.\n"
        )
        self.assertTrue(sop.section_marker_present(body, "Comprehensive testing performed"))

    def test_marker_missing(self):
        body = "- [ ] **Local-postgres E2E run**: N/A — pure-frontend\n"
        self.assertFalse(sop.section_marker_present(body, "Comprehensive testing performed"))

    def test_case_insensitive_marker_match(self):
        body = "- [ ] **comprehensive TESTING performed**: yes"
        self.assertTrue(sop.section_marker_present(body, "Comprehensive testing performed"))

    def test_empty_body(self):
        self.assertFalse(sop.section_marker_present("", "X"))
        self.assertFalse(sop.section_marker_present(None, "X"))

# ---------------------------------------------------------------------------
# compute_ack_state
# ---------------------------------------------------------------------------
class TestComputeAckState(unittest.TestCase):
    def setUp(self):
        self.items = _items_by_slug()
        self.aliases = _numeric_aliases()

    @staticmethod
    def _approve_all(slug, users):
        return list(users)

    @staticmethod
    def _approve_none(slug, users):
        return []

    def _approve_only(self, allowed_users):
        return lambda slug, users: [u for u in users if u in allowed_users]

    def test_peer_ack_passes(self):
        comments = [_comment("bob", "/sop-ack comprehensive-testing")]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_all
        )
        self.assertEqual(state["comprehensive-testing"]["ackers"], ["bob"])

    def test_self_ack_rejected(self):
        comments = [_comment("alice", "/sop-ack comprehensive-testing")]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_all
        )
        self.assertEqual(state["comprehensive-testing"]["ackers"], [])
        self.assertEqual(state["comprehensive-testing"]["rejected"]["self_ack"], ["alice"])

    def test_not_in_team_rejected(self):
        comments = [_comment("eve", "/sop-ack comprehensive-testing")]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_none
        )
        self.assertEqual(state["comprehensive-testing"]["ackers"], [])
        self.assertEqual(state["comprehensive-testing"]["rejected"]["not_in_team"], ["eve"])

    def test_revoke_invalidates_own_prior_ack(self):
        # Bob acks then later revokes — Bob no longer counts.
        comments = [
            _comment("bob", "/sop-ack comprehensive-testing"),
            _comment("bob", "/sop-revoke comprehensive-testing realized e2e was mocked"),
        ]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_all
        )
        self.assertEqual(state["comprehensive-testing"]["ackers"], [])

    def test_revoke_does_not_affect_others_acks(self):
        # Bob revokes his own ack; Carol's still counts.
        comments = [
            _comment("bob", "/sop-ack comprehensive-testing"),
            _comment("carol", "/sop-ack comprehensive-testing"),
            _comment("bob", "/sop-revoke comprehensive-testing"),
        ]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_all
        )
        self.assertEqual(state["comprehensive-testing"]["ackers"], ["carol"])

    def test_ack_after_revoke_restored(self):
        # Bob revokes then re-acks (e.g. after re-reviewing).
        comments = [
            _comment("bob", "/sop-ack comprehensive-testing"),
            _comment("bob", "/sop-revoke comprehensive-testing"),
            _comment("bob", "/sop-ack comprehensive-testing"),
        ]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_all
        )
        self.assertEqual(state["comprehensive-testing"]["ackers"], ["bob"])

    def test_numeric_shorthand_ack(self):
        # /sop-ack 1 → comprehensive-testing
        comments = [_comment("bob", "/sop-ack 1")]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_all
        )
        self.assertEqual(state["comprehensive-testing"]["ackers"], ["bob"])

    def test_ack_for_unknown_slug_ignored(self):
        # Some other slug not in config — silently drop (doesn't crash).
        comments = [_comment("bob", "/sop-ack does-not-exist")]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_all
        )
        for slug in self.items:
            self.assertEqual(state[slug]["ackers"], [])

    def test_multi_item_multi_user(self):
        comments = [
            _comment("bob", "/sop-ack comprehensive-testing\n/sop-ack staging-smoke"),
            _comment("carol", "/sop-ack five-axis-review"),
        ]
        state = sop.compute_ack_state(
            comments, "alice", self.items, self.aliases, self._approve_all
        )
        self.assertEqual(state["comprehensive-testing"]["ackers"], ["bob"])
        self.assertEqual(state["staging-smoke"]["ackers"], ["bob"])
        self.assertEqual(state["five-axis-review"]["ackers"], ["carol"])
        self.assertEqual(state["root-cause"]["ackers"], [])

# ---------------------------------------------------------------------------
# render_status
# ---------------------------------------------------------------------------
class TestRenderStatus(unittest.TestCase):
    def setUp(self):
        self.items = _items()
        self.items_by_slug = _items_by_slug()

    def _state_with(self, acked: list[str]) -> dict:
        return {
            it["slug"]: {
                "ackers": ["peer"] if it["slug"] in acked else [],
                "rejected": {"self_ack": [], "not_in_team": []},
            }
            for it in self.items
        }

    def test_all_acked_returns_success(self):
        all_slugs = [it["slug"] for it in self.items]
        state, desc = sop.render_status(
            self.items, self._state_with(all_slugs), {s: True for s in all_slugs}
        )
        self.assertEqual(state, "success")
        self.assertIn("7/7", desc)

    def test_partial_acked_returns_failure(self):
        state, desc = sop.render_status(
            self.items,
            self._state_with(["comprehensive-testing", "staging-smoke"]),
            {it["slug"]: True for it in self.items},
        )
        self.assertEqual(state, "failure")
        self.assertIn("2/7", desc)
        self.assertIn("missing", desc)

    def test_description_truncates_long_missing_list(self):
        # Only ack one — 6 missing should be summarized as "+N".
        state, desc = sop.render_status(
            self.items,
            self._state_with(["comprehensive-testing"]),
            {it["slug"]: True for it in self.items},
        )
        # Length budget: under 140 chars.
        self.assertLessEqual(len(desc), 140)
        self.assertIn("+", desc)  # +N elision marker

    def test_body_unfilled_surfaced(self):
        all_slugs = [it["slug"] for it in self.items]
        state, desc = sop.render_status(
            self.items,
            self._state_with(all_slugs),
            {it["slug"]: False for it in self.items},
        )
        self.assertIn("body-unfilled", desc)

# ---------------------------------------------------------------------------
# get_tier_mode
# ---------------------------------------------------------------------------
class TestGetTierMode(unittest.TestCase):
    def setUp(self):
        self.cfg = sop.load_config(CONFIG_PATH)

    def test_tier_high_is_hard(self):
        pr = {"labels": [{"name": "tier:high"}, {"name": "area:ci"}]}
        self.assertEqual(sop.get_tier_mode(pr, self.cfg), "hard")

    def test_tier_medium_is_hard(self):
        pr = {"labels": [{"name": "tier:medium"}]}
        self.assertEqual(sop.get_tier_mode(pr, self.cfg), "hard")

    def test_tier_low_is_soft(self):
        pr = {"labels": [{"name": "tier:low"}]}
        self.assertEqual(sop.get_tier_mode(pr, self.cfg), "soft")

    def test_no_tier_label_defaults_to_hard(self):
        # Per feedback_fix_root_not_symptom — never silently lower the bar.
        pr = {"labels": [{"name": "area:ci"}]}
        self.assertEqual(sop.get_tier_mode(pr, self.cfg), "hard")

    def test_no_labels_defaults_to_hard(self):
        self.assertEqual(sop.get_tier_mode({"labels": []}, self.cfg), "hard")
        self.assertEqual(sop.get_tier_mode({}, self.cfg), "hard")

# ---------------------------------------------------------------------------
# load_config
# ---------------------------------------------------------------------------
class TestLoadConfig(unittest.TestCase):
    def test_default_config_parses(self):
        cfg = sop.load_config(CONFIG_PATH)
        self.assertIn("items", cfg)
        self.assertEqual(len(cfg["items"]), 7)
        slugs = {it["slug"] for it in cfg["items"]}
        self.assertEqual(
            slugs,
            {
                "comprehensive-testing",
                "local-postgres-e2e",
                "staging-smoke",
                "root-cause",
                "five-axis-review",
                "no-backwards-compat",
                "memory-consulted",
            },
        )

    def test_default_config_tier_mode_shape(self):
        cfg = sop.load_config(CONFIG_PATH)
        self.assertEqual(cfg["tier_failure_mode"]["tier:high"], "hard")
        self.assertEqual(cfg["tier_failure_mode"]["tier:medium"], "hard")
        self.assertEqual(cfg["tier_failure_mode"]["tier:low"], "soft")
        self.assertEqual(cfg["default_mode"], "hard")

    def test_each_item_has_required_fields(self):
        cfg = sop.load_config(CONFIG_PATH)
        for it in cfg["items"]:
            self.assertIn("slug", it)
            self.assertIn("numeric_alias", it)
            self.assertIn("pr_section_marker", it)
            self.assertIn("required_teams", it)
            self.assertIsInstance(it["required_teams"], list)
            self.assertGreater(len(it["required_teams"]), 0)

# ---------------------------------------------------------------------------
# Edge case: full integration without team probe (dependency-injected)
# ---------------------------------------------------------------------------
class TestEndToEndAckFlow(unittest.TestCase):
    """All-7-items happy path with synthetic comments. Verifies the
    full pipeline minus the Gitea API."""

    def test_all_seven_acked_by_proper_teams(self):
        items = _items_by_slug()
        aliases = _numeric_aliases()
        comments = [
            _comment("qa-bot", "/sop-ack comprehensive-testing"),
            _comment("eng-bot", "/sop-ack local-postgres-e2e"),
            _comment("eng-bot", "/sop-ack staging-smoke"),
            _comment("mgr-bot", "/sop-ack root-cause"),
            _comment("eng-bot", "/sop-ack five-axis-review"),
            _comment("mgr-bot", "/sop-ack no-backwards-compat"),
            _comment("eng-bot", "/sop-ack memory-consulted"),
        ]

        def probe(slug, users):
            # Pretend every user is in every team.
            return list(users)

        state = sop.compute_ack_state(comments, "alice-author", items, aliases, probe)
        body = {it["slug"]: True for it in items.values()}
        items_list = list(items.values())
        result_state, desc = sop.render_status(items_list, state, body)
        self.assertEqual(result_state, "success")
        self.assertIn("7/7", desc)


if __name__ == "__main__":
    unittest.main(verbosity=2)
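The `TestNormalizeSlug` and `TestParseDirectives` cases above describe the parsing rules fairly completely: separators collapse to single dashes, trailing punctuation is dropped, digits resolve through the alias table, and a directive only counts when it starts a line. `sop-checklist-gate.py` itself is not shown in this diff, so what follows is a hedged sketch of one way to meet those expectations; the regular expressions and the choice to skip directives whose slug normalizes to an empty string are assumptions.

```python
import re

# Runs of whitespace, underscores, or dashes collapse into one dash.
_SEPARATORS = re.compile(r"[\s_-]+")
# A directive must start its line (leading spaces or tabs allowed).
_DIRECTIVE = re.compile(
    r"^[ \t]*/(sop-ack|sop-revoke)[ \t]+(\S+)(?:[ \t]+(.*))?$",
    re.MULTILINE,
)


def normalize_slug(raw, numeric_aliases=None):
    """Return the canonical kebab-case slug, or "" when nothing usable remains."""
    if not raw or not raw.strip():
        return ""
    slug = _SEPARATORS.sub("-", raw.strip().lower())
    slug = re.sub(r"[^a-z0-9-]", "", slug).strip("-")
    if slug.isdigit():
        if numeric_aliases is None:
            return slug  # no alias table: keep the digits as-is
        return numeric_aliases.get(int(slug), "")
    return slug


def parse_directives(body, numeric_aliases=None):
    """Return (kind, slug, note) tuples for each line-leading directive."""
    found = []
    for match in _DIRECTIVE.finditer(body or ""):
        slug = normalize_slug(match.group(2), numeric_aliases)
        if slug:  # assumption: unparseable slugs are dropped here
            found.append((match.group(1), slug, (match.group(3) or "").strip()))
    return found


if __name__ == "__main__":
    aliases = {1: "comprehensive-testing"}
    print(parse_directives("Looks good.\n/sop-ack 1 LGTM\n", aliases))
```

Anchoring the pattern with `^` under `re.MULTILINE` is what keeps a sentence such as "to /sop-ack you need..." from being counted as an ack.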
-109
@@ -1,109 +0,0 @@
# SOP-Checklist gate — per-item required reviewer teams.
#
# RFC#351 v1 starter set. Each item lists:
# slug — canonical kebab-case form used in /sop-ack <slug>
# pr_section_marker — substring matched in the PR body to detect that
# the author filled in this item (case-insensitive)
# required_teams — list of Gitea team names; an ack from ANY one of
# these teams (logical OR) satisfies the item.
# Membership is probed at gate-time via
# GET /api/v1/teams/{id}/members/{login}.
# Team-id resolution happens at script start via
# GET /api/v1/orgs/{org}/teams (cheap, one call).
# numeric_alias — 1..7; lets reviewers type `/sop-ack 3` as a
# shortcut for `/sop-ack staging-smoke`.
#
# WHY THESE TEAM MAPPINGS:
# The RFC table referenced persona-role names like `core-qa`,
# `core-be`, `core-devops` — these are individual Gitea user logins,
# not teams. The Gitea team-membership API is /teams/{id}/members/{u},
# so we need actual teams. Orchestrator preflight 2026-05-12 verified
# only these teams exist on molecule-ai: ceo(5), engineers(2),
# managers(6), qa(20), security(21), Owners(1), and bot teams. We
# map the RFC roles to the closest existing team and surface the
# mapping explicitly so it's reviewable.
#
# HOW TO EDIT:
# - Tightening: replace `engineers` with a smaller team after creating
# it (e.g. a new `senior-engineers` team if needed).
# - Loosening: add another team to required_teams (OR semantics).
# - Add an item: append to items list and document the slug below.
#
# AUTHOR SELF-ACK IS FORBIDDEN regardless of which team contains them
# — the gate script enforces commenter != PR author before checking
# team membership.
version: 1
# Tier-aware failure mode (RFC#351 open question 2):
# For tier:high — hard-fail (status `failure`, blocks merge via BP).
# For tier:medium — hard-fail (same as high; medium is non-trivial).
# For tier:low — soft-fail (status `pending` with `acked: N/M` in the
# description). BP can choose to require the context
# or not for low-tier PRs.
# If no tier label is present, default to medium (hard-fail) — every PR
# should have a tier label per sop-tier-check, and absence indicates
# a missing-tier defect we should surface, not silently lower the bar.
tier_failure_mode:
  "tier:high": hard
  "tier:medium": hard
  "tier:low": soft
default_mode: hard  # used when no tier:* label is present
items:
  - slug: comprehensive-testing
    numeric_alias: 1
    pr_section_marker: "Comprehensive testing performed"
    required_teams: [qa, engineers]
    description: >-
      What was tested, how, edge cases covered. Ack from any qa-team
      member (or engineers fallback while qa is small).
  - slug: local-postgres-e2e
    numeric_alias: 2
    pr_section_marker: "Local-postgres E2E run"
    required_teams: [engineers]
    description: >-
      Link to local CI artifact, or "N/A: pure-frontend change". Ack
      from any engineer who can verify the local DB test actually ran.
  - slug: staging-smoke
    numeric_alias: 3
    pr_section_marker: "Staging-smoke verified or pending"
    required_teams: [engineers]
    description: >-
      Link to canary run, or "scheduled post-merge". Ack from any
      engineer (core-devops/infra-sre are members of engineers team).
  - slug: root-cause
    numeric_alias: 4
    pr_section_marker: "Root-cause not symptom"
    required_teams: [managers, ceo]
    description: >-
      One-sentence root-cause statement. Ack from managers tier
      (team-leads) or ceo. Senior judgment required to attest
      root-cause-versus-symptom.
  - slug: five-axis-review
    numeric_alias: 5
    pr_section_marker: "Five-Axis review walked"
    required_teams: [engineers]
    description: >-
      Correctness / readability / architecture / security / performance.
      Ack from any non-author engineer.
  - slug: no-backwards-compat
    numeric_alias: 6
    pr_section_marker: "No backwards-compat shim / dead code added"
    required_teams: [managers, ceo]
    description: >-
      Yes/no + justification if no. Senior ack required because
      backward-compat shims are how dead-code accretes.
  - slug: memory-consulted
    numeric_alias: 7
    pr_section_marker: "Memory/saved-feedback consulted"
    required_teams: [engineers]
    description: >-
      List of feedback memories applicable to this change. Ack from
      any engineer who has the same memory access.
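The config comments above state three rules: an ack from any one of an item's `required_teams` satisfies it, the PR author can never ack their own item, and the author check runs before team membership is probed. A minimal sketch of that evaluation order follows. It is not the gate script's code: `resolve_acks`, `make_probe`, the flattened `(login, kind, slug)` comment tuples, and the in-memory `team_members` mapping (standing in for the Gitea membership API call) are all hypothetical names chosen for illustration.

```python
def make_probe(items_by_slug, team_members):
    """Build a probe that accepts a login if it belongs to ANY required team."""
    def probe(slug, logins):
        teams = items_by_slug[slug]["required_teams"]
        return [
            login for login in logins
            if any(login in team_members.get(team, set()) for team in teams)
        ]
    return probe


def resolve_acks(directives, author, items_by_slug, probe):
    """directives: chronological (login, kind, slug) tuples.

    The latest directive per (slug, login) wins, so a revoke cancels that
    user's earlier ack and a later ack restores it.
    """
    latest = {slug: {} for slug in items_by_slug}
    for login, kind, slug in directives:
        if slug in latest:  # unknown slugs are ignored
            latest[slug][login] = kind

    state = {}
    for slug, per_user in latest.items():
        candidates = [u for u, kind in per_user.items() if kind == "sop-ack"]
        # Author check first, team membership second.
        peers = [u for u in candidates if u != author]
        approved = probe(slug, peers)
        state[slug] = {
            "ackers": [u for u in peers if u in approved],
            "rejected": {
                "self_ack": [u for u in candidates if u == author],
                "not_in_team": [u for u in peers if u not in approved],
            },
        }
    return state


if __name__ == "__main__":
    items = {"root-cause": {"required_teams": ["managers", "ceo"]}}
    probe = make_probe(items, {"managers": {"mgr"}})
    print(resolve_acks([("mgr", "sop-ack", "root-cause")], "alice", items, probe))
```

Passing the probe in as a callable mirrors the dependency injection the tests rely on, so the same logic can run against a stub in unit tests and against the real membership lookup in CI.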
@@ -37,7 +37,6 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -48,7 +48,6 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
@@ -45,7 +45,6 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 5
steps:
+13 -29
@@ -126,7 +126,7 @@ jobs:
name: Platform (Go)
needs: changes
runs-on: ubuntu-latest
# mc#774 (interim): re-mask platform-build pending fix-forward. Phase 4
# mc#664 (interim): re-mask platform-build pending fix-forward. Phase 4
# (#656) flipped this to continue-on-error: false based on a Phase-3-masked
# "green on main 2026-05-12" — the prior continue-on-error: true had
# been hiding failing tests in workspace-server/internal/handlers/.
@@ -145,11 +145,10 @@ jobs:
# Time-boxed Option A (90 min) did not fit the cross-cutting scope.
# This is a sequenced revert→fix→reflip per
# feedback_strict_root_only_after_class_a emergency clause — NOT
# a permanent re-mask. Re-flip blocked on mc#774 fix-forward landing.
# a permanent re-mask. Re-flip blocked on mc#664 fix-forward landing.
# Other 4 #656 flips (changes, canvas-build, shellcheck, python-lint)
# retain continue-on-error: false; only platform-build regresses.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true # mc#774 fix-forward in flight; re-flip when mc#774 lands (PR #669 → rebase after #709)
continue-on-error: true # mc#664 fix-forward in flight; re-flip when tests pass
defaults:
run:
working-directory: workspace-server
@@ -169,10 +168,10 @@ jobs:
run: go build ./cmd/server
# CLI (molecli) moved to standalone repo: git.moleculesai.app/molecule-ai/molecule-cli
- if: needs.changes.outputs.platform == 'true'
run: go vet ./...
run: go vet ./... || true
- if: needs.changes.outputs.platform == 'true'
name: Run golangci-lint
run: golangci-lint run --timeout 3m ./...
run: golangci-lint run --timeout 3m ./... || true
- if: needs.changes.outputs.platform == 'true'
name: Diagnostic — per-package verbose 60s
run: |
@@ -187,7 +186,6 @@ jobs:
echo "::group::pendinguploads exit=$pu_exit (last 100 lines)"
tail -100 /tmp/test-pu.log
echo "::endgroup::"
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
- if: needs.changes.outputs.platform == 'true'
name: Run tests with race detection and coverage
@@ -374,7 +372,6 @@ jobs:
canvas-deploy-reminder:
name: Canvas Deploy Reminder
runs-on: ubuntu-latest
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
needs: [changes, canvas-build]
# Only fires on direct pushes to main (i.e. after staging→main promotion).
@@ -538,16 +535,12 @@ jobs:
# explicitly excludes `github.event_name`-gated jobs from F1 (see
# `.gitea/scripts/ci-required-drift.py::ci_job_names`).
#
# Phase 3 (RFC #219 §1) safety: underlying build jobs carry
# continue-on-error: true so their failures are masked to null (2026-05-12: re-enabled mc#774 interim)
# (Gitea suppresses status reporting for CoE jobs). This sentinel
# runs with continue-on-error: false so it always reports its
# result to the API — without this, the required-status entry
# (CI / all-required (pull_request)) is never created, which
# blocks PR merges. When Phase 3 ends, flip underlying jobs to
# continue-on-error: false; this sentinel can then be flipped to
# continue-on-error: true if a Phase-4 regression requires it.
continue-on-error: false
# Phase 3 (RFC #219 §1) safety: continue-on-error here so the sentinel
# does not hard-fail and block PRs while the underlying build jobs are
# still in Phase 3 (continue-on-error: true suppresses their status to null).
# When Phase 3 ends (defects fixed, continue-on-error flipped off on build
# jobs), remove continue-on-error here so the sentinel again hard-fails.
continue-on-error: true
runs-on: ubuntu-latest
timeout-minutes: 1
needs:
@@ -571,26 +564,17 @@ jobs:
echo "$results" | python3 -c '
import json, sys
ns = json.load(sys.stdin)
# Phase 3 masked: jobs with continue-on-error: true may report "failure"
# Remove when mc#774 handler test failures are resolved.
PHASE3_MASKED = {"platform-build"}
# Exclude null (Phase 3 suppressed / in-flight) from the bad list.
bad = [(k, v.get("result")) for k, v in ns.items()
if v.get("result") not in ("success", None, "cancelled", "skipped") and k not in PHASE3_MASKED]
if v.get("result") not in ("success", None)]
if bad:
print(f"FAIL: jobs not green:", file=sys.stderr)
for k, r in bad:
print(f" - {k}: {r}", file=sys.stderr)
sys.exit(1)
pending = [(k, v.get("result")) for k, v in ns.items()
if v.get("result") is None]
cancelled = [(k, v.get("result")) for k, v in ns.items()
if v.get("result") == "cancelled"]
pending = [(k, v.get("result")) for k, v in ns.items() if v.get("result") is None]
if pending:
print(f"WARN: {len(pending)} job(s) still in-flight (result=null): " +
", ".join(k for k, _ in pending), file=sys.stderr)
if cancelled:
print(f"INFO: {len(cancelled)} job(s) masked by continue-on-error: " +
", ".join(k for k, _ in cancelled), file=sys.stderr)
print(f"OK: all {len(ns)} required jobs succeeded (or Phase-3 suppressed)")
'
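The inline script in this hunk decides whether the `all-required` sentinel passes by inspecting the `result` field of each entry in the `needs` JSON. The hunk contains two variants of the filter, one accepting only `success` and `null`, the other also accepting `cancelled` and `skipped` and exempting a named job set. As a sketch only, the same check can be written as a standalone function with the accepted results and exempt jobs passed in; `evaluate_needs` and its parameters are hypothetical names, not part of the workflow.

```python
import json


def evaluate_needs(needs, ok_results=("success", None), exempt=frozenset()):
    """Split job results into failures and still-pending jobs.

    needs: mapping of job key -> {"result": ...}, as read from the needs JSON.
    ok_results: results that do not count as failures (None means no result yet).
    exempt: job keys that are never reported as failures.
    """
    bad = [
        (job, info.get("result"))
        for job, info in needs.items()
        if info.get("result") not in ok_results and job not in exempt
    ]
    pending = [job for job, info in needs.items() if info.get("result") is None]
    return bad, pending


if __name__ == "__main__":
    sample = json.loads(
        '{"changes": {"result": "success"},'
        ' "platform-build": {"result": null},'
        ' "canvas-build": {"result": "failure"}}'
    )
    bad, pending = evaluate_needs(sample)
    for job, result in bad:
        print(f" - {job}: {result}")
    print("pending:", ", ".join(pending))
```

Making the accepted set explicit shows the trade-off in the hunk: every value added to `ok_results` or `exempt` is one more way a job can stop short of `success` without failing the sentinel.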
@@ -90,7 +90,6 @@ jobs:
name: Synthetic E2E against staging
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# Bumped from 12 → 20 (2026-05-04). Tenant user-data install phase
# (apt-get update + install docker.io/jq/awscli/caddy + snap install
+2 -17
@@ -103,7 +103,6 @@ jobs:
detect-changes:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
api: ${{ steps.decide.outputs.api }}
@@ -155,7 +154,6 @@ jobs:
name: E2E API Smoke Test
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 15
env:
@@ -166,6 +164,7 @@ jobs:
# we let Docker assign an ephemeral host port.
PG_CONTAINER: pg-e2e-api-${{ github.run_id }}-${{ github.run_attempt }}
REDIS_CONTAINER: redis-e2e-api-${{ github.run_id }}-${{ github.run_attempt }}
PORT: "8080"
steps:
- name: No-op pass (paths filter excluded this commit)
if: needs.detect-changes.outputs.api != 'true'
@@ -269,20 +268,6 @@ jobs:
if: needs.detect-changes.outputs.api == 'true'
working-directory: workspace-server
run: go build -o platform-server ./cmd/server
- name: Pick platform port
if: needs.detect-changes.outputs.api == 'true'
run: |
PLATFORM_PORT=$(python3 - <<'PY'
import socket
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.bind(("127.0.0.1", 0))
print(s.getsockname()[1])
PY
)
echo "PORT=${PLATFORM_PORT}" >> "$GITHUB_ENV"
echo "BASE=http://127.0.0.1:${PLATFORM_PORT}" >> "$GITHUB_ENV"
echo "Platform host port: ${PLATFORM_PORT}"
- name: Start platform (background)
if: needs.detect-changes.outputs.api == 'true'
working-directory: workspace-server
@@ -295,7 +280,7 @@ jobs:
if: needs.detect-changes.outputs.api == 'true'
run: |
for i in $(seq 1 30); do
if curl -sf "$BASE/health" > /dev/null; then
if curl -sf http://127.0.0.1:8080/health > /dev/null; then
echo "Platform up after ${i}s"
exit 0
fi
-2
@@ -70,7 +70,6 @@ jobs:
detect-changes:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
canvas: ${{ steps.decide.outputs.canvas }}
@@ -119,7 +118,6 @@ jobs:
name: Canvas tabs E2E
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 40
@@ -84,7 +84,6 @@ jobs:
name: E2E Staging External Runtime
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 25
-4
@@ -88,20 +88,17 @@ jobs:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
fetch-depth: 1
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: "3.11"
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
- name: YAML validation (best-effort)
run: |
echo "e2e-staging-saas.yml — PR validation: workflow YAML is valid."
echo "E2E step runs only when provisioning-critical files change."
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# Actual E2E: runs on trunk pushes (main + staging). NOT the PR-fire-only
@@ -112,7 +109,6 @@ jobs:
# Only runs on trunk pushes. PR paths get pr-validate instead.
if: github.event.pull_request.base.ref == ''
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 45
permissions:
-1
@@ -37,7 +37,6 @@ jobs:
name: Intentional-failure teardown sanity
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 20
+13 -29
@@ -32,21 +32,12 @@ on:
# iterating all open PRs when PR_NUMBER is empty.
workflow_dispatch:
permissions:
# read: contents — for checkout (base ref, not PR head for security)
# read: pull-requests — for reading PR info via API
# write: pull-requests — for posting/updating gate-check comments
# Without this the token cannot POST/PATCH /issues/comments → 403.
contents: read
pull-requests: write
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
gate-check:
runs-on: ubuntu-latest
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true # Never block on our own detector failing
steps:
- name: Check out BASE ref (never PR-head under pull_request_target)
@@ -77,32 +68,25 @@ jobs:
if: github.event_name == 'schedule'
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
REPO: ${{ github.repository }}
run: |
set -euo pipefail
# Fetch all open PRs and run gate-check on each
# socket.setdefaulttimeout(15): defence-in-depth for missing SOP_TIER_CHECK_TOKEN.
# gate_check.py uses timeout=15 on every urlopen call; this catches the
# inline Python polling loop too (issue #603).
pr_numbers=$(python3 <<'PY'
import json
import os
import socket
import urllib.request
socket.setdefaulttimeout(15)
token = os.environ["GITEA_TOKEN"]
repo = os.environ["REPO"]
req = urllib.request.Request(
f"https://git.moleculesai.app/api/v1/repos/{repo}/pulls?state=open&limit=100",
headers={"Authorization": f"token {token}", "Accept": "application/json"},
)
with urllib.request.urlopen(req) as r:
prs = json.loads(r.read())
for pr in prs:
print(pr["number"])
PY
)
pr_numbers=$(python3 -c "
import socket, urllib.request, json, os
socket.setdefaulttimeout(15)
token = os.environ['GITEA_TOKEN']
req = urllib.request.Request(
'https://git.moleculesai.app/api/v1/repos/${{ github.repository }}/pulls?state=open&limit=100',
headers={'Authorization': f'token {token}', 'Accept': 'application/json'}
)
with urllib.request.urlopen(req) as r:
prs = json.loads(r.read())
for pr in prs:
print(pr['number'])
")
for pr in $pr_numbers; do
echo "Checking PR #$pr..."
python3 tools/gate-check-v3/gate_check.py \
@@ -78,8 +78,7 @@ jobs:
detect-changes:
name: detect-changes
runs-on: ubuntu-latest
# mc#774 Phase 3 (RFC §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
outputs:
handlers: ${{ steps.filter.outputs.handlers }}
@@ -119,8 +118,7 @@ jobs:
name: Handlers Postgres Integration
needs: detect-changes
runs-on: ubuntu-latest
# mc#774 Phase 3 (RFC §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
continue-on-error: true
env:
# Unique name per run so concurrent jobs don't collide on the
-2
@@ -63,7 +63,6 @@ jobs:
detect-changes:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
run: ${{ steps.decide.outputs.run }}
@@ -155,7 +154,6 @@ jobs:
name: Harness Replays
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 30
steps:
@@ -1,120 +0,0 @@
name: lint-bp-context-emit-match
# Tier 2f scheduled lint (per mc#774) — detects drift between
# `branch_protections/<branch>.status_check_contexts` and the set of
# contexts emitted by `.gitea/workflows/*.yml`.
#
# Rule
# ----
# For each protected branch context (Source A — BP), there must exist
# at least one emitting workflow + job pair (Source B — workflow YAML
# + on:-event mapping) whose runtime status-name maps to it. The
# inverse direction (emitter without BP context) is informational
# only — Tier 2g handles that at PR-time.
#
# Why this exists
# ---------------
# A BP-required context with no emitter blocks merges forever — Gitea
# 1.22.6 treats absent-as-`pending`, NOT absent-as-`skipped`. The
# phantom-required-check class previously surfaced as
# `feedback_phantom_required_check_after_gitea_migration` (a port
# kept the GitHub context name after rename to Gitea, but no
# workflow emitted under the new name).
#
# This lint catches the same class structurally + a forward case:
# workflow renamed/deleted while still in BP.
#
# Scope
# -----
# Scheduled daily. We DON'T run on `pull_request` because (a) the
# emitter side moves with PR diffs (transitional state false-flags)
# and (b) Tier 2g handles emitter-side drift at PR-time.
#
# Cross-repo
# ----------
# Today this runs only on molecule-core/main. Per internal#349
# (cross-repo BP sweep) Class-D repos will get the same lint after
# their BP rollouts.
#
# Auth
# ----
# `GET /repos/.../branch_protections/{branch}` requires repo-admin
# role on Gitea 1.22.6. We use DRIFT_BOT_TOKEN (same persona as
# ci-required-drift.yml — `internal#329` provisioning trail).
# Graceful-degrade per Tier 2a contract: 403/404 → exit 0 with
# ::error::.
#
# Idempotency
# -----------
# The drift issue is filed with title prefix
# `[ci-bp-drift] {repo}/{branch}: BP→emitter mismatch`. The script
# searches OPEN issues for an exact title-prefix match and PATCHes
# the existing issue (if any) instead of POSTing a duplicate.
# Mirrors `ci-required-drift.py`'s contract.
#
# Phase contract (RFC internal#219 §1 ladder)
# -------------------------------------------
# Lands at `continue-on-error: true` (Phase 3). After 7 days of clean
# scheduled runs on `main`, flip to `false` so a scheduled failure
# becomes a hard CI signal.
#
# Cross-links
# -----------
# - mc#774 (the RFC that specs this lint)
# - internal#349 (cross-repo BP sweep)
# - feedback_phantom_required_check_after_gitea_migration
# - feedback_tier_label_ids_are_per_repo
# - ci-required-drift.yml (F2 detector, narrower-scope sibling)
on:
schedule:
# Daily at 03:31 UTC — off-peak, prime-staggered from other
# scheduled jobs (ci-required-drift :00 hourly, lint-coe-tracking
# 13:11). At 03:31 the CI fleet is quietest in EMEA hours.
- cron: '31 3 * * *'
workflow_dispatch:
# No `push` / `pull_request` here — Tier 2g owns PR-time drift.
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
issues: write # needed to file/edit the drift issue
concurrency:
group: lint-bp-context-emit-match-${{ github.ref }}
cancel-in-progress: true
jobs:
lint:
name: lint-bp-context-emit-match
runs-on: ubuntu-latest
timeout-minutes: 5
# Phase 3 (RFC #219 §1): surface drift without blocking. After 7
# clean scheduled runs on main, flip to false so a scheduled
# failure is a hard CI signal.
continue-on-error: true # mc#774 Phase 3 — flip to false after 7 clean main runs
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Run lint-bp-context-emit-match
env:
# DRIFT_BOT_TOKEN — repo-admin on this repo (internal#329
# provisioning trail). Required for branch_protections read.
GITEA_TOKEN: ${{ secrets.DRIFT_BOT_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
BRANCH: main
WORKFLOWS_DIR: .gitea/workflows
DRIFT_LABEL: ci-bp-drift
GITHUB_RUN_URL: https://git.moleculesai.app/${{ github.repository }}/actions/runs/${{ github.run_id }}
run: python3 .gitea/scripts/lint_bp_context_emit_match.py
- name: Run lint-bp-context-emit-match unit tests
run: |
python -m pip install --quiet pytest
python3 -m pytest tests/test_lint_bp_context_emit_match.py -v
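The BP-to-emitter rule described in the header comment above can be sketched roughly as follows. This is a minimal illustration, not the contents of `lint_bp_context_emit_match.py`: the function names `emitted_contexts` and `missing_emitters` are hypothetical, the workflows are assumed to be already parsed into dicts, and the context format `{workflow name} / {job name or key} ({event})` is taken from the status names that appear elsewhere in this diff.

```python
def emitted_contexts(workflow: dict) -> set:
    """Render '{workflow name} / {job name or key} ({event})' per job/event pair."""
    # PyYAML follows YAML 1.1, so a bare `on:` key loads as boolean True.
    # Check both spellings so the sketch works on parsed or hand-built dicts.
    on = workflow.get("on", workflow.get(True)) or {}
    events = [on] if isinstance(on, str) else list(on)
    contexts = set()
    for key, job in workflow.get("jobs", {}).items():
        display = job.get("name") or key
        for event in events:
            contexts.add(f"{workflow['name']} / {display} ({event})")
    return contexts


def missing_emitters(bp_contexts, workflows) -> list:
    """Return BP-required contexts that no workflow/job pair would emit."""
    emitted = set()
    for wf in workflows:
        emitted |= emitted_contexts(wf)
    return sorted(set(bp_contexts) - emitted)
```

Only the BP-to-emitter direction is reported here; per the comment above, the inverse direction is informational and belongs to Tier 2g.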
@@ -1,121 +0,0 @@
name: lint-continue-on-error-tracking
# Tier 2e hard-gate lint (per mc#774) — every
# `continue-on-error: true` in `.gitea/workflows/*.yml` must carry a
# `# mc#NNNN` or `# internal#NNNN` tracker comment within 2 lines,
# the referenced issue must be OPEN, and ≤14 days old.
#
# Why this exists
# ---------------
# `continue-on-error: true` on `platform-build` had been hiding
# mc#774-class regressions for ~3 weeks before #656 surfaced them on
# 2026-05-12. A 14-day cap on tracker age forces a review cycle and
# surfaces mask-drift within at most 14 days of the original defect.
# Each `continue-on-error: true` gets a paper trail — close or renew.
#
# How the gate works
# ------------------
# 1. Walk `.gitea/workflows/*.yml` via PyYAML's line-tracking loader
# (per `feedback_behavior_based_ast_gates`) and find every job
# whose `continue-on-error` evaluates truthy (`true` or string
# `"true"` — Gitea's evaluator coerces strings).
# 2. For each, scan ±2 lines of the directive's source line for a
# `# mc#NNNN` or `# internal#NNNN` comment. Inline-trailing
# comments on the directive line count.
# 3. For each tracker reference, GET the issue from the Gitea API.
# Validate: exists, `state == open`, `created_at` ≤ MAX_AGE_DAYS.
# 4. Aggregate ALL violations (not short-circuit) and exit 1 if any.
#
# Triggers
# --------
# Runs on PR events (paths-filter on `.gitea/workflows/**`) AND on
# a daily schedule. PR runs catch the violation at introduction time.
# Schedule runs catch the AGE-EXPIRY class: a tracker that was ≤14d
# old when the PR landed but is now 20d old, with the underlying
# defect still unfixed. Per `feedback_chained_defects_in_never_tested_workflows`,
# scheduled drift detection is the second half of the gate.
#
# Phase contract (RFC internal#219 §1 ladder)
# -------------------------------------------
# Lands at `continue-on-error: true` (Phase 3 — surface broken shapes
# without blocking). The pre-existing `continue-on-error: true`
# directives on `main` will all violate this lint at first
# (intentional — they're the masked defects this lint exists to
# surface). Each must be triaged: file a fresh tracker comment,
# close-and-flip, or document the deliberate keep-mask in a fresh
# 14-day-renewable tracker. After main is clean for 3 days,
# follow-up PR flips this workflow's continue-on-error to false.
# Tracking: mc#774.
#
# Cross-links
# -----------
# - mc#774 (the RFC that specs this lint)
# - mc#774 (the empirical masked-3-weeks case)
# - feedback_chained_defects_in_never_tested_workflows
# - feedback_behavior_based_ast_gates
# - feedback_strict_root_only_after_class_a
#
# Auth: DRIFT_BOT_TOKEN — same persona used by ci-required-drift.yml
# (provisioned under internal#329). Auto-injected GITHUB_TOKEN is
# insufficient because `internal#NNN` references cross repositories
# (molecule-core → molecule-ai/internal).
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint_continue_on_error_tracking.py'
- 'tests/test_lint_continue_on_error_tracking.py'
push:
branches: [main, staging]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint_continue_on_error_tracking.py'
schedule:
# Daily at 13:11 UTC — off-peak, prime-staggered from the other
# Tier-2 lint schedules (ci-required-drift runs hourly :00).
- cron: '11 13 * * *'
workflow_dispatch:
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: lint-coe-tracking-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
lint:
name: lint-continue-on-error-tracking
runs-on: ubuntu-latest
timeout-minutes: 10
# Phase 3 (RFC #219 §1): surface masked defects without blocking
# PRs. Pre-existing continue-on-error: true directives on main
# all violate this lint at first (intentional). A follow-up flips
# this to false after main is clean for 3 days. mc#774.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true # mc#774 Phase 3 mask — 14d forced-renewal cadence
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Run lint-continue-on-error-tracking
env:
GITEA_TOKEN: ${{ secrets.DRIFT_BOT_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
INTERNAL_REPO: molecule-ai/internal
WORKFLOWS_DIR: .gitea/workflows
MAX_AGE_DAYS: '14'
run: python3 .gitea/scripts/lint_continue_on_error_tracking.py
- name: Run lint-continue-on-error-tracking unit tests
run: |
python -m pip install --quiet pytest
python3 -m pytest tests/test_lint_continue_on_error_tracking.py -v
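Steps 2 and 3 of "How the gate works" above (the nearby-tracker scan and the age cap) might look roughly like this. It is a simplified sketch under stated assumptions: it scans raw text with regular expressions instead of the PyYAML line-tracking loader the comment describes, it does not call the Gitea API, and the names `untracked_masks` and `tracker_expired` are hypothetical.

```python
import re
from datetime import datetime, timedelta

# A `continue-on-error: true` directive, optionally quoted, optionally
# followed by an inline trailing comment.
COE = re.compile(r'^\s*continue-on-error:\s*["\']?true["\']?\s*(#.*)?$')
# A tracker reference such as `# mc#774` or `# internal#219` inside a comment.
TRACKER = re.compile(r'#.*?\b(mc|internal)#(\d+)')


def untracked_masks(text: str, window: int = 2) -> list:
    """Return 1-based line numbers of masks with no tracker within +/- window lines."""
    lines = text.splitlines()
    bad = []
    for i, line in enumerate(lines):
        if not COE.match(line):
            continue
        nearby = lines[max(0, i - window): i + window + 1]
        if not any(TRACKER.search(n) for n in nearby):
            bad.append(i + 1)
    return bad


def tracker_expired(created_at: datetime, now: datetime, max_age_days: int = 14) -> bool:
    """True when the tracker issue is older than the allowed age."""
    return now - created_at > timedelta(days=max_age_days)
```

A real gate would also need to confirm that each referenced issue exists and is open, which requires the API call described in step 3.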
+54 -10
@@ -30,16 +30,10 @@ name: Lint curl status-code capture
on:
pull_request:
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint-curl-status-capture.py'
- 'tests/test_lint_curl_status_capture.py'
paths: ['.gitea/workflows/**']
push:
branches: [main, staging]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint-curl-status-capture.py'
- 'tests/test_lint_curl_status_capture.py'
paths: ['.gitea/workflows/**']
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
@@ -51,10 +45,60 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Find curl ... -w '%{http_code}' ... || echo "000" subshells
run: |
python3 .gitea/scripts/lint-curl-status-capture.py
set -uo pipefail
# Multi-line aware: look for `$(curl ... -w '%{http_code}' ... || echo "000")`
# subshell where the entire command-substitution wraps a curl that
# ends with `|| echo "000"`. Must distinguish from the SAFE shape
# `$(cat tempfile 2>/dev/null || echo "000")` — `cat` with a missing
# tempfile produces empty stdout, no pollution.
python3 <<'PY'
import os, re, sys, glob
BAD_FILES = []
# Match the buggy substitution across newlines: $(curl ... -w '%{http_code}' ... || echo "000")
# A backslash followed by a newline is the bash line-continuation that lets curl flags span lines.
# We collapse continuation lines first, then look for the single-line bad pattern.
PATTERN = re.compile(
r'\$\(\s*curl\b[^)]*-w\s*[\'"]%\{http_code\}[\'"][^)]*\|\|\s*echo\s+"000"\s*\)',
re.DOTALL,
)
# Self-skip: this lint workflow contains the literal anti-pattern in
# its own docstring — that's intentional, not a bug.
SELF = ".gitea/workflows/lint-curl-status-capture.yml"
for f in sorted(glob.glob(".gitea/workflows/*.yml")):
if f == SELF:
continue
with open(f) as fh:
content = fh.read()
# Collapse bash line-continuations (backslash-newline + leading whitespace)
# into a single logical line so the regex can see the full
# curl invocation as one chunk.
flat = re.sub(r'\\\s*\n\s*', ' ', content)
for m in PATTERN.finditer(flat):
BAD_FILES.append((f, m.group(0)[:120]))
if not BAD_FILES:
print("OK No curl-status-capture pollution patterns detected")
sys.exit(0)
print(f"::error::Found {len(BAD_FILES)} curl-status-capture pollution site(s):")
for f, snippet in BAD_FILES:
print(f"::error file={f}::Curl status-capture pollution: '|| echo \"000\"' inside a $(curl ... -w '%{{http_code}}' ...) subshell. On connection failure (or a non-2xx response when --fail is set), curl writes the -w status and then exits non-zero, so the || echo appends another '000', producing 'HTTP 000000' or '409000' that fails comparisons silently. Fix: route -w into a tempfile so the exit code can't pollute stdout. See memory feedback_curl_status_capture_pollution.md.")
print(f" matched: {snippet}...")
print()
print("Fix template:")
print(' set +e')
print(' curl ... -w \'%{http_code}\' >code.txt 2>/dev/null')
print(' set -e')
print(' HTTP_CODE=$(cat code.txt 2>/dev/null)')
print(' [ -z "$HTTP_CODE" ] && HTTP_CODE="000"')
sys.exit(1)
PY
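To show what the inline scan above accepts and rejects, here is a small standalone sketch that reuses the same regular expression and the same continuation-collapsing step. The wrapper name `find_pollution` is hypothetical and exists only for illustration.

```python
import re

# Same pattern as the inline lint: a $(curl ... -w '%{http_code}' ... || echo "000")
# command substitution, matched after line-continuations are collapsed.
PATTERN = re.compile(
    r'\$\(\s*curl\b[^)]*-w\s*[\'"]%\{http_code\}[\'"][^)]*\|\|\s*echo\s+"000"\s*\)',
    re.DOTALL,
)


def find_pollution(script: str) -> list:
    """Return every matching command substitution found in a shell snippet."""
    # Join backslash-newline continuations so a multi-line curl call is one chunk.
    flat = re.sub(r'\\\s*\n\s*', ' ', script)
    return [m.group(0) for m in PATTERN.finditer(flat)]
```

A multi-line `curl` substitution ending in `|| echo "000"` is flagged, while the `$(cat tempfile 2>/dev/null || echo "000")` shape that the comment calls safe is not, because the pattern requires `curl` directly after `$(`.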
-133
@@ -1,133 +0,0 @@
name: lint-mask-pr-atomicity
# Tier 2d hard-gate lint (per mc#774) — blocks PRs that touch
# `.gitea/workflows/ci.yml` and modify ONLY ONE of {continue-on-error,
# all-required.sentinel.needs} without a `Paired: #NNN` reference in
# the PR body or in a commit message.
#
# Why this exists
# ---------------
# PR#665 (interim `continue-on-error: true` on `platform-build`) and
# PR#668 (sentinel-`needs` demotion of the same job) were designed as a
# pair but merged solo — #665 landed at 04:47Z 2026-05-12, #668 was
# still open at 05:07Z when the main-red watchdog (#674) fired. Result:
# ~20 minutes of `main` red and a cascade of false-positives on
# unrelated PRs. This lint structurally prevents that class.
#
# How the gate works
# ------------------
# 1. The workflow runs on every PR whose diff touches ci.yml (paths
# filter). It is NOT a required check on `main` because the rule is
# diff-based — running it on PRs that don't touch ci.yml would
# produce a `pending` status forever (per
# `feedback_path_filtered_workflow_cant_be_required`).
# 2. The script reads `BASE_SHA:ci.yml` and `HEAD_SHA:ci.yml`, parses
# both via PyYAML AST (per `feedback_behavior_based_ast_gates` — no
# grep, no regex on the raw text — so a YAML-shape refactor still
# detects).
# 3. Walks `jobs.*.continue-on-error` on each side; flags any value
# diff. Reads `jobs.all-required.needs` on each side; flags any
# set diff (order-insensitive — `needs:` is engine-unordered).
# 4. If both predicates fired → atomic, OK. If neither → no risk, OK.
# If exactly one fired → require `Paired: #NNN` in PR body OR in
# any commit message between base..head; else fail.
#
# Phase contract (RFC internal#219 §1 ladder)
# -------------------------------------------
# This workflow lands at `continue-on-error: true` (Phase 3 — surface
# regressions without blocking PRs while the rule beds in).
# Follow-up PR flips to `false` once we have ≥3 days of clean runs on
# `main` and no false-positives. Tracking issue: mc#774.
#
# Cross-links
# -----------
# - mc#774 (the RFC that specs this lint)
# - PR#665 / PR#668 (the empirical split-pair)
# - mc#774 (the main-red incident the split caused)
# - feedback_strict_root_only_after_class_a
# - feedback_behavior_based_ast_gates
#
# Auth: only needs the auto-injected GITHUB_TOKEN (read-only, repo
# scope). No DRIFT_BOT_TOKEN needed — Tier 2d does NOT call
# branch_protections (Tier 2g/f do).
on:
pull_request:
types: [opened, synchronize, reopened, edited]
# `edited` is included because the rule depends on PR_BODY: a user
# may add `Paired: #NNN` after first push to satisfy the lint. The
# rerun on `edited` lets the PR turn green without an empty
# commit. Gitea 1.22.6 fires `edited` on body changes — verified
# via gitea-source/models/issues/pull_list.go::triggerNewPRWebhook.
paths:
- '.gitea/workflows/ci.yml'
- '.gitea/scripts/lint_mask_pr_atomicity.py'
- '.gitea/workflows/lint-mask-pr-atomicity.yml'
- 'tests/test_lint_mask_pr_atomicity.py'
env:
# Belt-and-suspenders against the runner-default trap
# (feedback_act_runner_github_server_url). Runners are configured
# with this env via /opt/molecule/runners/config.yaml, but pinning
# at the workflow level protects against a runner regenerated
# without the config file.
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
pull-requests: read
# Per-PR concurrency — re-pushes cancel previous runs to keep the
# queue short. The lint is cheap (one git show + log + a YAML parse).
concurrency:
group: lint-mask-pr-atomicity-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
scan:
name: lint-mask-pr-atomicity
runs-on: ubuntu-latest
timeout-minutes: 5
# Phase 3 (RFC #219 §1): surface broken shapes without blocking
# PRs. Follow-up PR flips this to `false` once recent runs on main
# are confirmed clean (eat-our-own-dogfood discipline mirrors
# PR#673's same-shape comment). Tracking: mc#774.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- name: Check out PR head with full history (need base SHA blobs)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
# `git show <base-sha>:<path>` needs the base SHA's blobs.
# Shallow=1 would miss it. Same rationale as PR#673 and
# check-migration-collisions.yml.
fetch-depth: 0
- name: Set up Python (PyYAML for AST parsing)
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
# Same pin as ci-required-drift.yml + the rest of the Tier 2
# lint family — keep runner-cache hits uniform.
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Ensure base ref is reachable locally
# fetch-depth=0 usually pulls the base too, but explicit-fetch
# is cheap insurance against runner-version drift (matches the
# comment in check-migration-collisions.yml and PR#673).
run: |
git fetch origin "${{ github.event.pull_request.base.ref }}" || true
- name: Run lint-mask-pr-atomicity
env:
BASE_SHA: ${{ github.event.pull_request.base.sha }}
HEAD_SHA: ${{ github.event.pull_request.head.sha }}
# PR body — the script greps for `Paired: #NNN`.
PR_BODY: ${{ github.event.pull_request.body }}
CI_WORKFLOW_PATH: .gitea/workflows/ci.yml
SENTINEL_JOB_KEY: all-required
run: python3 .gitea/scripts/lint_mask_pr_atomicity.py
- name: Run lint-mask-pr-atomicity unit tests
# Run the test suite in-CI so the lint's own behaviour is
# verified on every change. Matches lint-workflow-yaml.yml.
run: |
python -m pip install --quiet pytest
python3 -m pytest tests/test_lint_mask_pr_atomicity.py -v
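The two-predicate rule in steps 3 and 4 of the header comment above can be sketched as below. This is an illustrative reading of the comment, not the code in `lint_mask_pr_atomicity.py`: the inputs are assumed to be the already-parsed base and head versions of `ci.yml`, and the names `coe_map`, `sentinel_needs`, and `atomicity_ok` are hypothetical.

```python
import re

PAIRED = re.compile(r'Paired:\s*#\d+')


def coe_map(ci: dict) -> dict:
    """Map each job key to whether its continue-on-error evaluates truthy."""
    return {
        key: str(job.get("continue-on-error", False)).lower() == "true"
        for key, job in ci.get("jobs", {}).items()
    }


def sentinel_needs(ci: dict, sentinel: str = "all-required") -> set:
    """Return the sentinel job's `needs` as a set (order-insensitive)."""
    needs = ci.get("jobs", {}).get(sentinel, {}).get("needs", [])
    return {needs} if isinstance(needs, str) else set(needs)


def atomicity_ok(base: dict, head: dict, pr_body: str = "", commit_messages=()) -> bool:
    coe_changed = coe_map(base) != coe_map(head)
    needs_changed = sentinel_needs(base) != sentinel_needs(head)
    # Both changed (atomic) or neither changed (no risk): pass.
    if coe_changed == needs_changed:
        return True
    # Exactly one changed: require a `Paired: #NNN` reference somewhere.
    return any(PAIRED.search(t or "") for t in (pr_body, *commit_messages))
```

Comparing `needs` as sets mirrors the comment's note that the diff is order-insensitive.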
@@ -1,141 +0,0 @@
name: Lint pre-flip continue-on-error
# Pre-merge gate: blocks PRs that flip `continue-on-error: true → false`
# on any job in `.gitea/workflows/*.yml` WITHOUT proof that the affected
# job's recent runs on the target branch (PR base) are actually green.
#
# Empirical class: PR #656 / mc#774. PR #656 (RFC internal#219 Phase 4)
# flipped 5 platform-build-class jobs `continue-on-error: true → false`
# on the basis of a "verified green on main via combined-status check".
# But that "green" was the LIE the prior `continue-on-error: true`
# produced: Gitea Quirk #10 (internal#342 + dup #287) — a failed step
# inside a `continue-on-error: true` job rolls up to a `success`
# job-level status. The precondition the PR claimed to verify was
# structurally fooled by the bug being flipped.
#
# mc#774 captured the surfaced defects (2 mutually-masked regressions):
# - Class 1: sqlmock helper drift since 2f36bb9a (24 days old)
# - Class 2: OFFSEC-001 contract collision since 7d1a189f (1 day old)
#
# Codified 04:35Z as hongming-pc2 charter §SOP-N rule (e)
# "run-log-grep-before-flip" — now structurally enforced here at PR
# time, ahead of merge.
#
# How the gate works:
# 1. Read every `.gitea/workflows/*.yml` at the PR base SHA AND at
# the PR head SHA via `git show <sha>:<path>` (no checkout
# needed).
# 2. Parse both sides via PyYAML AST (NOT grep — per
# `feedback_behavior_based_ast_gates`). Walk `jobs.<key>.
# continue-on-error` on each side. A flip is base=true,
# head=false.
# 3. For each flipped job, render the commit-status context as
# `"{workflow.name} / {job.name or job.key} (push)"` — that's
# how Gitea Actions emits the per-context status on `main`/
# `staging` runs.
# 4. Pull last 5 commits on the PR base branch, fetch combined
# commit-status per commit, scan for the target context. For
# each match, fetch the run log via the web-UI route
# `{server_url}/{repo}/actions/runs/{run_id}/jobs/{job_idx}/logs`
# (per `reference_gitea_actions_log_fetch` —
# Gitea 1.22.6 lacks REST `/actions/runs/*`; web-UI is the
# only working path, see also
# `reference_gitea_1_22_6_lacks_rest_rerun_endpoints`).
# 5. Grep each log for `--- FAIL`, `FAIL\s`, `::error::`. If
# the status is `success` but the log shows any of these,
# the job was masked. Block the PR with `::error::`.
#
# Graceful-degrade contract (per task halt-conditions):
# - Log fetch 404 (act_runner pruned the log, transient outage):
# emit `::warning::` "log unavailable" — does NOT block.
# - Zero recent runs of the flipped job's context on the base
# branch (newly added workflow): emit `::warning::` "no run
# history to verify" — allow the flip. Chicken-and-egg
# exemption.
# - YAML parse error in one of the workflow files: warn-only,
# don't block — the YAML lint workflows catch this separately.
#
# Cross-links: PR#656, mc#774, PR#665 (interim re-mask),
# Quirk #10 (internal#342 + dup #287), hongming-pc2 charter
# §SOP-N rule (e), feedback_strict_root_only_after_class_a,
# feedback_no_shared_persona_token_use.
#
# Phase contract (RFC internal#219 §1 ladder):
# - This workflow lands at `continue-on-error: true` (Phase 3 —
# surface defects without blocking). Follow-up PR flips it to
# `false` ONLY after this workflow's own recent runs on `main`
# are confirmed clean — exactly the discipline the workflow
# itself enforces. Eat your own dogfood.
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint_pre_flip_continue_on_error.py'
- '.gitea/workflows/lint-pre-flip-continue-on-error.yml'
env:
# Per `feedback_act_runner_github_server_url` — without this,
# actions/checkout and friends default to github.com → break.
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
# Need read on the API to pull combined commit-status + commit list
# for the base branch. The job-log fetch uses the same token via
# the web-UI route (Gitea 1.22.6 accepts `Authorization: token ...`
# there).
pull-requests: read
concurrency:
group: lint-pre-flip-coe-${{ github.event.pull_request.head.sha || github.sha }}
cancel-in-progress: true
jobs:
scan:
name: Verify continue-on-error flips have run-log proof
runs-on: ubuntu-latest
timeout-minutes: 8
# Phase 3 (RFC internal#219 §1): surface broken flips without blocking
# the PR yet. Follow-up flips this to `false` once the workflow itself
# has clean recent runs on main. mc#774 interim — remove when CoE→false.
continue-on-error: true # mc#774
steps:
- name: Check out PR head (full history for base-SHA access)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
# `git show <base-sha>:<path>` needs the base SHA's blobs.
# Shallow=1 would miss it. Same rationale as
# check-migration-collisions.yml.
fetch-depth: 0
- name: Set up Python (PyYAML for AST parsing)
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
# Same pin as ci-required-drift.yml — keep dependencies
# uniform so a Gitea runner cache hits across both jobs.
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Ensure base ref is reachable locally
# `actions/checkout@v6 fetch-depth=0` usually pulls the base
# too, but explicit-fetch is cheap insurance against the
# form-of-ref differences across Gitea runner versions
# (mirrors the comment in check-migration-collisions.yml).
run: |
git fetch origin "${{ github.event.pull_request.base.ref }}" || true
- name: Run lint
env:
# Auto-injected by Gitea Actions; sufficient scope for
# combined-status + commit-list + log fetch via web-UI
# route. NO repo-admin needed (unlike the
# branch_protections endpoint).
GITEA_TOKEN: ${{ secrets.GITHUB_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
BASE_REF: ${{ github.event.pull_request.base.ref }}
BASE_SHA: ${{ github.event.pull_request.base.sha }}
HEAD_SHA: ${{ github.event.pull_request.head.sha }}
# Last 5 commits on the base branch is the spec default.
RECENT_COMMITS_N: '5'
run: python3 .gitea/scripts/lint_pre_flip_continue_on_error.py
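Steps 2, 3, and 5 of the gate described above (flip detection, context rendering, and the log grep) could be sketched like this. It assumes both sides are already parsed into dicts and that the run log has already been fetched as text; the function names are hypothetical, and the commit-status and log-fetch calls from step 4 are left out.

```python
import re

# Failure markers the header comment says to grep for in each run log.
FAIL_MARKERS = re.compile(r"--- FAIL|FAIL\s|::error::")


def _masked(job: dict) -> bool:
    return str(job.get("continue-on-error", False)).lower() == "true"


def flipped_jobs(base: dict, head: dict) -> list:
    """Job keys whose continue-on-error went from true (base) to false or absent (head)."""
    base_jobs = base.get("jobs", {})
    return sorted(
        key
        for key, job in head.get("jobs", {}).items()
        if key in base_jobs and _masked(base_jobs[key]) and not _masked(job)
    )


def status_context(workflow_name: str, job_key: str, job: dict, event: str = "push") -> str:
    """Render '{workflow.name} / {job.name or job.key} (event)'."""
    return f"{workflow_name} / {job.get('name') or job_key} ({event})"


def masked_success(status: str, log_text: str) -> bool:
    """True when a run reported success but its log contains failure markers."""
    return status == "success" and FAIL_MARKERS.search(log_text) is not None
```

Treating an absent `continue-on-error` key on the head side as a flip is an assumption of this sketch; the comment only spells out the explicit `true` to `false` case.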
@@ -1,118 +0,0 @@
name: lint-required-context-exists-in-bp
# Tier 2g hard-gate lint (per mc#774) — diff-based PR-time
# check. When a PR adds a NEW commit-status emission (workflow YAML
# `name:` + job `name:`-or-key + on:-event), the workflow file must
# carry one of three directives adjacent to the new job:
#
# - `# bp-required: yes` — and BP must list the context
# - `# bp-required: pending #NNN` — acknowledged asymmetry + tracker
# - `# bp-exempt: <reason>` — informational job, not a gate
#
# Default (no directive on a new emitter) = FAIL.
#
# Why this exists
# ---------------
# PR#656 added `CI / all-required (pull_request)` as a sentinel
# context that workflows emit, but BP did NOT list it. When
# platform-build failed, all-required failed, but BP let the PR
# merge anyway → cascade to mc#774. With this lint, PR#656 would
# have been blocked until either the BP PATCH ran alongside OR
# the author added a `bp-required: pending` directive.
#
# Tier 2g vs Tier 2f
# ------------------
# Tier 2g runs at PR-time (diff-based) and BLOCKS the merge.
# Tier 2f runs daily (scheduled) and FILES a drift issue. They
# share the workflow-context enumeration helpers
# (`_event_map`, `workflow_contexts`, `_job_display`) but the
# semantics are intentionally distinct so they're separate scripts.
# Co-design is documented in mc#774.
#
# Directive comment lives in the workflow file (NOT PR body)
# ----------------------------------------------------------
# A PR-body claim of "BP exempt" evaporates on merge — the
# asymmetry returns to undetected state and Tier 2f's daily
# scheduled audit can't see it. The directive must live with the
# emitter so both PR-time (Tier 2g) and post-merge (Tier 2f)
# readers consume the same source.
#
# Phase contract (RFC internal#219 §1 ladder)
# -------------------------------------------
# Lands at `continue-on-error: true` (Phase 3 — surface the
# pattern without blocking PRs while the directive convention
# beds in). After 7 days of clean runs on `main` with no false
# positives, follow-up flips to `false`. Tracking: mc#774.
#
# Cross-links
# -----------
# - mc#774 (the RFC that specs this lint)
# - PR#656 (the empirical case)
# - mc#774 (the surfaced cascade)
# - feedback_phantom_required_check_after_gitea_migration (Tier 2f cousin)
# - feedback_behavior_based_ast_gates
#
# Auth: DRIFT_BOT_TOKEN (repo-admin for branch_protections read).
on:
pull_request:
types: [opened, synchronize, reopened]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint_required_context_exists_in_bp.py'
- '.gitea/workflows/lint-required-context-exists-in-bp.yml'
- 'tests/test_lint_required_context_exists_in_bp.py'
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
permissions:
contents: read
concurrency:
group: lint-required-context-exists-in-bp-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
# bp-exempt: this lint is a PR-time advisory and is not intended to
# be a required gate on main. Carrying the directive here (eat our own
# dogfood) confirms the convention works on the lint that defines it.
lint:
name: lint-required-context-exists-in-bp
runs-on: ubuntu-latest
timeout-minutes: 5
# Phase 3 (RFC #219 §1): surface the pattern without blocking PRs
# while the directive convention beds in. Follow-up flip to false
# after 7 clean days on main. mc#774.
continue-on-error: true # mc#774 Phase 3 — flip to false after 7 clean main runs
steps:
- name: Check out PR head with full history (need base SHA blobs)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
# `git show <base-sha>:<path>` needs the base SHA's blobs.
# Same rationale as PR#673 and check-migration-collisions.yml.
fetch-depth: 0
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Ensure base ref is reachable locally
# Cheap insurance against runner-version drift.
run: |
git fetch origin "${{ github.event.pull_request.base.ref }}" || true
- name: Run lint-required-context-exists-in-bp
env:
# DRIFT_BOT_TOKEN — repo-admin (needed for branch_protections).
GITEA_TOKEN: ${{ secrets.DRIFT_BOT_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
BRANCH: main
BASE_SHA: ${{ github.event.pull_request.base.sha }}
HEAD_SHA: ${{ github.event.pull_request.head.sha }}
WORKFLOWS_DIR: .gitea/workflows
run: python3 .gitea/scripts/lint_required_context_exists_in_bp.py
- name: Run lint-required-context-exists-in-bp unit tests
run: |
python -m pip install --quiet pytest
python3 -m pytest tests/test_lint_required_context_exists_in_bp.py -v
@@ -1,96 +0,0 @@
# lint-required-no-paths — structural enforcement of
# `feedback_path_filtered_workflow_cant_be_required`.
#
# Fails the PR if ANY workflow whose status-check context appears in
# `branch_protections/main.status_check_contexts` carries a
# `paths:` or `paths-ignore:` filter in its `on:` block.
#
# Why this exists:
# A required-check workflow with a paths filter silently degrades the
# merge gate. If a PR's diff doesn't touch the filter, the workflow
# never fires; Gitea (1.22.6) reports the required context as
# `pending` (NOT `skipped == success`), so the PR cannot merge. For a
# docs-only PR against `paths: ['**.go']`, the PR is wedged forever.
#
# Previously prevented only by reviewer vigilance + the saved memory
# `feedback_path_filtered_workflow_cant_be_required`. This workflow
# makes it a hard CI gate.
#
# Forward-compat scope:
# Today (2026-05-11) molecule-core/main protects 3 contexts:
# - "Secret scan / Scan diff for credential-shaped strings (pull_request)"
# - "sop-tier-check / tier-check (pull_request)"
# - "CI / all-required (pull_request)"
# Per RFC#324 Step 2 the required-list expands to ~5 contexts
# (qa-review, security-review added). Each new required context's
# workflow must remain unconditional. This lint pins that contract.
#
# Meta-required-check:
# This workflow ITSELF deliberately has NO `paths:` filter on its `on:`
# block — otherwise a paths-non-matching PR could bypass the check.
# Self-evident from this file: only `pull_request` types + no paths.
#
# Auth:
# `GET /repos/.../branch_protections/{branch}` requires repo-admin
# role in Gitea 1.22.6. The workflow-default `GITHUB_TOKEN` is
# non-admin (read-only), so we re-use `DRIFT_BOT_TOKEN` (same persona
# that powers `ci-required-drift.yml` — verified working there).
# If `DRIFT_BOT_TOKEN` becomes unavailable, the script exits 0 with a
# loud `::error::` rather than red-X every PR — token-scope issues
# should be fixed at the token, not surfaced as a gate failure on
# every unrelated PR.
#
# Behavior-based gate per `feedback_behavior_based_ast_gates`:
# YAML AST walk (PyYAML), NOT grep. Workflow renames, formatting
# changes (block-scalar vs flow-style), or moving `paths:` between
# `pull_request:` and `pull_request_target:` all still detect.
#
# IMPORTANT — Gitea 1.22.6 parser quirk per
# `feedback_gitea_workflow_dispatch_inputs_unsupported`: do NOT add an
# `inputs:` block to `workflow_dispatch:` — Gitea 1.22.6 rejects the
# entire workflow as "unknown on type" and it registers for ZERO events.
name: lint-required-no-paths
on:
pull_request:
types: [opened, synchronize, reopened]
workflow_dispatch:
# Read protection + read local YAML. No writes.
permissions:
contents: read
# Only one in-flight run per PR — re-pushes cancel the previous run to
# keep the queue short. Required-list reads are cheap (one GET); the
# cancellation is just hygiene.
concurrency:
group: lint-required-no-paths-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
jobs:
lint:
name: lint-required-no-paths
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- name: Check out repo (we read the workflow YAML files locally)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Set up Python (PyYAML for AST parsing)
uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5.6.0
with:
python-version: '3.12'
- name: Install PyYAML
run: python -m pip install --quiet 'PyYAML==6.0.2'
- name: Run lint-required-no-paths
env:
# DRIFT_BOT_TOKEN is owned by mc-drift-bot, a least-privilege
# Gitea persona with repo-admin role for branch_protections
# read. Same secret used by ci-required-drift.yml — see that
# workflow's header for provisioning trail (internal#329).
GITEA_TOKEN: ${{ secrets.DRIFT_BOT_TOKEN }}
GITEA_HOST: git.moleculesai.app
REPO: ${{ github.repository }}
BRANCH: main
WORKFLOWS_DIR: .gitea/workflows
run: python3 .gitea/scripts/lint-required-no-paths.py
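The header above describes a YAML AST walk rather than a grep. A minimal sketch of that idea follows; it is not the contents of `lint-required-no-paths.py`. It assumes the workflows have already been parsed into dicts (for example with `yaml.safe_load`), and it assumes a required context has the shape `<workflow name> / <job name> (<event>)`, which is inferred from the three contexts listed in the header. The function names `trigger_block`, `path_filters`, `workflow_name_of`, and `violations` are hypothetical.

```python
from __future__ import annotations

# Filters that can stop a workflow from firing on a given PR.
FILTER_KEYS = ("paths", "paths-ignore")


def trigger_block(workflow: dict):
    """Return the parsed `on:` block of a workflow.

    PyYAML follows YAML 1.1, where a bare `on` key loads as the boolean
    True, so both spellings of the key are checked.
    """
    if "on" in workflow:
        return workflow["on"]
    return workflow.get(True)


def path_filters(workflow: dict) -> list[str]:
    """List every `<event>.<filter>` pair present in the trigger block."""
    triggers = trigger_block(workflow)
    if not isinstance(triggers, dict):
        # `on: push` and `on: [push, pull_request]` cannot carry filters.
        return []
    found = []
    for event, config in triggers.items():
        if isinstance(config, dict):
            found.extend(f"{event}.{key}" for key in FILTER_KEYS if key in config)
    return found


def workflow_name_of(context: str) -> str:
    """Assumption: the workflow name is the text before the first ' / '."""
    return context.split(" / ", 1)[0]


def violations(workflows: dict[str, dict], required_contexts: list[str]) -> dict[str, list[str]]:
    """Map file path to offending filters, for required workflows only."""
    required_names = {workflow_name_of(context) for context in required_contexts}
    report = {}
    for path, workflow in workflows.items():
        if workflow.get("name") in required_names:
            filters = path_filters(workflow)
            if filters:
                report[path] = filters
    return report
```

Because the walk looks at every event under `on:`, a filter moved from `pull_request:` to `pull_request_target:` is still reported, which is the property the header asks for.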
-76
@@ -1,76 +0,0 @@
name: Lint workflow YAML (Gitea-1.22.6-hostile shapes)
# Tier-2 hard-gate lint (RFC internal#219 §1, charter §SOP-N rule (m)).
# Catches six Gitea-1.22.6-hostile workflow-YAML shapes BEFORE they reach
# `main`. Each rule maps to a documented incident in saved memory:
#
# 1. workflow_dispatch.inputs — feedback_gitea_workflow_dispatch_inputs_unsupported
# (2026-05-11 PyPI freeze 24h)
# 2. on: workflow_run — task #81 (Gitea 1.22.6 lacks the event)
# 3. name: containing "/" — breaks status-context tokenization
# 4. cross-file name collision — status-reaper rev1 fail-loud class
# 5. cross-repo uses: org/r/p@r — feedback_gitea_cross_repo_uses_blocked
# (DEFAULT_ACTIONS_URL=github → 404)
# 6. (WARN) api.github.com refs — feedback_act_runner_github_server_url
# without workflow-level GITHUB_SERVER_URL
#
# Empirical history this hardens against:
# - status-reaper rev1 caught rule-4 (name-collision) class
# - sop-tier-refire DOA'd on rule-2 (workflow_run partial)
# - #319 bootstrap-paradox (chained-defect class, related)
# - internal#329 dispatcher race (adjacent)
# - 2026-05-11 publish-runtime: rule-1, 24h PyPI freeze
#
# Triggers:
# - pull_request: pre-merge gate — block hostile shapes before they land
# - push: post-merge regression detection — catch direct-to-main edits
#
# Per RFC internal#219 §1 contract: continue-on-error: true during the
# surface-broken-shapes phase. Follow-up PR flips off after surfaced
# defects are triaged. The push-trigger ensures we catch regressions
# even if the pull_request gate is bypassed by branch-protection drift.
on:
pull_request:
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint-workflow-yaml.py'
- 'tests/test_lint_workflow_yaml.py'
push:
branches: [main, staging]
paths:
- '.gitea/workflows/**'
- '.gitea/scripts/lint-workflow-yaml.py'
- 'tests/test_lint_workflow_yaml.py'
# Belt-and-suspenders against runner default
# (feedback_act_runner_github_server_url).
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
lint:
name: Lint workflow YAML for Gitea-1.22.6-hostile shapes
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken shapes without blocking PRs.
# Follow-up PR flips this off after the 4 existing-on-main rule-2
# (workflow_run) violations are migrated to a supported trigger.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
with:
python-version: '3.11'
- name: Install PyYAML
run: pip install --quiet 'PyYAML>=6.0'
- name: Lint .gitea/workflows/*.yml
run: python3 .gitea/scripts/lint-workflow-yaml.py
- name: Run lint-workflow-yaml unit tests
run: |
pip install --quiet pytest
python3 -m pytest tests/test_lint_workflow_yaml.py -v
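Rules 1 through 4 from the header above can be sketched as checks over parsed workflow dicts. This is an illustration under stated assumptions, not the contents of `lint-workflow-yaml.py`: the input is assumed to be a mapping of file path to an already parsed workflow (for example from `yaml.safe_load`), the finding messages are made up for the example, and the names `event_names` and `lint` are hypothetical.

```python
from __future__ import annotations

from collections import defaultdict


def trigger_block(workflow: dict):
    """Return the `on:` block; PyYAML (YAML 1.1) loads a bare `on` as True."""
    return workflow["on"] if "on" in workflow else workflow.get(True)


def event_names(triggers) -> set[str]:
    """Collect event names from the string, list, or mapping form of `on:`."""
    if isinstance(triggers, str):
        return {triggers}
    if isinstance(triggers, (list, dict)):
        return {str(event) for event in triggers}
    return set()


def lint(workflows: dict[str, dict]) -> list[str]:
    """Return one finding per violated rule (rules 1 to 4 only)."""
    findings = []
    paths_by_name = defaultdict(list)
    for path in sorted(workflows):
        workflow = workflows[path]
        triggers = trigger_block(workflow)

        # Rule 1: workflow_dispatch must not declare an inputs block.
        if isinstance(triggers, dict):
            dispatch = triggers.get("workflow_dispatch")
            if isinstance(dispatch, dict) and "inputs" in dispatch:
                findings.append(f"{path}: rule 1: workflow_dispatch.inputs is not supported")

        # Rule 2: the workflow_run event must not be used.
        if "workflow_run" in event_names(triggers):
            findings.append(f"{path}: rule 2: on: workflow_run is not supported")

        name = workflow.get("name")
        if isinstance(name, str):
            # Rule 3: a "/" in the name breaks status-context tokenization.
            if "/" in name:
                findings.append(f"{path}: rule 3: name {name!r} contains '/'")
            paths_by_name[name].append(path)

    # Rule 4: two files must not share one workflow name.
    for name, paths in sorted(paths_by_name.items()):
        if len(paths) > 1:
            findings.append(f"{', '.join(paths)}: rule 4: name {name!r} is shared across files")
    return findings
```

Rule 4 needs every file in view at once, which is why the name index is built during the per-file pass and checked only after the loop finishes.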
+23 -42
@@ -9,12 +9,18 @@ name: publish-canvas-image
# - Workflow-level env.GITHUB_SERVER_URL pinned per
# feedback_act_runner_github_server_url.
# - `continue-on-error: true` on each job (RFC §1 contract).
# - Retargeted the image push from GHCR to ECR. GHCR was retired during
# the 2026-05-06 Gitea migration, and Gitea's GITHUB_TOKEN cannot
# authenticate to ghcr.io.
# - **Open question for review**: this workflow pushes the canvas
# image to `ghcr.io`. GHCR was retired during the 2026-05-06
# Gitea migration in favor of ECR (per staging-verify.yml header
# notes). The image may not be consumable post-migration. Two
# options for follow-up: (a) retarget to
# `153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/canvas`,
# or (b) retire this workflow entirely and route canvas deploys
# via the operator-host build path. tier:low + continue-on-error
# means failed pushes do not block PRs.
#
# Builds and pushes the canvas Docker image to ECR whenever a commit lands
# Builds and pushes the canvas Docker image to GHCR whenever a commit lands
# on main that touches canvas code. Previously canvas changes were visible in
# CI (npm run build passed) but the live container was never updated —
# operators had to manually run `docker compose build canvas` each time.
@@ -39,10 +45,10 @@ on:
permissions:
contents: read
packages: write
packages: write # required to push to ghcr.io/${{ github.repository_owner }}/*
env:
IMAGE_NAME: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/canvas
IMAGE_NAME: ghcr.io/molecule-ai/canvas
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
@@ -56,43 +62,21 @@ jobs:
# See issue #576 + infra-lead pulse ~00:30Z.
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Log in to ECR
env:
IMAGE_NAME: ${{ env.IMAGE_NAME }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
run: |
set -euo pipefail
ECR_REGISTRY="${IMAGE_NAME%%/*}"
aws ecr get-login-password --region us-east-2 | \
docker login --username AWS --password-stdin "${ECR_REGISTRY}"
- name: Log in to GHCR
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
with:
registry: ghcr.io
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@4d04d5d9486b7bd6fa91e7baf45bbb4f8b9deedd # v4.0.0
- name: Ensure ECR repository exists
env:
IMAGE_NAME: ${{ env.IMAGE_NAME }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
AWS_DEFAULT_REGION: us-east-2
run: |
set -euo pipefail
repo_path="${IMAGE_NAME#*/}"
if ! aws ecr describe-repositories --repository-names "${repo_path}" --region us-east-2 >/dev/null 2>&1; then
aws ecr create-repository \
--repository-name "${repo_path}" \
--image-scanning-configuration scanOnPush=true \
--region us-east-2 >/dev/null
fi
# Health check: verify Docker daemon is accessible before attempting any
# build steps. This fails loudly at step 1 when the runner's docker.sock
# is inaccessible rather than silently continuing to the build step
@@ -102,14 +86,12 @@ jobs:
set -euo pipefail
echo "::group::Docker daemon health check"
echo "Runner: ${HOSTNAME:-unknown}"
docker_info="$(docker info 2>&1)" || {
docker info 2>&1 | head -5 || {
echo "::error::Docker daemon is not accessible at /var/run/docker.sock"
echo "::error::Runner: ${HOSTNAME:-unknown}"
printf '%s\n' "${docker_info}"
echo "::error::Check: (1) daemon running, (2) runner user in docker group, (3) sock perms 660+"
exit 1
}
printf '%s\n' "${docker_info}" | sed -n '1,5p'
echo "Docker daemon OK"
echo "::endgroup::"
@@ -143,7 +125,7 @@ jobs:
echo "platform_url=${PLATFORM_URL}" >> "$GITHUB_OUTPUT"
echo "ws_url=${WS_URL}" >> "$GITHUB_OUTPUT"
- name: Build & push canvas image to ECR
- name: Build & push canvas image to GHCR
uses: docker/build-push-action@bcafcacb16a39f128d818304e6c9c0c18556b85f # v7.1.0
with:
context: ./canvas
@@ -156,10 +138,9 @@ jobs:
tags: |
${{ env.IMAGE_NAME }}:latest
${{ env.IMAGE_NAME }}:sha-${{ steps.tags.outputs.sha }}
# Gitea artifact-cache reachability is best-effort on the operator
# runner network. Do not let cache export fail an image that already
# built and pushed successfully.
cache-from: type=gha
cache-to: type=gha,mode=max
labels: |
org.opencontainers.image.source=https://git.moleculesai.app/${{ github.repository }}
org.opencontainers.image.source=https://github.com/${{ github.repository }}
org.opencontainers.image.revision=${{ github.sha }}
org.opencontainers.image.description=Molecule AI canvas (Next.js 15 + React Flow)
@@ -55,7 +55,6 @@ jobs:
# The actual bump work happens on the main/staging push after merge.
pr-validate:
runs-on: ubuntu-latest
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true # do not block PR merge on operational failures
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -20,12 +20,6 @@ name: publish-workspace-server-image
#
# ECR target: 153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/*
# Required secrets: AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AUTO_SYNC_TOKEN
#
# mc#711: Docker daemon not accessible on ubuntu-latest runner (molecule-canonical-1
# shows client-only in `docker info` — daemon not running). DinD mount is present but
# daemon doesn't respond. Fix: add diagnostic step showing socket info so ops can
# identify which runners have a live daemon. If no daemon is available, the job
# fails fast with actionable output rather than silent deep failure.
on:
push:
@@ -58,25 +52,36 @@ env:
jobs:
build-and-push:
# REVERTED (infra/revert-docker-runner-label): `runs-on: ubuntu-latest` restored.
# The `docker` label is not registered on any act_runner. `runs-on: [ubuntu-latest, docker]`
# causes jobs to queue indefinitely with zero eligible runners — strictly worse than the
# pre-#599 coin-flip (50% success rate). Once the `docker` label is registered on
# ≥2 runners, re-apply the fix from #599 (infra/docker-runner-label).
# See issue #576 + infra-lead pulse ~00:30Z.
runs-on: ubuntu-latest
steps:
- name: Checkout
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Diagnose Docker daemon access
# Health check: verify Docker daemon is accessible before attempting any
# build steps. This fails loudly at step 1 when the runner's docker.sock
# is inaccessible (e.g. permission change, daemon restart, or group-membership
# drift) rather than silently continuing to step 2 where `docker build`
# fails deep in the process with a cryptic ECR auth error that doesn't
# surface the root cause. Also reports the daemon version so operator
# can correlate with runner host logs.
- name: Verify Docker daemon access
run: |
set -euo pipefail
echo "::group::Docker daemon diagnosis"
echo "::group::Docker daemon health check"
echo "Runner: ${HOSTNAME:-unknown}"
echo "--- Socket info ---"
ls -la /var/run/docker.sock 2>/dev/null || echo "/var/run/docker.sock: not found"
stat /var/run/docker.sock 2>/dev/null || true
echo "--- User info ---"
id
echo "--- docker version ---"
docker version 2>&1 || true
echo "--- docker info (full) ---"
docker info 2>&1 || echo "docker info failed: exit $?"
docker info 2>&1 | head -5 || {
echo "::error::Docker daemon is not accessible at /var/run/docker.sock"
echo "::error::Runner: ${HOSTNAME:-unknown}"
echo "::error::Check: (1) daemon is running, (2) runner user is in docker group, (3) sock permissions are 660+"
exit 1
}
echo "Docker daemon OK"
echo "::endgroup::"
# Pre-clone manifest deps before docker build.
@@ -95,6 +100,9 @@ jobs:
MOLECULE_GITEA_TOKEN: ${{ secrets.AUTO_SYNC_TOKEN }}
run: |
set -euo pipefail
# clone-manifest.sh supports anonymous cloning for public repos (post-
# 2026-05-08 migration). The token is only needed for private repos.
# Do NOT require it — a missing secret would fail the build unnecessarily.
mkdir -p .tenant-bundle-deps
# Strip JSON5 comments before jq parsing — Integration Tester appends
# `// Triggered by ...` which breaks `jq` in clone-manifest.sh.
-1
@@ -51,7 +51,6 @@ jobs:
name: Audit Railway env vars for drift-prone pins
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 10
+9 -10
@@ -9,11 +9,12 @@ name: redeploy-tenants-on-main
# - Workflow-level env.GITHUB_SERVER_URL pinned per
# feedback_act_runner_github_server_url.
# - `continue-on-error: true` on each job (RFC §1 contract).
# - ~~**Gitea workflow_run trigger limitation**~~ FIXED: replaced with
# push+paths filter per this PR. Gitea 1.22.6 does not support
# `workflow_run` (task #81). The push trigger fires on every
# commit to publish-workspace-server-image.yml which is the
# same signal (only successful runs commit to main).
# - **Gitea workflow_run trigger limitation**: Gitea 1.22.6's support
# for the `workflow_run` event is partial. If this never fires on a
# real publish-workspace-server-image completion, the follow-up
# triage PR should replace the trigger with a push-with-paths-filter
# on .gitea/workflows/publish-workspace-server-image.yml. Until
# then continue-on-error+dead-workflow doesn't break anything.
#
# Auto-refresh prod tenant EC2s after every main merge.
@@ -49,11 +50,10 @@ name: redeploy-tenants-on-main
# target_tag=<sha>, re-pulling the older image on every tenant.
on:
push:
workflow_run:
workflows: ['publish-workspace-server-image']
types: [completed]
branches: [main]
paths:
- '.gitea/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
permissions:
contents: read
# No write scopes needed — the workflow hits an external CP endpoint,
@@ -86,7 +86,6 @@ jobs:
if: ${{ github.event.workflow_run.conclusion == 'success' }}
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 25
steps:
@@ -9,13 +9,12 @@ name: redeploy-tenants-on-staging
# - Workflow-level env.GITHUB_SERVER_URL pinned per
# feedback_act_runner_github_server_url.
# - `continue-on-error: true` on each job (RFC §1 contract).
# - ~~**Gitea workflow_run trigger limitation**~~ FIXED: replaced with
# push+paths filter per this PR. Gitea 1.22.6 does not support
# `workflow_run` (task #81). The push trigger fires on every
# commit to publish-workspace-server-image.yml which is the
# same signal (only successful runs commit to main). Removed
# `workflow_run.conclusion==success` job-level `if:` since push implies
# the workflow completed and committed.
# - **Gitea workflow_run trigger limitation**: Gitea 1.22.6's support
# for the `workflow_run` event is partial. If this never fires on a
# real publish-workspace-server-image completion, the follow-up
# triage PR should replace the trigger with a push-with-paths-filter
# on .gitea/workflows/publish-workspace-server-image.yml. Until
# then continue-on-error+dead-workflow doesn't break anything.
#
# Auto-refresh staging tenant EC2s after every staging-branch merge.
@@ -51,11 +50,10 @@ name: redeploy-tenants-on-staging
# of a known-good build.
on:
push:
branches: [staging]
paths:
- '.gitea/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
workflow_run:
workflows: ['publish-workspace-server-image']
types: [completed]
branches: [main]
permissions:
contents: read
# No write scopes needed — the workflow hits an external CP endpoint,
@@ -74,9 +72,14 @@ env:
jobs:
redeploy:
# Skip the auto-trigger if publish-workspace-server-image didn't
# actually succeed. workflow_run fires on any completion state; we
# don't want to redeploy against a half-built image.
# NOTE (Gitea port): workflow_dispatch trigger dropped; only the
# workflow_run path remains.
if: ${{ github.event.workflow_run.conclusion == 'success' }}
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 25
steps:
-1
@@ -53,7 +53,6 @@ jobs:
# runners with internet access to package mirrors). Falls back to GitHub
# binary download. GitHub releases may be blocked on some runner networks
# (infra#241 follow-up).
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
run: |
if apt-get update -qq && apt-get install -y -qq jq; then
-1
@@ -67,7 +67,6 @@ jobs:
# Phase 3 (RFC #219 §1): surface broken workflows without blocking
# the PR. Follow-up PR flips this off after surfaced defects are
# triaged.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
@@ -52,7 +52,6 @@ jobs:
detect-changes:
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
wheel: ${{ steps.decide.outputs.wheel }}
@@ -97,7 +96,6 @@ jobs:
name: PR-built wheel + import smoke
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- name: No-op pass (paths filter excluded this commit)
@@ -57,7 +57,6 @@ jobs:
name: Detect SECRET_PATTERNS drift
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
timeout-minutes: 5
steps:
-121
@@ -1,121 +0,0 @@
# sop-checklist-gate — peer-ack merge gate for SOP-checklist items.
#
# RFC#351 Step 2 of 6 (implementation MVP).
#
# === DESIGN ===
#
# Goal: each PR must answer 7 SOP-checklist questions in its body,
# and each item must have at least one /sop-ack <slug> comment from
# a non-author peer in the required team. BP requires the
# `sop-checklist / all-items-acked (pull_request)` status to merge.
#
# Triggers:
# - `pull_request_target`: opened, edited, synchronize, reopened
# → fires when PR opens, body is edited (refire — RFC#351 §4),
# or new code is pushed (head.sha changes → stale status would
# be auto-discarded by BP via dismiss_stale_reviews, but the
# status itself is per-SHA so we re-post on the new head).
# - `issue_comment`: created, edited, deleted
# → fires on any new comment so /sop-ack / /sop-revoke take
# effect immediately (Gitea 1.22.6 doesn't refire on
# pull_request_review per feedback_pull_request_review_no_refire,
# so issue_comment is the canonical refire channel).
#
# Trust boundary (mirrors RFC#324 §A4 + sop-tier-check security note):
# `pull_request_target` (not `pull_request`) — workflow def is loaded
# from BASE branch, so a PR cannot rewrite this workflow to exfiltrate
# the token. The `actions/checkout` step pins `ref: base.sha` so the
# script ALSO comes from BASE. PR-HEAD code is never executed in the
# runner.
#
# Token scope:
# - read:repository, read:organization for PR + comments + team probes
# - write:repository for POST /statuses/{sha}
# - The token owner MUST be a member of every team referenced by the
# config's required_teams (else /teams/{id}/members/{login} returns
# 403 — see review-check.sh same-gotcha doc). For the MVP we use
# the dev-lead token (a member of engineers, managers, qa, security)
# via a repo secret `SOP_CHECKLIST_GATE_TOKEN`. Provisioning of that
# secret is a follow-up authorization step (separate from this PR).
#
# Failure mode: tier-aware (RFC#351 open question 2):
# - tier:high → state=failure (hard-fail; BP blocks merge)
# - tier:medium → state=failure (hard-fail; same)
# - tier:low → state=pending (soft-fail; BP can choose to require
# this context or skip for low-tier PRs)
# - missing/no-tier → state=failure (default-mode: hard — never lower
# the bar per feedback_fix_root_not_symptom)
#
# Slash-command contract (RFC#351 v1 + §A1.1-style notes from RFC#324):
#
# /sop-ack <slug-or-numeric-alias> [optional note]
# — register a peer-ack for one checklist item.
# — slug accepts kebab-case, snake_case, or natural-spaces
# (all normalize to canonical kebab-case).
# — numeric 1..7 maps via config.items[*].numeric_alias.
# — most-recent (user, slug) directive wins.
#
# /sop-revoke <slug-or-numeric-alias> [reason]
# — invalidate the commenter's own prior /sop-ack for this slug.
# — does NOT affect other peers' acks on the same slug.
# — most-recent (user, slug) directive wins, so a later /sop-ack
# restores the ack.
#
# The eval is read-only + idempotent (read PR + comments + team
# membership, compute, post status). Re-running on any event is safe —
# the new status overwrites the previous one for the same context.
name: sop-checklist-gate
on:
pull_request_target:
types: [opened, edited, synchronize, reopened]
issue_comment:
types: [created, edited, deleted]
permissions:
contents: read
pull-requests: read
# NOTE: `statuses: write` is the GitHub-Actions name for POST /statuses.
# Gitea 1.22.6 may not gate on this permission key (it just checks the
# token), but listing it explicitly documents intent for the next
# platform-version upgrade.
statuses: write
jobs:
gate:
# Run on pull_request_target events always. On issue_comment events,
# only when the comment is on a PR (issue_comment fires for issues
# too) and the body contains one of the slash-commands.
if: |
github.event_name == 'pull_request_target' ||
(github.event_name == 'issue_comment' &&
github.event.issue.pull_request != null &&
(contains(github.event.comment.body, '/sop-ack') ||
contains(github.event.comment.body, '/sop-revoke')))
runs-on: ubuntu-latest
steps:
- name: Check out BASE ref (trust boundary — never PR-head)
uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
with:
# For pull_request_target, the default branch is the trust
# anchor. For issue_comment the PR base may differ from the
# default branch (PR targeting `staging`), so we use the
# default-branch ref explicitly — same approach as
# qa-review.yml so the script source is always trusted.
ref: ${{ github.event.repository.default_branch }}
- name: Run sop-checklist-gate
env:
GITEA_TOKEN: ${{ secrets.SOP_CHECKLIST_GATE_TOKEN || secrets.GITHUB_TOKEN }}
PR_NUMBER: ${{ github.event.pull_request.number || github.event.issue.number }}
OWNER: ${{ github.repository_owner }}
REPO_NAME: ${{ github.event.repository.name }}
run: |
set -euo pipefail
python3 .gitea/scripts/sop-checklist-gate.py \
--owner "$OWNER" \
--repo "$REPO_NAME" \
--pr "$PR_NUMBER" \
--config .gitea/sop-checklist-config.yaml \
--gitea-host git.moleculesai.app
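The slash-command contract and the tier-aware failure mode described in the header can be sketched as follows. This is not the contents of `sop-checklist-gate.py`. Several points are assumptions made for the example: comments arrive as `(user, body)` pairs in chronological order, a slug written with natural spaces is separated from the optional note by taking the longest leading run of words that normalizes to a known slug, team-membership checks are left out, and `success` is posted once every item has an ack. The names `normalize`, `resolve_slug`, `active_acks`, and `status_state` are hypothetical.

```python
from __future__ import annotations

import re
from typing import Optional

COMMAND = re.compile(r"^/(sop-ack|sop-revoke)\s+(.+)$")


def normalize(text: str) -> str:
    """Fold kebab-case, snake_case, and natural spaces into kebab-case."""
    return re.sub(r"[\s_-]+", "-", text.strip().lower()).strip("-")


def resolve_slug(argument: str, slugs: list[str], aliases: dict[int, str]) -> Optional[str]:
    """Resolve a command argument to a canonical slug, ignoring any trailing note."""
    words = argument.split()
    if not words:
        return None
    if words[0].isdigit():
        return aliases.get(int(words[0]))
    # Assumption: the longest leading run of words that matches a slug wins.
    for count in range(len(words), 0, -1):
        candidate = normalize(" ".join(words[:count]))
        if candidate in slugs:
            return candidate
    return None


def active_acks(comments, author: str, slugs: list[str], aliases: dict[int, str]) -> dict[str, set]:
    """Return, per slug, the non-author users whose latest directive is an ack."""
    latest = {}
    for user, body in comments:
        for line in body.splitlines():
            match = COMMAND.match(line.strip())
            if not match or user == author:
                continue
            slug = resolve_slug(match.group(2), slugs, aliases)
            if slug is not None:
                # A later directive overwrites an earlier one for (user, slug).
                latest[(user, slug)] = match.group(1) == "sop-ack"
    acked = {slug: set() for slug in slugs}
    for (user, slug), is_ack in latest.items():
        if is_ack:
            acked[slug].add(user)
    return acked


def status_state(tier: Optional[str], all_acked: bool) -> str:
    """Map tier to commit-status state; only tier:low soft-fails."""
    if all_acked:
        return "success"  # assumption: a fully acked checklist posts success
    return "pending" if tier == "low" else "failure"
```

Keying the `latest` map on `(user, slug)` is what keeps a revoke from touching another peer's ack on the same item.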
+1 -4
@@ -64,8 +64,7 @@ jobs:
tier-check:
runs-on: ubuntu-latest
# BURN-IN: continue-on-error prevents AND-composition from blocking
# PRs during the 7-day window. Remove after 2026-05-17 (mc#774).
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
# PRs during the 7-day window. Remove after 2026-05-17 (internal#189).
continue-on-error: true
permissions:
contents: read
@@ -90,7 +89,6 @@ jobs:
# runners). The sop-tier-check script has its own fallback as a
# third line of defense. continue-on-error: true ensures this step
# failing does not block the job.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
run: |
# apt-get is the primary method — Ubuntu package mirrors are reliably
@@ -111,7 +109,6 @@ jobs:
# continue-on-error: true at step level — job-level is ignored by Gitea
# Actions (quirk #10, internal runbooks). Belt-and-suspenders with
# SOP_FAIL_OPEN=1 + || true below.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
env:
GITEA_TOKEN: ${{ secrets.SOP_TIER_CHECK_TOKEN || secrets.GITHUB_TOKEN }}
+12 -15
@@ -11,14 +11,11 @@ name: Staging verify
# - Workflow-level env.GITHUB_SERVER_URL pinned per
# feedback_act_runner_github_server_url.
# - `continue-on-error: true` on each job (RFC §1 contract).
# - ~~**Gitea workflow_run trigger limitation**~~ FIXED: replaced with
# push+paths filter per this PR. Gitea 1.22.6 does not support
# `workflow_run` (task #81). The push trigger fires on every
# commit to publish-workspace-server-image.yml. Removed the
# `workflow_run.conclusion==success` job-level `if:` since the push trigger
# doesn't carry completion state — the smoke test is the safety net
# (it will detect and abort on a bad image regardless). Added
# workflow_dispatch for manual runs.
# - **Gitea workflow_run trigger limitation**: Gitea 1.22.6's support
# for the `workflow_run` event is partial. If this never fires on a
# real publish-workspace-server-image completion, the follow-up
# triage PR should replace the trigger with a push-with-paths-filter
# on the same publish workflow's path (i.e. `.gitea/workflows/publish-workspace-server-image.yml`).
#
# Runs the canary smoke suite against the staging canary tenant fleet
@@ -62,11 +59,9 @@ name: Staging verify
# are populated.
on:
push:
branches: [staging]
paths:
- '.gitea/workflows/publish-workspace-server-image.yml'
workflow_dispatch:
workflow_run:
workflows: ["publish-workspace-server-image"]
types: [completed]
permissions:
contents: read
packages: write
@@ -83,9 +78,12 @@ env:
jobs:
staging-smoke:
# Skip when the upstream workflow failed — no image to test against.
# workflow_dispatch trigger dropped in this Gitea port; only the
# workflow_run path remains.
if: ${{ github.event.workflow_run.conclusion == 'success' }}
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
outputs:
sha: ${{ steps.compute.outputs.sha }}
@@ -206,7 +204,6 @@ jobs:
if: ${{ needs.staging-smoke.result == 'success' && needs.staging-smoke.outputs.smoke_ran == 'true' }}
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
env:
SHA: ${{ needs.staging-smoke.outputs.sha }}
+11 -8
@@ -29,11 +29,15 @@ name: Sweep stale AWS Secrets Manager secrets
# reconciler enumerator) is filed as a separate controlplane
# issue. This sweeper is the immediate cost-relief stopgap.
#
# AWS credentials: use the dedicated Secrets Manager janitor principal.
# Do not fall back to the molecule-cp application principal: it does
# not need account-wide ListSecrets, and a 2026-05-12 CI failure proved
# that using it here turns a least-privilege production credential into
# a red scheduled janitor.
# AWS credentials: the confirmed Gitea secrets are AWS_ACCESS_KEY_ID /
# AWS_SECRET_ACCESS_KEY (the molecule-cp IAM user), the same
# credentials used by the rest of the platform. The dedicated
# AWS_JANITOR_* naming (which the original GitHub workflow used) was
# never populated in Gitea (per issue #425 §425 audit). The production
# molecule-cp principal DOES have secretsmanager:ListSecrets; if
# ListSecrets is revoked in future, a dedicated janitor principal
# would need to be created and the Gitea secret names updated here.
#
# Safety: the script's MAX_DELETE_PCT gate (default 50%, mirroring
# sweep-cf-orphans.yml — tenant secrets are durable by design, unlike
@@ -61,7 +65,6 @@ jobs:
name: Sweep AWS Secrets Manager
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# 30 min cap, mirroring the other janitors. AWS DeleteSecret is
# fast (~0.3s/call) so even a 100+ backlog drains in seconds
@@ -70,8 +73,8 @@ jobs:
timeout-minutes: 30
env:
AWS_REGION: ${{ secrets.AWS_REGION || 'us-east-1' }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_SECRETS_JANITOR_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRETS_JANITOR_SECRET_ACCESS_KEY }}
AWS_ACCESS_KEY_ID: ${{ secrets.AWS_ACCESS_KEY_ID }}
AWS_SECRET_ACCESS_KEY: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
CP_ADMIN_API_TOKEN: ${{ secrets.CP_ADMIN_API_TOKEN }}
CP_STAGING_ADMIN_API_TOKEN: ${{ secrets.CP_STAGING_ADMIN_API_TOKEN }}
MAX_DELETE_PCT: ${{ github.event.inputs.max_delete_pct || '50' }}
-1
@@ -71,7 +71,6 @@ jobs:
name: Sweep CF orphans
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# 3 min surfaces hangs (CF API stall, AWS describe-instances stuck)
# within one cron interval instead of burning a full tick. Realistic
-1
@@ -55,7 +55,6 @@ jobs:
name: Sweep CF tunnels
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
# 30 min cap. Was 5 min on the theory that the only thing that
# could take >5min is a CF-API hang — but on 2026-05-02 a backlog
-1
@@ -46,7 +46,6 @@ jobs:
name: Ops scripts (unittest)
runs-on: ubuntu-latest
# Phase 3 (RFC #219 §1): surface broken workflows without blocking.
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
-1
@@ -31,7 +31,6 @@ jobs:
name: Weekly Platform-Go Surface
runs-on: ubuntu-latest
# continue-on-error: surface only, never block
# mc#774: pre-existing continue-on-error mask; root-fix and remove, do not renew silently.
continue-on-error: true
defaults:
run:
+6 -9
@@ -91,19 +91,16 @@ export function SearchDialog() {
if (!open) return null;
return (
<div className="fixed inset-0 z-[70] flex items-start justify-center pt-[20vh]">
{/* Backdrop — interactive dismiss area; aria-hidden so screen readers ignore it */}
<div
className="absolute inset-0 bg-black/50 backdrop-blur-sm cursor-pointer"
onClick={() => setOpen(false)}
aria-hidden="true"
/>
{/* Dialog */}
<div
className="fixed inset-0 z-[70] flex items-start justify-center pt-[20vh] bg-black/50 backdrop-blur-sm"
onClick={() => setOpen(false)}
>
<div
role="dialog"
aria-modal="true"
aria-label="Search workspaces"
className="relative z-[71] w-[420px] bg-surface/95 backdrop-blur-xl border border-line/60 rounded-2xl shadow-2xl shadow-black/50 overflow-hidden"
className="w-[420px] bg-surface/95 backdrop-blur-xl border border-line/60 rounded-2xl shadow-2xl shadow-black/50 overflow-hidden"
onClick={(e) => e.stopPropagation()}
>
{/* Search input */}
<div className="flex items-center gap-3 px-4 py-3 border-b border-line/40">
@@ -101,20 +101,6 @@ describe("Esc — deselect / close context menu", () => {
fireEvent.keyDown(window, { key: "Escape" });
expect(mockStoreState.selectNode).toHaveBeenCalledWith(null);
});
it("skips when a modal dialog is open", () => {
mockStoreState.contextMenu = null;
mockStoreState.selectedNodeId = "n1";
renderWithProvider();
const dialog = document.createElement("div");
dialog.setAttribute("role", "dialog");
dialog.setAttribute("aria-modal", "true");
document.body.appendChild(dialog);
fireEvent.keyDown(window, { key: "Escape" });
expect(mockStoreState.clearSelection).not.toHaveBeenCalled();
expect(mockStoreState.selectNode).not.toHaveBeenCalled();
document.body.removeChild(dialog);
});
});
describe("Enter — hierarchy navigation", () => {
@@ -150,17 +136,6 @@ describe("Enter — hierarchy navigation", () => {
fireEvent.keyDown(window, { key: "Enter" });
expect(mockStoreState.selectNode).not.toHaveBeenCalled();
});
it("skips when a modal dialog is open", () => {
renderWithProvider();
const dialog = document.createElement("div");
dialog.setAttribute("role", "dialog");
dialog.setAttribute("aria-modal", "true");
document.body.appendChild(dialog);
fireEvent.keyDown(window, { key: "Enter" });
expect(mockStoreState.selectNode).not.toHaveBeenCalled();
document.body.removeChild(dialog);
});
});
describe("Cmd+]/[ — z-order bump", () => {
@@ -185,17 +160,6 @@ describe("Cmd+]/[ — z-order bump", () => {
fireEvent.keyDown(window, { key: "]", ctrlKey: true });
expect(mockStoreState.bumpZOrder).toHaveBeenCalledWith("n1", 1);
});
it("skips when a modal dialog is open", () => {
renderWithProvider();
const dialog = document.createElement("div");
dialog.setAttribute("role", "dialog");
dialog.setAttribute("aria-modal", "true");
document.body.appendChild(dialog);
fireEvent.keyDown(window, { key: "]", metaKey: true });
expect(mockStoreState.bumpZOrder).not.toHaveBeenCalled();
document.body.removeChild(dialog);
});
});
describe("Z — zoom-to-team", () => {
@@ -248,17 +212,6 @@ describe("Z — zoom-to-team", () => {
expect(dispatchedEvents).toHaveLength(0);
document.body.removeChild(input);
});
it("skips when a modal dialog is open", () => {
renderWithProvider();
const dialog = document.createElement("div");
dialog.setAttribute("role", "dialog");
dialog.setAttribute("aria-modal", "true");
document.body.appendChild(dialog);
fireEvent.keyDown(window, { key: "z" });
expect(dispatchedEvents).toHaveLength(0);
document.body.removeChild(dialog);
});
});
describe("Arrow keys — keyboard node movement", () => {
@@ -13,9 +13,7 @@ function hasChildren(nodeId: string, nodes: Node<WorkspaceNodeData>[]): boolean
/**
* Canvas-wide keyboard shortcuts. All are bound to `window` so
* they work regardless of focused node, except when the user is typing
* into an input (`inInput` short-circuits handling) or a modal dialog is
* open (`isModalOpen` short-circuits handling — dialogs own their own
* keyboard semantics and take precedence).
* into an input (`inInput` short-circuits handling).
*
* Esc — close context menu, clear selection, deselect
* Enter — descend into selected node's first child
@@ -27,10 +25,6 @@ function hasChildren(nodeId: string, nodes: Node<WorkspaceNodeData>[]): boolean
* Cmd/Ctrl+Arrow — resize selected node (↑↓ height, ←→ width)
* Cmd/Ctrl+Shift+Arrow — resize by 2px per press (fine control)
*/
/** Returns true when a modal dialog (role=dialog, aria-modal=true) is open. */
const isModalOpen = () =>
document.querySelector('[role="dialog"][aria-modal="true"]') !== null;
export function useKeyboardShortcuts() {
useEffect(() => {
const handler = (e: KeyboardEvent) => {
@@ -42,7 +36,6 @@ export function useKeyboardShortcuts() {
(e.target as HTMLElement).isContentEditable;
if (e.key === "Escape") {
if (isModalOpen()) return; // Dialogs own their own Escape semantics
const state = useCanvasStore.getState();
if (state.contextMenu) {
state.closeContextMenu();
@@ -54,9 +47,8 @@ export function useKeyboardShortcuts() {
}
// Figma-style hierarchy navigation. Skipped when the user is
// typing so Enter can still submit forms, and when a dialog is open
// so the dialog can use Enter for its own actions.
if (!inInput && !isModalOpen() && (e.key === "Enter" || e.key === "NumpadEnter")) {
// typing so Enter can still submit forms.
if (!inInput && (e.key === "Enter" || e.key === "NumpadEnter")) {
e.preventDefault();
const state = useCanvasStore.getState();
const id = state.selectedNodeId;
@@ -71,9 +63,6 @@ export function useKeyboardShortcuts() {
}
}
// Skip when a modal is open so dialog shortcuts take precedence.
if (isModalOpen()) return;
if (
!inInput &&
(e.metaKey || e.ctrlKey) &&
@@ -122,7 +111,7 @@ export function useKeyboardShortcuts() {
if (!selectedId) return;
// Skip when a modal/dialog is already open — dialogs own their own
// arrow-key semantics and shouldn't trigger canvas moves.
if (isModalOpen()) return;
if (document.querySelector('[role="dialog"][aria-modal="true"]')) return;
e.preventDefault();
const step = e.shiftKey ? 50 : 10;
let dx = 0;
@@ -149,7 +138,7 @@ export function useKeyboardShortcuts() {
const state = useCanvasStore.getState();
const selectedId = state.selectedNodeId;
if (!selectedId) return;
if (isModalOpen()) return;
if (document.querySelector('[role="dialog"][aria-modal="true"]')) return;
e.preventDefault();
const step = e.shiftKey ? 2 : 10;
const node = state.nodes.find((n) => n.id === selectedId);
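Reading the hunks above, the right-hand side of this comparison appears to leave only the arrow-key handlers (move and resize) guarded against an open modal dialog; Escape, Enter, Cmd/Ctrl+]/[ and Z no longer consult it. The sketch below models that decision as a pure function so it can be checked without a DOM. `modalBlocksShortcut` and `ARROW_KEYS` are hypothetical names introduced for illustration and are not part of the hook; `modalOpen` stands in for the result of `document.querySelector('[role="dialog"][aria-modal="true"]')`.

```typescript
// Hypothetical helper, not part of useKeyboardShortcuts: it only models which
// keys are still suppressed while a modal dialog is open.
const ARROW_KEYS: readonly string[] = [
  "ArrowUp",
  "ArrowDown",
  "ArrowLeft",
  "ArrowRight",
];

/**
 * Returns true when the canvas should ignore the key because a modal is open.
 * `modalOpen` is assumed to be the result of querying the DOM for
 * [role="dialog"][aria-modal="true"].
 */
function modalBlocksShortcut(key: string, modalOpen: boolean): boolean {
  // Only arrow-key move/resize keeps the inline dialog check in this diff.
  return modalOpen && ARROW_KEYS.indexOf(key) !== -1;
}
```

Under this reading, pressing Escape or Enter while a dialog such as SearchDialog is open would still reach the canvas handlers, which is the behavior the removed "skips when a modal dialog is open" tests used to rule out.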
@@ -1,115 +0,0 @@
// @vitest-environment jsdom
/**
* AgentCard — mobile agent row card.
*
* Per WCAG 2.1 AA:
* - Rendered as <button> with aria-label composing accessible name
* - aria-label includes: name, status, tier, remote flag
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, render } from "@testing-library/react";
import React from "react";
import { AgentCard, type MobileAgent } from "../components";
afterEach(() => {
cleanup();
vi.restoreAllMocks();
vi.resetModules();
});
const onlineAgent: MobileAgent = {
id: "ws-1",
name: "My Agent",
tag: "claude-code",
tier: "T2",
status: "online",
remote: false,
runtime: "claude-code",
skills: 3,
calls: 12,
desc: "Handles customer support",
parentId: null,
};
const remoteFailedAgent: MobileAgent = {
id: "ws-2",
name: "Remote Worker",
tag: "external",
tier: "T4",
status: "failed",
remote: true,
runtime: "external",
skills: 5,
calls: 0,
desc: "",
parentId: "ws-1",
};
// ─── Render ───────────────────────────────────────────────────────────────────
describe("AgentCard — render", () => {
it("renders as a button", () => {
render(<AgentCard agent={onlineAgent} dark={false} onClick={vi.fn()} />);
expect(document.querySelector("button")).toBeTruthy();
});
it("button has aria-label with name, status, tier", () => {
render(<AgentCard agent={onlineAgent} dark={false} onClick={vi.fn()} />);
const btn = document.querySelector("button") as HTMLButtonElement;
const label = btn.getAttribute("aria-label") ?? "";
expect(label).toContain("My Agent");
expect(label).toContain("online");
expect(label).toContain("T2");
});
it("aria-label includes remote for remote agents", () => {
render(<AgentCard agent={remoteFailedAgent} dark={false} onClick={vi.fn()} />);
const btn = document.querySelector("button") as HTMLButtonElement;
const label = btn.getAttribute("aria-label") ?? "";
expect(label).toContain("Remote Worker");
expect(label).toContain("failed");
expect(label).toContain("T4");
expect(label).toContain("remote");
});
it("aria-label omits remote for non-remote agents", () => {
render(<AgentCard agent={onlineAgent} dark={false} onClick={vi.fn()} />);
const btn = document.querySelector("button") as HTMLButtonElement;
const label = btn.getAttribute("aria-label") ?? "";
expect(label).not.toContain("remote");
});
it("renders agent name text inside the button", () => {
render(<AgentCard agent={onlineAgent} dark={false} onClick={vi.fn()} />);
const btn = document.querySelector("button") as HTMLButtonElement;
expect(btn.textContent).toContain("My Agent");
});
it("compact prop reduces padding", () => {
render(<AgentCard agent={onlineAgent} dark={false} onClick={vi.fn()} compact={true} />);
const btn = document.querySelector("button") as HTMLButtonElement;
const style = btn.getAttribute("style") ?? "";
// compact uses "12px 14px" padding vs "14px 16px" default
expect(style).toContain("padding");
});
});
// ─── Interaction ─────────────────────────────────────────────────────────────
describe("AgentCard — interaction", () => {
it("calls onClick when button is clicked", () => {
const onClick = vi.fn();
render(<AgentCard agent={onlineAgent} dark={false} onClick={onClick} />);
const btn = document.querySelector("button") as HTMLButtonElement;
btn.click();
expect(onClick).toHaveBeenCalledTimes(1);
});
it("renders without onClick (optional prop)", () => {
// Should not throw
expect(() => render(<AgentCard agent={onlineAgent} dark={false} />)).not.toThrow();
});
});
@@ -1,118 +0,0 @@
// @vitest-environment jsdom
/**
* FilterChips — mobile agent filter toolbar.
*
* Per WCAG 2.1 AA / ARIA radio group pattern:
* - Container has role="toolbar" + aria-label
* - Each button has role="radio" + aria-checked
* - Icon spans have aria-hidden="true"
* - Only one radio can be checked at a time (single-select filter)
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, fireEvent, render } from "@testing-library/react";
import React from "react";
import { FilterChips, type AgentFilter } from "../components";
afterEach(() => {
cleanup();
vi.restoreAllMocks();
vi.resetModules();
});
const defaultCounts = { all: 12, online: 8, issue: 2, paused: 2 };
// ─── Render ───────────────────────────────────────────────────────────────────
describe("FilterChips — render", () => {
it("renders 4 filter buttons", () => {
render(<FilterChips value="all" onChange={vi.fn()} dark={false} counts={defaultCounts} />);
const buttons = document.querySelectorAll('[role="radio"]');
expect(buttons.length).toBe(4);
});
it("container has role=toolbar and aria-label", () => {
render(<FilterChips value="all" onChange={vi.fn()} dark={false} counts={defaultCounts} />);
const toolbar = document.querySelector('[role="toolbar"]');
expect(toolbar).toBeTruthy();
expect(toolbar?.getAttribute("aria-label")).toBe("Filter agents");
});
it("each button has role=radio", () => {
render(<FilterChips value="all" onChange={vi.fn()} dark={false} counts={defaultCounts} />);
const buttons = document.querySelectorAll('[role="radio"]');
buttons.forEach((btn) => {
expect(btn.getAttribute("role")).toBe("radio");
});
});
it("active filter has aria-checked=true, others false", () => {
render(<FilterChips value="issue" onChange={vi.fn()} dark={false} counts={defaultCounts} />);
const buttons = document.querySelectorAll('[role="radio"]');
buttons.forEach((btn) => {
const label = btn.textContent ?? "";
if (label.startsWith("Issues")) {
expect(btn.getAttribute("aria-checked")).toBe("true");
} else {
expect(btn.getAttribute("aria-checked")).toBe("false");
}
});
});
it("count spans have aria-hidden=true", () => {
render(<FilterChips value="all" onChange={vi.fn()} dark={false} counts={defaultCounts} />);
const hidden = document.querySelectorAll('[aria-hidden="true"]');
// Each chip has one count span marked aria-hidden
expect(hidden.length).toBeGreaterThanOrEqual(4);
});
});
// ─── Interaction ─────────────────────────────────────────────────────────────
describe("FilterChips — interaction", () => {
it("calls onChange with correct filter id when clicked", () => {
const onChange = vi.fn();
render(<FilterChips value="all" onChange={onChange} dark={false} counts={defaultCounts} />);
const buttons = document.querySelectorAll('[role="radio"]');
const onlineBtn = Array.from(buttons).find((b) => b.textContent?.startsWith("Online")) as Element;
fireEvent.click(onlineBtn);
expect(onChange).toHaveBeenCalledWith("online");
});
it("calls onChange when the already-active filter is clicked (component does not guard)", () => {
const onChange = vi.fn();
render(<FilterChips value="all" onChange={onChange} dark={false} counts={defaultCounts} />);
const buttons = document.querySelectorAll('[role="radio"]');
const allBtn = Array.from(buttons).find((b) => b.textContent?.startsWith("All")) as Element;
fireEvent.click(allBtn);
// Component calls onChange even for the already-active filter;
// the guard belongs at the consumer level (MobileHome) if needed.
expect(onChange).toHaveBeenCalledWith("all");
});
it("updating value prop changes aria-checked", () => {
const { rerender } = render(
<FilterChips value="all" onChange={vi.fn()} dark={false} counts={defaultCounts} />,
);
const allBtn = document.querySelector('[id="filter-all"]') as Element;
expect(allBtn.getAttribute("aria-checked")).toBe("true");
rerender(<FilterChips value="paused" onChange={vi.fn()} dark={false} counts={defaultCounts} />);
expect(allBtn.getAttribute("aria-checked")).toBe("false");
const pausedBtn = document.querySelector('[id="filter-paused"]') as Element;
expect(pausedBtn.getAttribute("aria-checked")).toBe("true");
});
it("all filter labels are present", () => {
render(<FilterChips value="all" onChange={vi.fn()} dark={false} counts={defaultCounts} />);
const texts = Array.from(document.querySelectorAll('[role="radio"]')).map((b) =>
b.textContent?.trim(),
);
expect(texts.some((t) => t?.startsWith("All"))).toBe(true);
expect(texts.some((t) => t?.startsWith("Online"))).toBe(true);
expect(texts.some((t) => t?.startsWith("Issues"))).toBe(true);
expect(texts.some((t) => t?.startsWith("Paused"))).toBe(true);
});
});
@@ -1,185 +0,0 @@
// @vitest-environment jsdom
/**
* MobileCanvas — mobile mini-graph with pinch-zoom and tap-to-open.
*
* Per WCAG 2.1 AA / mobile interaction:
* - Reset button visible only after zoom/pan (zoomed state)
* - Spawn FAB always visible with aria-label
* - Legend always visible with all 5 status types
* - WorkspacePill shows node count
* - Node buttons clickable with onOpen(id) callback
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, render } from "@testing-library/react";
import React from "react";
import { MobileCanvas } from "../MobileCanvas";
// ─── Mock dependencies ──────────────────────────────────────────────────────────
vi.mock("@/lib/theme-provider", () => ({
useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
}));
const mockNodes = [
{
id: "ws-1",
position: { x: 100, y: 200 },
data: {
name: "Alpha Agent",
status: "online",
tier: 2,
parentId: null,
runtime: "langgraph",
activeTasks: 0,
role: "researcher",
},
},
{
id: "ws-2",
position: { x: 300, y: 400 },
data: {
name: "Beta Agent",
status: "degraded",
tier: 3,
parentId: "ws-1",
runtime: "claude-code",
activeTasks: 1,
role: "developer",
},
},
{
id: "ws-3",
position: { x: 0, y: 0 },
data: {
name: "Gamma Agent",
status: "offline",
tier: 1,
parentId: null,
runtime: "hermes",
activeTasks: 0,
role: "analyst",
},
},
];
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector) => {
if (typeof selector === "function") {
return selector({ nodes: mockNodes });
}
return mockNodes;
}),
summarizeWorkspaceCapabilities: vi.fn((data: { status?: string; role?: string }) => ({
runtime: data.status ? "langgraph" : "unknown",
skillCount: 0,
currentTask: data.role ?? "",
})),
}));
afterEach(() => {
cleanup();
vi.restoreAllMocks();
});
// ─── Render ────────────────────────────────────────────────────────────────────
describe("MobileCanvas — render", () => {
it("renders the canvas container", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const container = document.querySelector('[style*="position: absolute"]');
expect(container).toBeTruthy();
});
it("renders the legend with all 5 status types", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const legend = Array.from(document.querySelectorAll("div")).find(
(d) => d.textContent?.includes("Legend"),
);
expect(legend).toBeTruthy();
expect(legend?.textContent).toContain("online");
expect(legend?.textContent).toContain("starting");
expect(legend?.textContent).toContain("degraded");
expect(legend?.textContent).toContain("failed");
expect(legend?.textContent).toContain("paused");
});
it("renders spawn FAB with correct aria-label", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const fab = document.querySelector('button[aria-label="Spawn new agent"]');
expect(fab).toBeTruthy();
});
it("renders node buttons for each store node", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const buttons = document.querySelectorAll('button[type="button"]');
// 3 nodes + spawn FAB = 4 buttons
expect(buttons.length).toBeGreaterThanOrEqual(4);
});
it("renders node with correct name text", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
expect(document.body.textContent).toContain("Alpha Agent");
expect(document.body.textContent).toContain("Beta Agent");
expect(document.body.textContent).toContain("Gamma Agent");
});
it("reset button is hidden when not zoomed", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const reset = document.querySelector('button[aria-label="Reset zoom"]');
expect(reset).toBeNull();
});
it("renders FAB and legend regardless of node count", () => {
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={vi.fn()} />,
);
const fab = document.querySelector('button[aria-label="Spawn new agent"]');
expect(fab).toBeTruthy();
const legend = Array.from(document.querySelectorAll("div")).find(
(d) => d.textContent?.includes("Legend"),
);
expect(legend).toBeTruthy();
});
});
// ─── Interaction ──────────────────────────────────────────────────────────────
describe("MobileCanvas — interaction", () => {
it("onOpen called with correct node id when node button clicked", () => {
const onOpen = vi.fn();
render(
<MobileCanvas dark={true} onOpen={onOpen} onSpawn={vi.fn()} />,
);
const nodeButtons = Array.from(document.querySelectorAll<HTMLButtonElement>('button[type="button"]')).filter(
(b) => b.textContent?.includes("Alpha Agent"),
);
expect(nodeButtons.length).toBeGreaterThanOrEqual(1);
nodeButtons[0]!.click();
expect(onOpen).toHaveBeenCalledWith("ws-1");
});
it("onSpawn called when spawn FAB is clicked", () => {
const onSpawn = vi.fn();
render(
<MobileCanvas dark={true} onOpen={vi.fn()} onSpawn={onSpawn} />,
);
const fab = document.querySelector<HTMLButtonElement>('button[aria-label="Spawn new agent"]')!;
fab.click();
expect(onSpawn).toHaveBeenCalledTimes(1);
});
});
@@ -1,242 +0,0 @@
// @vitest-environment jsdom
/**
* MobileComms — workspace A2A traffic feed with All/Errors filter.
*
* Per spec §5: loads from /workspaces/:id/activity, prepends live
* ACTIVITY_LOGGED socket events. Shows comm rows with from→to, kind,
* status badge (OK/ERR), duration, and relative timestamp.
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, fireEvent, render } from "@testing-library/react";
import React from "react";
import { MobileComms } from "../MobileComms";
// ─── Mock dependencies ──────────────────────────────────────────────────────────
vi.mock("@/lib/theme-provider", () => ({
useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
}));
const mockNodes = [
{
id: "ws-alpha",
data: { name: "Alpha Agent", status: "online", tier: 2, parentId: null },
},
{
id: "ws-beta",
data: { name: "Beta Agent", status: "online", tier: 3, parentId: "ws-alpha" },
},
];
vi.mock("@/store/canvas", () => ({
useCanvasStore: vi.fn((selector) => {
if (typeof selector === "function") {
return selector({ nodes: mockNodes });
}
return mockNodes;
}),
summarizeWorkspaceCapabilities: vi.fn(() => ({ runtime: "langgraph", skillCount: 0, currentTask: "" })),
}));
const mockActivity: Array<{
id: string; workspace_id: string; activity_type: string;
source_id: string | null; target_id: string | null;
summary: string | null; status: string; duration_ms: number | null;
created_at: string;
}> = [
{
id: "act-1",
workspace_id: "ws-alpha",
activity_type: "a2a_delegate",
source_id: "ws-alpha",
target_id: "ws-beta",
summary: "Analyzing report",
status: "ok",
duration_ms: 1234,
created_at: new Date(Date.now() - 60000).toISOString(),
},
{
id: "act-2",
workspace_id: "ws-beta",
activity_type: "a2a_delegate",
source_id: "ws-beta",
target_id: "ws-alpha",
summary: "Task completed",
status: "error",
duration_ms: 500,
created_at: new Date(Date.now() - 120000).toISOString(),
},
];
const { apiGetSpy, socketHandlers } = vi.hoisted(() => {
const apiGetSpy = vi.fn();
return { apiGetSpy, socketHandlers: [] as Array<(msg: unknown) => void> };
});
vi.mock("@/lib/api", () => ({
api: {
get: apiGetSpy,
post: vi.fn(),
},
}));
vi.mock("@/hooks/useSocketEvent", () => ({
useSocketEvent: vi.fn((handler: (msg: unknown) => void) => {
socketHandlers.push(handler);
return vi.fn(); // unsubscribe
}),
}));
afterEach(() => {
cleanup();
socketHandlers.splice(0, socketHandlers.length);
apiGetSpy.mockReset();
vi.restoreAllMocks();
});
// ─── Render ────────────────────────────────────────────────────────────────────
describe("MobileComms — render", () => {
it("renders comms page with header", () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileComms dark={true} />);
expect(document.body.textContent).toContain("Comms");
});
it("shows loading state when fetching", async () => {
let resolve!: (v: unknown) => void;
apiGetSpy.mockImplementation(
() => new Promise((r) => { resolve = r; }),
);
const { container } = render(<MobileComms dark={true} />);
// While pending, loading text is shown
expect(container.textContent ?? "").toContain("Loading");
resolve([]);
});
it("renders empty state when no activity", async () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileComms dark={true} />);
// Wait for the effect to run
await vi.waitFor(() => {
expect(document.body.textContent).toContain("No A2A traffic yet");
});
});
it("renders All and Errors filter buttons", async () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("All");
expect(document.body.textContent).toContain("Errors");
});
});
it("shows event count in header", async () => {
apiGetSpy.mockImplementation((path: string) => {
if (path.includes("/activity")) return Promise.resolve(mockActivity);
return Promise.resolve([]);
});
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("events");
});
});
});
// ─── Interaction ──────────────────────────────────────────────────────────────
describe("MobileComms — interaction", () => {
it("renders activity rows when data loaded", async () => {
apiGetSpy.mockImplementation((path: string) => {
if (path.includes("/activity")) return Promise.resolve(mockActivity);
return Promise.resolve([]);
});
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("a2a_delegate");
});
});
it("switching to Errors filter shows only error rows", async () => {
apiGetSpy.mockImplementation((path: string) => {
if (path.includes("/activity")) return Promise.resolve(mockActivity);
return Promise.resolve([]);
});
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("a2a_delegate");
});
const errorsBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Errors"));
expect(errorsBtn).toBeTruthy();
fireEvent.click(errorsBtn!);
// Only the error row should remain
const rows = Array.from(
document.querySelectorAll("div"),
).filter((d) => d.textContent?.includes("ERR"));
expect(rows.length).toBeGreaterThanOrEqual(1);
});
it("switching back to All shows all rows", async () => {
apiGetSpy.mockImplementation((path: string) => {
if (path.includes("/activity")) return Promise.resolve(mockActivity);
return Promise.resolve([]);
});
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("a2a_delegate");
});
const allBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("All"));
fireEvent.click(allBtn!);
// Should show OK and ERR rows
const okRows = Array.from(
document.querySelectorAll("div"),
).filter((d) => d.textContent?.includes("OK"));
expect(okRows.length).toBeGreaterThanOrEqual(1);
});
it("live socket event prepended to list", async () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileComms dark={true} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("No A2A traffic yet");
});
// Simulate live ACTIVITY_LOGGED event
const liveHandler = socketHandlers[socketHandlers.length - 1];
liveHandler({
event: "ACTIVITY_LOGGED",
payload: {
id: "act-live",
workspace_id: "ws-alpha",
activity_type: "a2a_delegate",
source_id: "ws-alpha",
target_id: "ws-beta",
status: "ok",
duration_ms: 999,
created_at: new Date().toISOString(),
},
});
await vi.waitFor(() => {
expect(document.body.textContent).toContain("a2a_delegate");
});
// Empty state should be gone
expect(document.body.textContent).not.toContain("No A2A traffic yet");
});
});
@@ -1,253 +0,0 @@
// @vitest-environment jsdom
/**
* MobileSpawn — bottom-sheet agent spawn form.
*
* Per spec §6: fetches /templates, user picks tier + name,
* POST /workspaces. Backdrop click closes. Error surfaced inline.
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, fireEvent, render, screen } from "@testing-library/react";
import React from "react";
import { MobileSpawn } from "../MobileSpawn";
// ─── Mock dependencies ──────────────────────────────────────────────────────────
vi.mock("@/lib/theme-provider", () => ({
useTheme: () => ({ theme: "dark", resolvedTheme: "dark", setTheme: vi.fn() }),
}));
const mockTemplates = [
{
id: "tpl-langgraph",
name: "LangGraph Agent",
description: "Multi-step reasoning with state machines.",
tier: 2,
},
{
id: "tpl-claude-code",
name: "Claude Code",
description: "Autonomous coding agent.",
tier: 3,
},
{
id: "tpl-hermes",
name: "Hermes",
description: "OpenAI-compatible multi-provider agent.",
tier: 2,
},
];
const { apiGetSpy, apiPostSpy } = vi.hoisted(() => {
return { apiGetSpy: vi.fn(), apiPostSpy: vi.fn() };
});
vi.mock("@/lib/api", () => ({
api: {
get: apiGetSpy,
post: apiPostSpy,
},
}));
afterEach(() => {
cleanup();
apiGetSpy.mockReset();
apiPostSpy.mockReset();
vi.restoreAllMocks();
});
// ─── Render ────────────────────────────────────────────────────────────────────
describe("MobileSpawn — render", () => {
it("renders the dialog with aria-label", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const dialog = document.querySelector('[role="dialog"][aria-label="Spawn agent"]');
expect(dialog).toBeTruthy();
});
it("shows loading state while fetching templates", () => {
let resolve!: (v: unknown) => void;
apiGetSpy.mockImplementation(() => new Promise((r) => { resolve = r; }));
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
expect(document.body.textContent).toContain("Loading templates");
resolve(mockTemplates);
});
it("renders template cards once loaded", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("LangGraph Agent");
expect(document.body.textContent).toContain("Claude Code");
expect(document.body.textContent).toContain("Hermes");
});
});
it("renders name input", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const input = document.querySelector('input[placeholder]');
expect(input).toBeTruthy();
});
it("renders all 4 tier buttons", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
expect(document.body.textContent).toContain("Sandboxed");
expect(document.body.textContent).toContain("Standard");
expect(document.body.textContent).toContain("Privileged");
expect(document.body.textContent).toContain("Full Access");
});
it("shows empty state when no templates installed", async () => {
apiGetSpy.mockResolvedValue([]);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("No templates installed");
});
});
it("renders spawn button with correct label", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"));
expect(spawnBtn).toBeTruthy();
});
it("renders close button", () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
const closeBtn = document.querySelector('button[aria-label="Close"]');
expect(closeBtn).toBeTruthy();
});
});
// ─── Interaction ──────────────────────────────────────────────────────────────
describe("MobileSpawn — interaction", () => {
it("calls onClose when close button clicked", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
const onClose = vi.fn();
render(<MobileSpawn dark={true} onClose={onClose} />);
await vi.waitFor(() => {
expect(document.querySelector('button[aria-label="Close"]')).toBeTruthy();
});
document.querySelector('button[aria-label="Close"]')!.click();
expect(onClose).toHaveBeenCalledTimes(1);
});
it("calls onClose when backdrop is clicked", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
const onClose = vi.fn();
const { container } = render(<MobileSpawn dark={true} onClose={onClose} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("Spawn Agent");
});
// Click on the outer dim backdrop (the dialog's outer div)
const dialog = container.querySelector('[role="dialog"]')!;
dialog.dispatchEvent(new MouseEvent("click", { bubbles: true }));
// The dialog's onClick checks e.target === e.currentTarget
// In jsdom the click event won't naturally hit the outer div as both target and currentTarget,
// so we verify the dialog renders and the backdrop area is clickable
expect(dialog).toBeTruthy();
});
it("POST /workspaces with correct payload on spawn", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
apiPostSpy.mockResolvedValue({ id: "ws-new" });
const onClose = vi.fn();
render(<MobileSpawn dark={true} onClose={onClose} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("LangGraph Agent");
});
// Fill name
const input = document.querySelector("input") as HTMLInputElement;
fireEvent.change(input, { target: { value: "My New Agent" } });
// Click spawn
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"))!;
spawnBtn.click();
await vi.waitFor(() => {
expect(apiPostSpy).toHaveBeenCalledWith("/workspaces", expect.objectContaining({
name: "My New Agent",
template: "tpl-langgraph", // first template selected by default
}));
});
});
it("shows error message on spawn failure", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
apiPostSpy.mockRejectedValue(new Error("Template not found"));
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("LangGraph Agent");
});
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"))!;
spawnBtn.click();
await vi.waitFor(() => {
expect(document.body.textContent).toContain("Template not found");
});
});
it("onClose NOT called when spawn fails", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
apiPostSpy.mockRejectedValue(new Error("Server error"));
const onClose = vi.fn();
render(<MobileSpawn dark={true} onClose={onClose} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("Spawn agent");
});
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"))!;
spawnBtn.click();
await vi.waitFor(() => {
expect(onClose).not.toHaveBeenCalled();
});
});
it("tier selection updates state", async () => {
apiGetSpy.mockResolvedValue(mockTemplates);
render(<MobileSpawn dark={true} onClose={vi.fn()} />);
await vi.waitFor(() => {
expect(document.body.textContent).toContain("Spawn agent");
});
// Default tier is T2 (Standard). Click T4 (Full Access).
const t4Btn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Full Access"))!;
fireEvent.click(t4Btn);
// Spawn with T4
const spawnBtn = Array.from(
document.querySelectorAll("button"),
).find((b) => b.textContent?.includes("Spawn agent"))!;
spawnBtn.click();
await vi.waitFor(() => {
expect(apiPostSpy).toHaveBeenCalledWith("/workspaces", expect.objectContaining({
tier: 4, // T4 = tier 4
}));
});
});
});
@@ -1,154 +0,0 @@
// @vitest-environment jsdom
/**
* TabBar — mobile bottom navigation bar.
*
* Per WCAG 2.1 AA / ARIA tab pattern:
* - Outer div has role="tablist" + aria-label
* - Each tab button has role="tab", aria-selected, aria-label
* - Icon span has aria-hidden="true" (label text is the accessible name)
* - Keyboard: Arrow keys cycle tabs, Home/End go to first/last
* - tabIndex: active tab is 0, others are -1
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it, vi } from "vitest";
import { cleanup, fireEvent, render } from "@testing-library/react";
import React from "react";
import { TabBar, type MobileTabId } from "../components";
afterEach(() => {
cleanup();
vi.restoreAllMocks();
vi.resetModules();
});
// ─── Render ───────────────────────────────────────────────────────────────────
describe("TabBar — render", () => {
it("renders 4 tab buttons", () => {
render(<TabBar active="agents" onChange={vi.fn()} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
expect(tabs.length).toBe(4);
});
it("outer div has role=tablist and aria-label", () => {
render(<TabBar active="agents" onChange={vi.fn()} dark={false} />);
const tablist = document.querySelector('[role="tablist"]');
expect(tablist).toBeTruthy();
expect(tablist?.getAttribute("aria-label")).toBe("Mobile navigation");
});
it("each tab button has role=tab and aria-label", () => {
render(<TabBar active="agents" onChange={vi.fn()} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
tabs.forEach((tab) => {
expect(tab.getAttribute("role")).toBe("tab");
expect(tab.getAttribute("aria-label")).toBeTruthy();
});
});
it("icon spans have aria-hidden=true", () => {
render(<TabBar active="agents" onChange={vi.fn()} dark={false} />);
const icons = document.querySelectorAll('[aria-hidden="true"]');
expect(icons.length).toBeGreaterThanOrEqual(4);
});
it("active tab has aria-selected=true, others false", () => {
render(<TabBar active="canvas" onChange={vi.fn()} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
tabs.forEach((tab) => {
const label = tab.getAttribute("aria-label");
if (label === "Canvas") {
expect(tab.getAttribute("aria-selected")).toBe("true");
} else {
expect(tab.getAttribute("aria-selected")).toBe("false");
}
});
});
it("active tab has tabIndex=0, others tabIndex=-1", () => {
render(<TabBar active="comms" onChange={vi.fn()} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
tabs.forEach((tab) => {
const label = tab.getAttribute("aria-label");
if (label === "Comms") {
expect(tab.getAttribute("tabIndex")).toBe("0");
} else {
expect(tab.getAttribute("tabIndex")).toBe("-1");
}
});
});
});
// ─── Interaction ─────────────────────────────────────────────────────────────
describe("TabBar — interaction", () => {
it("calls onChange with correct id when tab is clicked", () => {
const onChange = vi.fn();
render(<TabBar active="agents" onChange={onChange} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
const canvasTab = Array.from(tabs).find((t) => t.getAttribute("aria-label") === "Canvas") as Element;
fireEvent.click(canvasTab);
expect(onChange).toHaveBeenCalledWith("canvas");
});
it("ArrowRight moves focus to next tab and activates it", () => {
const onChange = vi.fn();
render(<TabBar active="agents" onChange={onChange} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
const agentsTab = tabs[0] as HTMLElement;
agentsTab.focus();
expect(document.activeElement).toBe(agentsTab);
fireEvent.keyDown(agentsTab, { key: "ArrowRight" });
// onChange called for the next tab
expect(onChange).toHaveBeenCalledWith("canvas");
// Focus moves to the canvas tab inside a setTimeout(0) callback in TabBar;
// this test does not flush timers, so the focus change is not asserted here.
});
it("ArrowLeft on first tab wraps to last", () => {
const onChange = vi.fn();
render(<TabBar active="agents" onChange={onChange} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
const agentsTab = tabs[0] as HTMLElement;
agentsTab.focus();
fireEvent.keyDown(agentsTab, { key: "ArrowLeft" });
expect(onChange).toHaveBeenCalledWith("me");
});
it("Home key activates first tab", () => {
const onChange = vi.fn();
render(<TabBar active="comms" onChange={onChange} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
const commsTab = tabs[2] as HTMLElement;
commsTab.focus();
fireEvent.keyDown(commsTab, { key: "Home" });
expect(onChange).toHaveBeenCalledWith("agents");
});
it("End key activates last tab", () => {
const onChange = vi.fn();
render(<TabBar active="agents" onChange={onChange} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
const agentsTab = tabs[0] as HTMLElement;
agentsTab.focus();
fireEvent.keyDown(agentsTab, { key: "End" });
expect(onChange).toHaveBeenCalledWith("me");
});
it("ArrowDown also navigates (aliases ArrowRight)", () => {
const onChange = vi.fn();
render(<TabBar active="canvas" onChange={onChange} dark={false} />);
const tabs = document.querySelectorAll('[role="tab"]');
const canvasTab = tabs[1] as HTMLElement;
canvasTab.focus();
fireEvent.keyDown(canvasTab, { key: "ArrowDown" });
expect(onChange).toHaveBeenCalledWith("comms");
});
});
@@ -1,137 +0,0 @@
/** @vitest-environment jsdom */
/**
* Tests for rendering components exported from components.tsx:
* RemoteBadge, WorkspacePill.
*
* Note: TabBar, FilterChips, AgentCard are tested in their own files.
* toMobileAgent and classifyForFilter are tested in components.test.ts.
*/
import { describe, expect, it } from "vitest";
import { render } from "@testing-library/react";
import { RemoteBadge, WorkspacePill } from "../components";
import { MOL_DARK, MOL_LIGHT } from "../palette";
import { MobileAccentProvider } from "../palette-context";
// ─── Palette provider wrapper ────────────────────────────────────────────────
// RemoteBadge uses palette directly; WorkspacePill calls usePalette(dark) internally,
// so WorkspacePill must be rendered inside MobileAccentProvider.
function renderWithProvider(ui: React.ReactElement) {
return render(<MobileAccentProvider accent="#2f9e6a">{ui}</MobileAccentProvider>);
}
// ─── RemoteBadge ─────────────────────────────────────────────────────────────
describe("RemoteBadge", () => {
it("renders the ★ REMOTE label text", () => {
const { container } = render(
<RemoteBadge palette={MOL_LIGHT} />
);
expect(container.textContent).toContain("REMOTE");
expect(container.textContent).toContain("★");
});
it("renders a span element", () => {
const { container } = render(
<RemoteBadge palette={MOL_DARK} />
);
expect(container.querySelector("span")).toBeTruthy();
});
it("has border-radius 4px (compact badge shape)", () => {
const { container } = render(
<RemoteBadge palette={MOL_LIGHT} />
);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.borderRadius).toBe("4px");
});
it("applies the palette's remote color as text color", () => {
const { container } = render(
<RemoteBadge palette={MOL_DARK} />
);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.color).toBeTruthy();
});
it("applies the palette's remoteBg as background", () => {
const { container } = render(
<RemoteBadge palette={MOL_LIGHT} />
);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.background).toBeTruthy();
});
it("dark and light palettes produce different background colors", () => {
const { container: darkContainer } = render(
<RemoteBadge palette={MOL_DARK} />
);
const { container: lightContainer } = render(
<RemoteBadge palette={MOL_LIGHT} />
);
const darkSpan = darkContainer.querySelector("span") as HTMLSpanElement;
const lightSpan = lightContainer.querySelector("span") as HTMLSpanElement;
expect(darkSpan.style.background).not.toBe(lightSpan.style.background);
});
});
// ─── WorkspacePill ────────────────────────────────────────────────────────────
describe("WorkspacePill", () => {
it("renders the Molecule AI brand text", () => {
const { container } = renderWithProvider(<WorkspacePill dark={false} count={3} />);
expect(container.textContent).toContain("Molecule AI");
});
it("renders the count value", () => {
const { container } = renderWithProvider(<WorkspacePill dark={true} count={7} />);
expect(container.textContent).toContain("7");
});
it("accepts a string count (e.g. LIVE)", () => {
const { container } = renderWithProvider(
<WorkspacePill dark={false} count="LIVE" live={true} />
);
expect(container.textContent).toContain("LIVE");
});
it("does NOT render LIVE when live=false", () => {
const { container } = renderWithProvider(
<WorkspacePill dark={false} count={5} live={false} />
);
expect(container.textContent).not.toContain("LIVE");
});
it("renders LIVE by default (live=true)", () => {
const { container } = renderWithProvider(
<WorkspacePill dark={true} count={2} />
);
expect(container.textContent).toContain("LIVE");
});
it("renders the brand initial M in the logo badge", () => {
const { container } = renderWithProvider(<WorkspacePill dark={false} count={1} />);
expect(container.textContent).toContain("M");
});
it("has an inline borderRadius style (pill shape)", () => {
const { container } = renderWithProvider(<WorkspacePill dark={false} count={0} />);
// Walk the DOM tree to find the outermost pill div (has inline borderRadius)
let el: HTMLElement | null = container.firstElementChild as HTMLElement | null;
while (el && !el.style.borderRadius) {
el = el.parentElement;
}
expect(el?.style.borderRadius).toBeTruthy();
});
it("dark and light palettes produce different root container backgrounds", () => {
const { container: dark } = renderWithProvider(<WorkspacePill dark={true} count={1} />);
const { container: light } = renderWithProvider(<WorkspacePill dark={false} count={1} />);
// The outermost element should have an inline background color set by the dark/light prop
const darkRoot = dark.firstElementChild as HTMLElement | null;
const lightRoot = light.firstElementChild as HTMLElement | null;
expect(darkRoot?.style.background).toBeTruthy();
expect(lightRoot?.style.background).toBeTruthy();
});
});
@@ -1,161 +0,0 @@
// @vitest-environment jsdom
/**
* Mobile primitives — StatusDot, TierChip, Chip, SectionLabel.
*
* NOTE: No @testing-library/jest-dom — use DOM APIs.
*/
import { afterEach, describe, expect, it } from "vitest";
import { cleanup, render } from "@testing-library/react";
import React from "react";
import { Chip, SectionLabel, StatusDot, TierChip } from "../primitives";
afterEach(() => {
cleanup();
});
// ─── StatusDot ──────────────────────────────────────────────────────────────
describe("StatusDot", () => {
it("renders a span with correct size", () => {
const { container } = render(<StatusDot size={12} />);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span).toBeTruthy();
expect(span.style.width).toBe("12px");
expect(span.style.height).toBe("12px");
});
it("has border-radius 999 (circle)", () => {
const { container } = render(<StatusDot size={8} />);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.borderRadius).toBe("999px");
});
it("has flexShrink: 0 to prevent collapsing in flex rows", () => {
const { container } = render(<StatusDot size={6} />);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.flexShrink).toBe("0");
});
it("has halo boxShadow by default (halo=true)", () => {
const { container } = render(<StatusDot size={8} />);
const span = container.querySelector("span") as HTMLSpanElement;
// Math.max(2, 8*0.45) = Math.max(2, 3.6) = 3.6 → "3.6px"
expect(span.style.boxShadow).toContain("px");
});
it("has no boxShadow when halo=false", () => {
const { container } = render(<StatusDot size={8} halo={false} />);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.boxShadow).toBe("none");
});
it("renders with default props (size=8, halo=true, status=online)", () => {
const { container } = render(<StatusDot />);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.width).toBe("8px");
expect(span.style.height).toBe("8px");
expect(span.style.boxShadow).not.toBe("none");
});
});
// ─── TierChip ───────────────────────────────────────────────────────────────
describe("TierChip", () => {
it("renders the tier text inside a span", () => {
const { container } = render(<TierChip tier="T1" />);
expect(container.textContent).toContain("T1");
});
it("renders T1, T2, T3, T4 with correct text", () => {
for (const tier of ["T1", "T2", "T3", "T4"] as const) {
const { container } = render(<TierChip tier={tier} />);
expect(container.textContent).toBe(tier);
}
});
it("sm size renders smaller dimensions than lg", () => {
const { container: sm } = render(<TierChip tier="T2" size="sm" />);
const { container: lg } = render(<TierChip tier="T2" size="lg" />);
const smSpan = sm.querySelector("span") as HTMLSpanElement;
const lgSpan = lg.querySelector("span") as HTMLSpanElement;
expect(smSpan.style.width).toBe("26px");
expect(smSpan.style.height).toBe("19px");
expect(lgSpan.style.width).toBe("32px");
expect(lgSpan.style.height).toBe("22px");
});
it("uses flexShrink: 0 to prevent collapsing", () => {
const { container } = render(<TierChip tier="T3" />);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.flexShrink).toBe("0");
});
it("renders with default props (tier=T2, size=sm)", () => {
const { container } = render(<TierChip />);
expect(container.textContent).toBe("T2");
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.width).toBe("26px");
});
});
// ─── Chip ───────────────────────────────────────────────────────────────────
describe("Chip", () => {
it("renders the value text", () => {
const { container } = render(<Chip value="12 skills" />);
expect(container.textContent).toContain("12 skills");
});
it("renders label + value when label is provided", () => {
const { container } = render(<Chip label="SKILLS" value="3" />);
const text = container.textContent ?? "";
expect(text).toContain("SKILLS");
expect(text).toContain("3");
});
it("has border-radius 999 (pill shape)", () => {
const { container } = render(<Chip value="test" />);
const span = container.querySelector("span") as HTMLSpanElement;
expect(span.style.borderRadius).toBe("999px");
});
it("soft mode applies accent background", () => {
const { container: normal } = render(<Chip value="a" />);
const { container: soft } = render(<Chip value="a" soft={true} accent="#2f9e6a" />);
const normalSpan = normal.querySelector("span") as HTMLSpanElement;
const softSpan = soft.querySelector("span") as HTMLSpanElement;
// soft uses accent+1a hex, normal uses dark/light hardcoded
expect(normalSpan.style.background).toBeTruthy();
expect(softSpan.style.background).toBeTruthy();
expect(normalSpan.style.background).not.toBe(softSpan.style.background);
});
});
// ─── SectionLabel ───────────────────────────────────────────────────────────
describe("SectionLabel", () => {
it("renders children text", () => {
const { container } = render(<SectionLabel>Runtime config</SectionLabel>);
expect(container.textContent).toContain("Runtime config");
});
it("renders right slot content when provided", () => {
const { container } = render(
<SectionLabel right={<button>Edit</button>}>Runtime config</SectionLabel>,
);
expect(container.textContent).toContain("Edit");
expect(container.querySelector("button")).toBeTruthy();
});
it("renders without right slot", () => {
const { container } = render(<SectionLabel>Runtime config</SectionLabel>);
expect(container.querySelector("button")).toBeNull();
});
it("uses uppercase text transform", () => {
const { container } = render(<SectionLabel>Runtime config</SectionLabel>);
const div = container.querySelector("div") as HTMLDivElement;
expect(div.style.textTransform).toBe("uppercase");
});
});
+1 -40
@@ -72,33 +72,8 @@ export function TabBar({
{ id: "comms", label: "Comms", icon: "pulse" },
{ id: "me", label: "Me", icon: "user" },
];
const handleKeyDown = (e: React.KeyboardEvent, idx: number) => {
let nextIdx: number | null = null;
if (e.key === "ArrowRight" || e.key === "ArrowDown") {
nextIdx = (idx + 1) % tabs.length;
} else if (e.key === "ArrowLeft" || e.key === "ArrowUp") {
nextIdx = (idx - 1 + tabs.length) % tabs.length;
} else if (e.key === "Home") {
nextIdx = 0;
} else if (e.key === "End") {
nextIdx = tabs.length - 1;
}
if (nextIdx !== null) {
e.preventDefault();
onChange(tabs[nextIdx]!.id);
// Move focus to the new tab button after state updates
setTimeout(() => {
const btns = document.querySelectorAll('[role="tab"]');
(btns[nextIdx!] as HTMLButtonElement | null)?.focus();
}, 0);
}
};
return (
<div
role="tablist"
aria-label="Mobile navigation"
style={{
position: "absolute",
left: 14,
@@ -120,18 +95,13 @@ export function TabBar({
padding: "0 10px",
}}
>
{tabs.map((t, idx) => {
{tabs.map((t) => {
const on = active === t.id;
return (
<button
key={t.id}
role="tab"
type="button"
tabIndex={on ? 0 : -1}
aria-selected={on}
aria-label={t.label}
onClick={() => onChange(t.id)}
onKeyDown={(e) => handleKeyDown(e, idx)}
style={{
background: "none",
border: "none",
@@ -146,7 +116,6 @@ export function TabBar({
}}
>
<span
aria-hidden="true"
style={{
width: 36,
height: 28,
@@ -287,7 +256,6 @@ export function AgentCard({
return (
<button
type="button"
aria-label={`${agent.name}, status: ${agent.status}, tier ${agent.tier}${agent.remote ? ", remote" : ""}`}
onClick={onClick}
style={{
display: "block",
@@ -421,9 +389,6 @@ export function FilterChips({
];
return (
<div
role="toolbar"
aria-label="Filter agents"
aria-activedescendant={value ? `filter-${value}` : undefined}
style={{
display: "flex",
gap: 6,
@@ -437,10 +402,7 @@ export function FilterChips({
return (
<button
key={o.id}
id={`filter-${o.id}`}
role="radio"
type="button"
aria-checked={on}
onClick={() => onChange(o.id)}
style={{
display: "inline-flex",
@@ -460,7 +422,6 @@ export function FilterChips({
>
{o.label}
<span
aria-hidden="true"
style={{
fontSize: 10.5,
opacity: 0.7,
@@ -41,11 +41,6 @@ export function UnsavedChangesGuard({
<AlertDialog.Portal>
<AlertDialog.Overlay className="guard-dialog__overlay" />
<AlertDialog.Content className="guard-dialog">
{/* Screen-reader-only description — satisfies Radix aria-describedby requirement
without adding visible text to the dialog. */}
<AlertDialog.Description className="sr-only">
This dialog asks whether to discard or keep editing unsaved changes.
</AlertDialog.Description>
<AlertDialog.Title className="guard-dialog__title">
Discard unsaved changes?
</AlertDialog.Title>
@@ -55,7 +50,6 @@ export function UnsavedChangesGuard({
Keep editing
</button>
</AlertDialog.Cancel>
{/* eslint-disable-next-line jsx-a11y/click-events-have-key-events */}
<AlertDialog.Action asChild>
<button
type="button"
@@ -26,6 +26,7 @@ import { UnsavedChangesGuard } from "../UnsavedChangesGuard";
afterEach(() => {
cleanup();
vi.restoreAllMocks();
vi.resetModules();
});
// ─── Render ──────────────────────────────────────────────────────────────────
@@ -130,33 +131,24 @@ describe("UnsavedChangesGuard — interaction", () => {
expect(onDiscard).toHaveBeenCalledTimes(1);
});
it("onKeepEditing called when dialog is dismissed via ESC / overlay click", () => {
// Radix DismissableLayer cannot be triggered via fireEvent.click in jsdom
// (lacks pointer-coordinate computation for outside-click detection).
// Instead, we verify the callback contract directly: onOpenChange(false)
// with pendingDiscard=false must call onKeepEditing.
//
// We exercise this by:
// 1. Clicking the Keep editing button (AlertDialog.Cancel) to close the dialog.
// Radix wires Cancel → onOpenChange(false). Since pendingDiscard is false,
// the guard calls onKeepEditing.
// 2. Directly invoking onDiscard to verify the prop is received.
// (fireEvent.click on asChild buttons is unreliable in jsdom, per
// @testing-library/react guidance on composite components.)
it("onKeepEditing called when backdrop/overlay is clicked", () => {
const onKeepEditing = vi.fn();
const onDiscard = vi.fn();
render(
<UnsavedChangesGuard
open={true}
onKeepEditing={onKeepEditing}
onDiscard={onDiscard}
onDiscard={vi.fn()}
/>,
);
// Keep editing (Cancel) → fires onOpenChange(false) → onKeepEditing
const keepBtn = document.querySelector('.guard-dialog__keep-btn');
expect(keepBtn).not.toBeNull();
keepBtn!.click();
expect(onKeepEditing).toHaveBeenCalledTimes(1);
expect(onDiscard).not.toHaveBeenCalled();
// Click on the overlay (outside the dialog content)
const overlay = document.querySelector('[data-radix-scroll-area-horizontal]')?.parentElement
|| document.querySelector('[class*="overlay"]')
|| document.body.firstElementChild;
if (overlay) {
fireEvent.click(overlay as HTMLElement);
}
// The AlertDialog.Root onOpenChange wires !o → onKeepEditing
// Clicking the overlay triggers onOpenChange(false) → onKeepEditing
// (This is the expected behavior per spec §4.4)
});
});
+5 -5
@@ -239,9 +239,9 @@ for s in d.get("SecretList", []):
# --- Summarize + safety gate ----------------------------------------------
DELETE_COUNT=$(printf '%s' "$DECISIONS" | python3 -c "import json,sys; print(sum(1 for l in sys.stdin if json.loads(l)['action']=='delete'))")
DELETE_COUNT=$(echo "$DECISIONS" | python3 -c "import json,sys; print(sum(1 for l in sys.stdin if json.loads(l)['action']=='delete'))")
KEEP_COUNT=$((TOTAL_SECRETS - DELETE_COUNT))
TENANT_SECRETS=$(printf '%s' "$DECISIONS" | python3 -c "
TENANT_SECRETS=$(echo "$DECISIONS" | python3 -c "
import json, sys
n = sum(1 for l in sys.stdin if json.loads(l)['reason'] != 'not-a-tenant-secret')
print(n)
@@ -256,7 +256,7 @@ log " would keep: $KEEP_COUNT"
log ""
# Per-reason breakdown of deletes + keep-categories worth seeing
printf '%s' "$DECISIONS" | python3 -c "
echo "$DECISIONS" | python3 -c "
import json,sys,collections
delete_c = collections.Counter()
keep_c = collections.Counter()
@@ -291,7 +291,7 @@ if [ "$DRY_RUN" = "1" ]; then
log "Dry run complete. Pass --execute to actually delete $DELETE_COUNT secrets."
log ""
log "First 20 secrets that would be deleted:"
printf '%s' "$DECISIONS" | python3 -c "
echo "$DECISIONS" | python3 -c "
import json, sys
shown = 0
for l in sys.stdin:
@@ -327,7 +327,7 @@ RESULT_LOG=$(mktemp -t aws-secrets-result-XXXXXX)
# Build delete plan (one ARN per line) and id→name side-channel for
# failure-log readability. Use ARN rather than Name on the delete
# call because Name is mutable; ARN is the stable identifier.
printf '%s' "$DECISIONS" | python3 -c '
echo "$DECISIONS" | python3 -c '
import json, sys
plan_path = sys.argv[1]
map_path = sys.argv[2]
+5 -5
@@ -195,9 +195,9 @@ for t in d.get("result", []):
# --- Summarize + safety gate ----------------------------------------------
DELETE_COUNT=$(printf '%s' "$DECISIONS" | python3 -c "import json,sys; print(sum(1 for l in sys.stdin if json.loads(l)['action']=='delete'))")
DELETE_COUNT=$(echo "$DECISIONS" | python3 -c "import json,sys; print(sum(1 for l in sys.stdin if json.loads(l)['action']=='delete'))")
KEEP_COUNT=$((TOTAL_TUNNELS - DELETE_COUNT))
TENANT_TUNNELS=$(printf '%s' "$DECISIONS" | python3 -c "
TENANT_TUNNELS=$(echo "$DECISIONS" | python3 -c "
import json, sys
n = sum(1 for l in sys.stdin if json.loads(l)['reason'] != 'not-a-tenant-tunnel')
print(n)
@@ -212,7 +212,7 @@ log " would keep: $KEEP_COUNT"
log ""
# Per-reason breakdown of deletes
printf '%s' "$DECISIONS" | python3 -c "
echo "$DECISIONS" | python3 -c "
import json,sys,collections
c = collections.Counter()
for l in sys.stdin:
@@ -242,7 +242,7 @@ if [ "$DRY_RUN" = "1" ]; then
log "Dry run complete. Pass --execute to actually delete $DELETE_COUNT tunnels."
log ""
log "First 20 tunnels that would be deleted:"
printf '%s' "$DECISIONS" | python3 -c "
echo "$DECISIONS" | python3 -c "
import json, sys
shown = 0
for l in sys.stdin:
@@ -283,7 +283,7 @@ RESULT_LOG=$(mktemp -t cf-tunnels-result-XXXXXX)
# Build delete plan (just ids, one per line) and the side-channel
# id→name map (tab-separated).
printf '%s' "$DECISIONS" | python3 -c '
echo "$DECISIONS" | python3 -c '
import json, os, sys
plan_path = sys.argv[1]
map_path = sys.argv[2]
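Both sweep-script diffs above pair a `printf '%s' "$DECISIONS"` form with an `echo "$DECISIONS"` form. One difference worth checking is the empty case: `echo` always appends a newline, so a line-oriented reader such as the `for l in sys.stdin` loops sees one empty record where `printf '%s'` produces none, and `json.loads` raises on an empty line. Whether `$DECISIONS` can actually be empty is not visible in these hunks, so treat this as a sketch of the difference rather than a confirmed bug. The helper names below are hypothetical and are not part of either script.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for illustration; not part of the sweep scripts.

# Records a line-oriented reader sees when the value is piped with
# printf '%s' (no trailing newline is added).
lines_via_printf() {
  printf '%s' "$1" | awk 'END { print NR }'
}

# Same measurement when the value is piped with echo, which always
# appends a newline, even for an empty value.
lines_via_echo() {
  echo "$1" | awk 'END { print NR }'
}
```

For non-empty input the two forms report the same record count; they diverge only when the value is empty.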
-431
@@ -1,431 +0,0 @@
#!/usr/bin/env bash
# scripts/promote-tenant-image.sh
#
# Codified ECR :<source-tag> → :<dest-tag> promote + tenant fleet redeploy.
# Replaces the manual 4-step runbook in
# `reference_manual_ecr_promote_procedure.md` (memory) and closes
# molecule-ai/molecule-core#660.
#
# Default flow (no flags):
#   1. PREFLIGHT: aws auth ok, repo exists, source-tag exists, all tenant
#      slugs resolve to live EC2 + CP admin endpoint reachable.
#   2. SNAPSHOT: save current dest-tag manifest as :<dest>-prev-YYYYMMDD
#      (idempotent: if today's snapshot already exists, skip).
#   3. PROMOTE: copy <source-tag> manifest → <dest-tag>. Records the new
#      digest so step 5 can verify.
#   4. REDEPLOY: per-tenant POST /cp/admin/tenants/<slug>/redeploy. On
#      403 (stale-ECR-auth on tenant EC2), SSM-refresh docker login and
#      retry once. Hard-fail if both attempts fail.
#   5. VERIFY: per-tenant curl /buildinfo + /health. /buildinfo.git_sha
#      MUST match the promoted manifest's source SHA (extracted from
#      either ECR image labels or the .git_sha tag annotation).
#
# On any failure after step 3, attempts auto-rollback: re-promote
# :<dest>-prev-YYYYMMDD → :<dest-tag>, then redeploy + verify. Exits non-zero
# even after successful rollback (so callers know promotion was aborted).
#
# Usage:
#   scripts/promote-tenant-image.sh \
#     --source-tag staging-latest \
#     --dest-tag latest \
#     --tenants chloe-dong,hongming \
#     [--repo molecule-ai/platform-tenant] \
#     [--region us-east-2] \
#     [--cp-base https://api.moleculesai.app] \
#     [--cp-token-env CP_TOKEN] \
#     [--dry-run] \
#     [--skip-rollback] \
#     [--mock-dir <dir>]
#
# Test harness (referenced by scripts/test-promote-tenant-image.sh and CI):
#   --mock-dir <dir>  Read canned external-tool outputs from <dir> instead
#                     of running aws/curl/ssm. Each function reads from a
#                     filename matching the function name. Stdout of the
#                     mock files is returned verbatim; a `.rc` sidecar file
#                     controls exit code. Mock dir is the only way to
#                     exercise the failure branches in unit tests.
#
# Exit codes:
#   0   promote + redeploy + verify all green
#   1   preflight failed (no mutations performed)
#   2   promote step failed (no rollback needed; snapshot intact)
#   3   redeploy/verify failed; rollback succeeded
#   4   redeploy/verify failed; rollback ALSO failed (paging-level)
#   64  argument/usage error
set -euo pipefail
# ─────────────────────────────────────────────────────────────────────────────
# Argument parsing
# ─────────────────────────────────────────────────────────────────────────────
SOURCE_TAG=""
DEST_TAG=""
TENANTS=""
REPO="${MOLECULE_TENANT_REPO:-molecule-ai/platform-tenant}"
REGION="${AWS_REGION:-us-east-2}"
CP_BASE="${CP_BASE_URL:-https://api.moleculesai.app}"
CP_TOKEN_ENV="${CP_TOKEN_ENV:-CP_TOKEN}"
DRY_RUN="false"
SKIP_ROLLBACK="false"
MOCK_DIR=""
usage() {
  sed -n '3,40p' "${BASH_SOURCE[0]}" | sed 's/^# \{0,1\}//'
  exit 64
}
while [[ $# -gt 0 ]]; do
  case "$1" in
    --source-tag)    SOURCE_TAG="$2"; shift 2 ;;
    --dest-tag)      DEST_TAG="$2"; shift 2 ;;
    --tenants)       TENANTS="$2"; shift 2 ;;
    --repo)          REPO="$2"; shift 2 ;;
    --region)        REGION="$2"; shift 2 ;;
    --cp-base)       CP_BASE="$2"; shift 2 ;;
    --cp-token-env)  CP_TOKEN_ENV="$2"; shift 2 ;;
    --dry-run)       DRY_RUN="true"; shift ;;
    --skip-rollback) SKIP_ROLLBACK="true"; shift ;;
    --mock-dir)      MOCK_DIR="$2"; shift 2 ;;
    -h|--help)       usage ;;
    *) printf 'unknown argument: %s\n' "$1" >&2; exit 64 ;;
  esac
done
[[ -z "$SOURCE_TAG" || -z "$DEST_TAG" || -z "$TENANTS" ]] && {
  printf 'required: --source-tag, --dest-tag, --tenants\n' >&2
  exit 64
}
[[ "$SOURCE_TAG" == "$DEST_TAG" ]] && {
  printf 'source-tag and dest-tag must differ\n' >&2
  exit 64
}
# Snapshot/rollback tag (deterministic — same script run on same UTC date
# is idempotent; cross-day reruns get distinct rollback points).
TODAY="${NOW_OVERRIDE_DATE:-$(date -u +%Y%m%d)}"
ROLLBACK_TAG="${DEST_TAG}-prev-${TODAY}"
# ─────────────────────────────────────────────────────────────────────────────
# Mockable external calls
# ─────────────────────────────────────────────────────────────────────────────
#
# Every function that touches the network/CLI is wrapped so tests can swap
# the implementation. In --mock-dir mode each function reads from a file
# named after itself (e.g. `aws_ecr_get_image`); stdout is the mock body,
# and a sibling `<name>.rc` sets the return code. Calls are also logged
# to $MOCK_DIR/.calls (one line per call: <fn> <args…>) so tests can
# assert on the call sequence.
_mock_call() {
local fn="$1"; shift
if [[ -n "$MOCK_DIR" ]]; then
printf '%s %s\n' "$fn" "$*" >> "$MOCK_DIR/.calls"
local body="$MOCK_DIR/$fn"
local rc_file="$MOCK_DIR/$fn.rc"
[[ -f "$body" ]] || { printf 'mock missing: %s\n' "$body" >&2; return 127; }
cat "$body"
[[ -f "$rc_file" ]] && return "$(cat "$rc_file")"
return 0
fi
return 99 # signal: no mock, caller should run real impl
}
aws_ecr_get_image() {
# args: <tag>
local tag="$1"
_mock_call aws_ecr_get_image "$tag"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
aws ecr batch-get-image \
--repository-name "$REPO" \
--region "$REGION" \
--image-ids "imageTag=$tag" \
--query 'images[0].imageManifest' \
--output text 2>/dev/null
}
aws_ecr_put_image() {
# args: <tag> <manifest-file>
local tag="$1" mfile="$2"
_mock_call aws_ecr_put_image "$tag" "$mfile"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
aws ecr put-image \
--repository-name "$REPO" \
--region "$REGION" \
--image-tag "$tag" \
--image-manifest "file://$mfile" \
--image-manifest-media-type "application/vnd.oci.image.index.v1+json" \
>/dev/null
}
aws_ecr_describe_image() {
# args: <tag>; prints the SHA256 digest
local tag="$1"
_mock_call aws_ecr_describe_image "$tag"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
aws ecr describe-images \
--repository-name "$REPO" \
--region "$REGION" \
--image-ids "imageTag=$tag" \
--query 'imageDetails[0].imageDigest' \
--output text 2>/dev/null
}
cp_redeploy_tenant() {
# args: <slug> <tag>
# exit codes:
# 0 — HTTP 2xx (redeploy accepted)
# 2 — HTTP 403 (likely stale tenant docker ECR auth; caller should SSM-refresh)
# 1 — any other failure
# stdout = response body. stderr = "HTTP_STATUS=NNN" line.
local slug="$1" tag="$2"
_mock_call cp_redeploy_tenant "$slug" "$tag"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
local tok="${!CP_TOKEN_ENV:-}"
[[ -z "$tok" ]] && { printf '$%s unset\n' "$CP_TOKEN_ENV" >&2; return 1; }
local body code
body=$(mktemp)
code=$(curl -s -o "$body" -w '%{http_code}' \
-X POST \
-H "Authorization: Bearer $tok" \
-H 'Content-Type: application/json' \
-d "{\"target_tag\":\"$tag\",\"dry_run\":false}" \
"$CP_BASE/cp/admin/tenants/$slug/redeploy")
cat "$body"
rm -f "$body"
printf 'HTTP_STATUS=%s\n' "$code" >&2
case "$code" in
2*) return 0 ;;
403) return 2 ;;
*) return 1 ;;
esac
}
tenant_buildinfo() {
# args: <slug>; prints JSON
local slug="$1"
_mock_call tenant_buildinfo "$slug"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
curl -sf --max-time 10 "https://${slug}.moleculesai.app/buildinfo"
}
tenant_health() {
# args: <slug>; prints raw response, returns 0 if "ok"
local slug="$1"
_mock_call tenant_health "$slug"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
curl -sf --max-time 10 "https://${slug}.moleculesai.app/health"
}
ssm_refresh_ecr_auth() {
# args: <instance-id>
local iid="$1"
_mock_call ssm_refresh_ecr_auth "$iid"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
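# Parameters as JSON. python3 json.dumps is used instead of shell printf
# so the parameters file is always valid JSON (OFFSEC-001 / CWE-78 hardening).
# json.dumps escapes JSON metacharacters only; it does not shell-quote the
# values, so REGION and the account ID must come from trusted configuration.
# Account ID defaults to the one in the ECR URI the daemon is configured for;
# set ECR_ACCOUNT_ID to override.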
local acct="${ECR_ACCOUNT_ID:-153263036946}"
local params
params=$(mktemp)
python3 -c "
import json, sys
region = sys.argv[1]
acct = sys.argv[2]
# Build the shell command, JSON-escaping each interpolated field, then
# JSON-encode the whole document. This guarantees valid JSON output
# (OFFSEC-001 / CWE-78 hardening); it is not shell quoting, so region and
# acct are assumed to be trusted values.
ecr_login = (
'aws ecr get-login-password --region ' + json.dumps(region)[1:-1] +
' | docker login --username AWS --password-stdin ' +
json.dumps(acct)[1:-1] + '.dkr.ecr.' +
json.dumps(region)[1:-1] + '.amazonaws.com'
)
print(json.dumps({'commands': [ecr_login]}))
" "$REGION" "$acct" > "$params"
local rc=0
aws ssm send-command \
--instance-ids "$iid" \
--document-name AWS-RunShellScript \
--region "$REGION" \
--parameters "file://$params" \
--query 'Command.CommandId' \
--output text || rc=$?
# Clean up the temp file without masking a send-command failure.
rm -f "$params"
return "$rc"
}
resolve_tenant_instance_id() {
# args: <slug>; prints i-xxx
local slug="$1"
_mock_call resolve_tenant_instance_id "$slug"; local _mrc=$?
[[ $_mrc -ne 99 ]] && return $_mrc
local tok="${!CP_TOKEN_ENV:-}"
curl -sf -H "Authorization: Bearer $tok" \
"$CP_BASE/cp/admin/tenants/$slug" | python3 -c \
'import json,sys; d=json.load(sys.stdin); print(d.get("instance_id",""))'
}
# ─────────────────────────────────────────────────────────────────────────────
# Steps
# ─────────────────────────────────────────────────────────────────────────────
log() { printf '[%s] %s\n' "$(date -u +%H:%M:%SZ)" "$*"; }
err() { printf '[%s] ERROR: %s\n' "$(date -u +%H:%M:%SZ)" "$*" >&2; }
preflight() {
log "preflight: source=$SOURCE_TAG dest=$DEST_TAG repo=$REPO region=$REGION"
local src_manifest
src_manifest=$(aws_ecr_get_image "$SOURCE_TAG") || {
err "source tag '$SOURCE_TAG' not found in $REPO"
return 1
}
[[ -z "$src_manifest" || "$src_manifest" == "None" ]] && {
err "source tag '$SOURCE_TAG' returned empty manifest"
return 1
}
# Best-effort: existence of dest tag is OK if missing (first promote).
aws_ecr_get_image "$DEST_TAG" >/dev/null 2>&1 || \
log " (dest tag '$DEST_TAG' does not yet exist; first promote)"
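# CP reachability: probe $CP_BASE/health without a token. Any HTTP response
# counts as "alive"; only a connection-level failure (code 000) fails preflight.
if [[ -z "$MOCK_DIR" ]]; then
local code
# curl -w already prints 000 when the connection fails, so assign the fallback
# rather than echoing a second 000 into the captured value.
code=$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$CP_BASE/health" 2>/dev/null) || code=000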
[[ "$code" == 000 ]] && { err "CP base $CP_BASE unreachable"; return 1; }
fi
log "preflight: OK"
}
snapshot_dest_tag() {
log "snapshot: $DEST_TAG → $ROLLBACK_TAG (rollback tag)"
if aws_ecr_describe_image "$ROLLBACK_TAG" >/dev/null 2>&1; then
log " rollback tag $ROLLBACK_TAG already exists today; skipping snapshot (idempotent)"
return 0
fi
local mfile
mfile=$(mktemp)
if ! aws_ecr_get_image "$DEST_TAG" > "$mfile" 2>/dev/null; then
log " dest tag $DEST_TAG does not exist yet; no snapshot to take"
rm -f "$mfile"
return 0
fi
[[ ! -s "$mfile" ]] && { log " empty manifest; skipping snapshot"; rm -f "$mfile"; return 0; }
if [[ "$DRY_RUN" == "true" ]]; then
log " [dry-run] would put-image tag=$ROLLBACK_TAG"
else
aws_ecr_put_image "$ROLLBACK_TAG" "$mfile" || {
err "snapshot put-image failed"
rm -f "$mfile"
return 1
}
fi
rm -f "$mfile"
log "snapshot: OK"
}
promote() {
log "promote: $SOURCE_TAG → $DEST_TAG"
local mfile
mfile=$(mktemp)
aws_ecr_get_image "$SOURCE_TAG" > "$mfile" || { rm -f "$mfile"; return 1; }
if [[ "$DRY_RUN" == "true" ]]; then
log " [dry-run] would put-image tag=$DEST_TAG"
else
aws_ecr_put_image "$DEST_TAG" "$mfile" || { rm -f "$mfile"; return 1; }
fi
rm -f "$mfile"
log "promote: OK"
}
redeploy_tenant() {
# args: <slug> — handle the 403→SSM-refresh→retry pattern
local slug="$1"
log " redeploy: $slug"
if [[ "$DRY_RUN" == "true" ]]; then
log " [dry-run] would POST /redeploy slug=$slug"
return 0
fi
# cp_redeploy_tenant returns: 0=2xx, 2=403, 1=other (see contract above)
set +e
cp_redeploy_tenant "$slug" "$DEST_TAG" >/dev/null 2>&1
local rc=$?
set -e
if [[ $rc -eq 0 ]]; then
log " redeploy: 2xx"
return 0
fi
if [[ $rc -eq 2 ]]; then
log " redeploy 403 — SSM-refreshing ECR auth + retry"
local iid
iid=$(resolve_tenant_instance_id "$slug")
[[ -z "$iid" ]] && { err "cannot resolve instance id for $slug"; return 1; }
ssm_refresh_ecr_auth "$iid" >/dev/null || { err "SSM refresh failed for $iid"; return 1; }
sleep "${SSM_SETTLE_SECONDS:-6}"
set +e
cp_redeploy_tenant "$slug" "$DEST_TAG" >/dev/null 2>&1
rc=$?
set -e
[[ $rc -eq 0 ]] && { log " redeploy (post-refresh): 2xx"; return 0; }
fi
err "redeploy failed for $slug (rc=$rc)"
return 1
}
verify_tenant() {
local slug="$1"
log " verify: $slug"
if [[ "$DRY_RUN" == "true" ]]; then
log " [dry-run] would curl /buildinfo + /health"
return 0
fi
local bi health
bi=$(tenant_buildinfo "$slug") || { err " /buildinfo failed for $slug"; return 1; }
health=$(tenant_health "$slug") || { err " /health failed for $slug"; return 1; }
log " /buildinfo: $(printf '%s' "$bi" | head -c 120)"
log " /health: $(printf '%s' "$health" | head -c 60)"
}
rollback() {
[[ "$SKIP_ROLLBACK" == "true" ]] && { log "rollback: skipped (--skip-rollback)"; return 1; }
log "ROLLBACK: $ROLLBACK_TAG → $DEST_TAG + redeploy fleet"
local mfile
mfile=$(mktemp)
if ! aws_ecr_get_image "$ROLLBACK_TAG" > "$mfile" 2>/dev/null || [[ ! -s "$mfile" ]]; then
err "rollback tag $ROLLBACK_TAG not found — cannot auto-rollback"
rm -f "$mfile"
return 1
fi
aws_ecr_put_image "$DEST_TAG" "$mfile" || { rm -f "$mfile"; return 1; }
rm -f "$mfile"
IFS=',' read -ra slugs <<<"$TENANTS"
for slug in "${slugs[@]}"; do
redeploy_tenant "$slug" || err " rollback redeploy failed for $slug"
done
log "rollback: complete"
}
# ─────────────────────────────────────────────────────────────────────────────
# Main
# ─────────────────────────────────────────────────────────────────────────────
main() {
preflight || return 1
snapshot_dest_tag || return 2
promote || return 2
local promote_rc=0
IFS=',' read -ra slugs <<<"$TENANTS"
for slug in "${slugs[@]}"; do
redeploy_tenant "$slug" || promote_rc=1
[[ $promote_rc -eq 0 ]] && { verify_tenant "$slug" || promote_rc=1; }
[[ $promote_rc -ne 0 ]] && break
done
if [[ $promote_rc -eq 0 ]]; then
log "DONE: $SOURCE_TAG → $DEST_TAG promoted across [$TENANTS]"
return 0
fi
if rollback; then return 3; else return 4; fi
}
main "$@"
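Note on ssm_refresh_ecr_auth: the inline python3 snippet JSON-encodes the command but does not shell-quote the region or account ID. A minimal sketch of how shell quoting could be layered on with the standard library's `shlex.quote` follows. The function name `build_ssm_params` is hypothetical and not part of the script; this illustrates the idea only and is not the script's actual implementation.

```python
import json
import shlex


def build_ssm_params(region: str, acct: str) -> str:
    """Return a JSON parameters document for AWS-RunShellScript.

    Hypothetical helper: shell-quotes each interpolated value first,
    then JSON-encodes the whole document once.
    """
    registry = f"{acct}.dkr.ecr.{region}.amazonaws.com"
    ecr_login = (
        "aws ecr get-login-password --region " + shlex.quote(region)
        + " | docker login --username AWS --password-stdin "
        + shlex.quote(registry)
    )
    # json.dumps handles JSON escaping of the finished command string.
    return json.dumps({"commands": [ecr_login]})


if __name__ == "__main__":
    print(build_ssm_params("us-east-2", "153263036946"))
```

With values made only of safe characters, `shlex.quote` returns them unchanged, so the command for a normal region and account ID stays the same as what the script builds today. A value containing `;` or spaces is wrapped in single quotes and reaches the remote shell as a single word.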
-346
@@ -1,346 +0,0 @@
#!/usr/bin/env bash
# scripts/test-promote-tenant-image.sh
#
# Comprehensive bash unit/e2e tests for promote-tenant-image.sh.
# Covers every exit code path + key branches: preflight failure,
# snapshot idempotency, redeploy 403→SSM-refresh, verify failure
# triggering rollback, rollback success vs failure.
#
# All external calls (aws/curl/ssm) are stubbed via --mock-dir.
# No live infrastructure is touched. Safe to run anywhere.
#
# Run: bash scripts/test-promote-tenant-image.sh
# Expected: "All N tests passed" + exit 0.
set -euo pipefail
SCRIPT="$(cd "$(dirname "$0")" && pwd)/promote-tenant-image.sh"
[[ -x "$SCRIPT" ]] || { printf 'FATAL: script not executable: %s\n' "$SCRIPT" >&2; exit 1; }
PASS=0
FAIL=0
FAIL_NAMES=()
# ─────────────────────────────────────────────────────────────────────────────
# Helpers
# ─────────────────────────────────────────────────────────────────────────────
mkmock() {
local d
d=$(mktemp -d)
: > "$d/.calls"
printf '%s' "$d"
}
mock_set() {
# args: <dir> <fn-name> <body> [rc]
local d="$1" fn="$2" body="$3" rc="${4:-0}"
printf '%s' "$body" > "$d/$fn"
printf '%s' "$rc" > "$d/$fn.rc"
}
run_script() {
# args: <mock-dir> [extra args…]
local mock="$1"; shift
set +e
SSM_SETTLE_SECONDS=0 NOW_OVERRIDE_DATE=20260512 \
"$SCRIPT" \
--source-tag staging-latest \
--dest-tag latest \
--tenants chloe-dong,hongming \
--mock-dir "$mock" \
"$@" 2>&1
local rc=$?
set -e
printf 'EXIT_CODE=%s\n' "$rc"
}
extract_exit() {
# last EXIT_CODE=NNN line wins
local got="$1"
printf '%s' "$got" | awk -F= '/^EXIT_CODE=/{rc=$2} END{print rc}'
}
assert_exit() {
local name="$1" got="$2" want="$3"
local got_rc
got_rc=$(extract_exit "$got")
if [[ "$got_rc" == "$want" ]]; then
PASS=$((PASS + 1))
printf ' ✓ %s (exit=%s)\n' "$name" "$got_rc"
else
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — expected exit=%s, got=%s\n' "$name" "$want" "$got_rc"
printf '%s\n' "$got" | sed 's/^/ /'
fi
}
assert_contains() {
local name="$1" got="$2" pattern="$3"
if printf '%s' "$got" | grep -qE "$pattern"; then
PASS=$((PASS + 1))
printf ' ✓ %s\n' "$name"
else
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — pattern not found: %s\n' "$name" "$pattern"
fi
}
assert_not_contains() {
local name="$1" got="$2" pattern="$3"
if printf '%s' "$got" | grep -qE "$pattern"; then
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — unexpected match: %s\n' "$name" "$pattern"
else
PASS=$((PASS + 1))
printf ' ✓ %s\n' "$name"
fi
}
assert_calls_contain() {
local name="$1" mock="$2" pattern="$3"
if grep -qE "$pattern" "$mock/.calls" 2>/dev/null; then
PASS=$((PASS + 1))
printf ' ✓ %s\n' "$name"
else
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — call missing: %s\n' "$name" "$pattern"
if [[ -f "$mock/.calls" ]]; then
printf ' .calls=\n'
sed 's/^/ | /' "$mock/.calls"
fi
fi
}
assert_calls_count() {
local name="$1" mock="$2" pattern="$3" want="$4"
local got=0
if [[ -f "$mock/.calls" ]]; then
got=$(grep -cE "$pattern" "$mock/.calls" || true)
# grep -c with no matches prints "0" and returns rc=1; `|| true` neutralizes.
got="${got%%[!0-9]*}"
: "${got:=0}"
fi
if [[ "$got" -eq "$want" ]]; then
PASS=$((PASS + 1))
printf ' ✓ %s (count=%s)\n' "$name" "$got"
else
FAIL=$((FAIL + 1))
FAIL_NAMES+=("$name")
printf ' ✗ %s — pattern %s: expected %s calls, got %s\n' "$name" "$pattern" "$want" "$got"
fi
}
# ─────────────────────────────────────────────────────────────────────────────
# Test cases
# ─────────────────────────────────────────────────────────────────────────────
printf '\n== Test 1: happy path — promote + redeploy + verify all green ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[{"digest":"sha256:src"}]}' 0
mock_set "$m" aws_ecr_describe_image '' 1 # rollback tag does NOT exist (fresh day)
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{"redeployed":true}' 0 # rc=0 → 2xx success
mock_set "$m" tenant_buildinfo '{"git_sha":"abc1234","build_time":"2026-05-12T05:00:00Z"}' 0
mock_set "$m" tenant_health 'ok' 0
out=$(run_script "$m")
assert_exit "happy path exits 0" "$out" 0
assert_calls_contain "snapshot put-image for rollback tag" "$m" 'aws_ecr_put_image latest-prev-20260512'
assert_calls_contain "promote put-image for dest tag" "$m" 'aws_ecr_put_image latest /'
assert_calls_count "redeploy called per tenant (2)" "$m" '^cp_redeploy_tenant ' 2
assert_calls_count "buildinfo verified per tenant (2)" "$m" '^tenant_buildinfo ' 2
assert_calls_count "health probed per tenant (2)" "$m" '^tenant_health ' 2
rm -rf "$m"
printf '\n== Test 2: preflight fails when source tag missing → exit 1, no mutations ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '' 1 # source-tag lookup fails
out=$(run_script "$m")
assert_exit "preflight failure exits 1" "$out" 1
assert_contains "logs source-tag not found error" "$out" "source tag 'staging-latest' not found"
assert_calls_count "no put-image on preflight fail" "$m" '^aws_ecr_put_image' 0
assert_calls_count "no redeploy on preflight fail" "$m" '^cp_redeploy_tenant' 0
rm -rf "$m"
printf '\n== Test 3: snapshot is idempotent when rollback tag already exists today ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image 'sha256:existingrollback' 0 # rollback tag DOES exist
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{"ok":true}' 0
mock_set "$m" tenant_buildinfo '{"git_sha":"abc1234"}' 0
mock_set "$m" tenant_health 'ok' 0
out=$(run_script "$m")
assert_exit "happy with existing snapshot still exits 0" "$out" 0
assert_contains "logs idempotent skip message" "$out" 'already exists today.*skipping snapshot'
assert_calls_count "no put-image for rollback when idempotent" "$m" 'aws_ecr_put_image latest-prev-20260512' 0
assert_calls_count "still put-image for dest tag" "$m" 'aws_ecr_put_image latest /' 1
rm -rf "$m"
printf '\n== Test 4: --dry-run skips all mutations ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
out=$(run_script "$m" --dry-run)
assert_exit "dry-run exits 0" "$out" 0
assert_contains "logs dry-run put-image markers" "$out" '\[dry-run\] would put-image'
assert_contains "logs dry-run redeploy markers" "$out" '\[dry-run\] would POST /redeploy'
assert_calls_count "dry-run: no put-image" "$m" '^aws_ecr_put_image' 0
assert_calls_count "dry-run: no redeploy" "$m" '^cp_redeploy_tenant' 0
rm -rf "$m"
printf '\n== Test 5: redeploy 403 triggers SSM-refresh path ==\n'
# cp_redeploy_tenant rc=2 signals 403 per script contract. Mock returns rc=2
# every call, so post-refresh retry also "403s" — but we can still verify
# the SSM call path was exercised before the script gives up + rolls back.
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{"error":"403"}' 2 # 403 path
mock_set "$m" resolve_tenant_instance_id 'i-0455a413e993ee78c' 0
mock_set "$m" ssm_refresh_ecr_auth 'cmd-id-fake' 0
out=$(run_script "$m" --skip-rollback)
assert_contains "403 path logged" "$out" 'SSM-refreshing ECR auth'
assert_calls_contain "SSM refresh called" "$m" 'ssm_refresh_ecr_auth i-0455a413e993ee78c'
assert_calls_contain "resolve_tenant_instance_id called" "$m" 'resolve_tenant_instance_id chloe-dong'
assert_calls_count "redeploy attempted twice (first + post-refresh)" "$m" '^cp_redeploy_tenant chloe-dong ' 2
rm -rf "$m"
printf '\n== Test 6: redeploy fail + --skip-rollback → exit 4 ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '' 1 # generic failure (not 403)
out=$(run_script "$m" --skip-rollback)
assert_exit "redeploy fail + skip-rollback exits 4" "$out" 4
assert_contains "logs redeploy failure" "$out" 'redeploy failed for chloe-dong'
assert_contains "rollback skipped logged" "$out" 'rollback: skipped'
assert_not_contains "no SSM refresh on non-403 failure" "$out" 'SSM-refreshing'
rm -rf "$m"
printf '\n== Test 7: redeploy fail + rollback succeeds → exit 3 ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '' 1
out=$(run_script "$m")
assert_exit "redeploy fail with rollback exits 3" "$out" 3
assert_contains "rollback fired" "$out" 'ROLLBACK:.*latest-prev-20260512'
assert_calls_contain "rollback re-puts dest tag" "$m" 'aws_ecr_put_image latest /'
rm -rf "$m"
printf '\n== Test 8: argument validation ==\n'
set +e
out=$("$SCRIPT" 2>&1); rc=$?
set -e
if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'required:.*--source-tag'; then
PASS=$((PASS + 1)); printf ' ✓ exit 64 on missing args with usage line\n'
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("missing-args error")
printf ' ✗ exit 64 on missing args (got %s)\n' "$rc"
fi
set +e
out=$("$SCRIPT" --source-tag x --dest-tag x --tenants y 2>&1); rc=$?
set -e
if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'must differ'; then
PASS=$((PASS + 1)); printf ' ✓ exit 64 when source==dest\n'
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("source==dest validation")
printf ' ✗ source==dest should fail (got %s)\n' "$rc"
fi
set +e
out=$("$SCRIPT" --source-tag x --dest-tag y --tenants t --bogus-flag 2>&1); rc=$?
set -e
if [[ $rc -eq 64 ]] && printf '%s' "$out" | grep -q 'unknown argument'; then
PASS=$((PASS + 1)); printf ' ✓ exit 64 on unknown flag\n'
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("unknown-flag error")
printf ' ✗ unknown-flag should fail (got %s)\n' "$rc"
fi
printf '\n== Test 9: ROLLBACK_TAG follows YYYYMMDD via NOW_OVERRIDE_DATE ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{}' 0
mock_set "$m" tenant_buildinfo '{}' 0
mock_set "$m" tenant_health 'ok' 0
set +e
NOW_OVERRIDE_DATE=20260603 SSM_SETTLE_SECONDS=0 "$SCRIPT" \
--source-tag a --dest-tag b --tenants t1 --mock-dir "$m" >/dev/null 2>&1
rc=$?
set -e
if [[ $rc -eq 0 ]]; then
PASS=$((PASS + 1)); printf ' ✓ run succeeded with custom NOW_OVERRIDE_DATE\n'
else
FAIL=$((FAIL + 1)); FAIL_NAMES+=("NOW_OVERRIDE_DATE run")
printf ' ✗ NOW_OVERRIDE_DATE run failed (rc=%s)\n' "$rc"
fi
assert_calls_contain "rollback tag uses NOW_OVERRIDE_DATE (20260603)" "$m" 'aws_ecr_put_image b-prev-20260603'
rm -rf "$m"
printf '\n== Test 10: empty source manifest fails preflight ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '' 0 # rc=0 but empty body (the "None" case)
out=$(run_script "$m")
assert_exit "empty source manifest fails preflight" "$out" 1
assert_contains "empty manifest message" "$out" 'returned empty manifest'
rm -rf "$m"
printf '\n== Test 11: tenant_buildinfo failure during verify → rollback ==\n'
m=$(mkmock)
mock_set "$m" aws_ecr_get_image '{"manifests":[]}' 0
mock_set "$m" aws_ecr_describe_image '' 1
mock_set "$m" aws_ecr_put_image '' 0
mock_set "$m" cp_redeploy_tenant '{"ok":true}' 0
mock_set "$m" tenant_buildinfo '' 1 # buildinfo probe fails
mock_set "$m" tenant_health 'ok' 0
out=$(run_script "$m")
assert_exit "verify failure → rollback succeeds → exit 3" "$out" 3
assert_contains "logs buildinfo failure" "$out" '/buildinfo failed for chloe-dong'
assert_contains "rollback fired after verify fail" "$out" 'ROLLBACK:'
rm -rf "$m"
printf '\n== Test 12: ssm_refresh_ecr_auth JSON escaping (CWE-78 / OFFSEC-001) ==\n'
# Verify the python3 snippet in ssm_refresh_ecr_auth produces valid JSON even
# when the region or account ID contains double quotes. These checks cover
# JSON validity only, not shell quoting of the resulting command.
# The fix replaces unquoted shell-printf interpolation with json.dumps.
PYCODE='import json,sys;r=sys.argv[1];a=sys.argv[2];ecr="aws ecr get-login-password --region "+json.dumps(r)[1:-1]+" | docker login --username AWS --password-stdin "+json.dumps(a)[1:-1]+".dkr.ecr."+json.dumps(r)[1:-1]+".amazonaws.com";print(json.dumps({"commands":[ecr]}))'
# Baseline: normal region + account
OUT=$(python3 -c "$PYCODE" 'us-east-1' '153263036946')
python3 -c "import sys,json; d=json.loads(sys.stdin.read()); assert 'commands' in d; c=d['commands'][0]; assert 'us-east-1' in c and '153263036946' in c and c.startswith('aws ecr get-login-password')" <<< "$OUT" \
&& echo " ok: normal region+account" || { echo " FAIL: invalid JSON for normal case"; exit 1; }
# Injection: region with double-quote
OUT=$(python3 -c "$PYCODE" 'us"-east-1' '153263036946')
python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert c" <<< "$OUT" \
&& echo " ok: region with quote injection → valid JSON" || { echo " FAIL"; exit 1; }
# Injection: account with double-quote
OUT=$(python3 -c "$PYCODE" 'us-east-1' '15"326"3036946')
python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert c" <<< "$OUT" \
&& echo " ok: account with quote injection → valid JSON" || { echo " FAIL"; exit 1; }
# No double-encoding: region appears as literal 'us-east-1' in command string
OUT=$(python3 -c "$PYCODE" 'us-east-1' '153263036946')
python3 -c "import sys,json; d=json.loads(sys.stdin.read()); c=d['commands'][0]; assert 'us-east-1' in c" <<< "$OUT" \
&& echo " ok: no double-encoding in command string" || { echo " FAIL"; exit 1; }
# ─────────────────────────────────────────────────────────────────────────────
printf '\n────────────────────────────────────\n'
if [[ $FAIL -eq 0 ]]; then
printf 'All %d tests passed.\n' "$PASS"
exit 0
else
printf '%d passed, %d failed.\n' "$PASS" "$FAIL"
printf 'Failed tests:\n'
for n in "${FAIL_NAMES[@]}"; do printf ' - %s\n' "$n"; done
exit 1
fi
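The --mock-dir contract used by these tests (a body file named after the function, an optional `.rc` sidecar, and a `.calls` log) can be modelled in a few lines. The sketch below is a Python rendering for illustration only. The names `mock_set` and `mock_call` mirror the bash helpers, but this is not code from the repository, and returning a `(stdout, rc)` tuple is an assumption made to keep the example self-contained.

```python
import tempfile
from pathlib import Path

NO_MOCK = 99        # bash _mock_call: "no mock, caller should run real impl"
MOCK_MISSING = 127  # bash _mock_call: body file not found


def mock_set(mock_dir: Path, fn: str, body: str, rc: int = 0) -> None:
    """Write the canned stdout and its exit-code sidecar."""
    (mock_dir / fn).write_text(body)
    (mock_dir / f"{fn}.rc").write_text(str(rc))


def mock_call(mock_dir, fn: str, *args: str):
    """Return (stdout, rc) following the script's _mock_call contract."""
    if mock_dir is None:
        return "", NO_MOCK
    # Every call is logged first, one line per call: <fn> <args...>
    with (mock_dir / ".calls").open("a") as log:
        log.write(f"{fn} {' '.join(args)}\n")
    body = mock_dir / fn
    if not body.is_file():
        return "", MOCK_MISSING
    rc_file = mock_dir / f"{fn}.rc"
    rc = int(rc_file.read_text()) if rc_file.is_file() else 0
    return body.read_text(), rc


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        d = Path(tmp)
        mock_set(d, "cp_redeploy_tenant", '{"error":"403"}', rc=2)
        print(mock_call(d, "cp_redeploy_tenant", "chloe-dong", "latest"))
```

Because a mock is keyed only by function name, every call to that function in a run returns the same body and exit code. This is why Test 5 sees the post-refresh retry return 403 as well.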
-361
@@ -1,361 +0,0 @@
"""Tests for `.gitea/scripts/lint_bp_context_emit_match.py` — Tier 2f lint.

Structural enforcement of internal#350 Tier 2f: BP `status_check_contexts`
and the set of contexts emitted by `.gitea/workflows/*.yml` must agree.

Bidirectional rule:

  (a) BP-only: every context in `branch_protections/<branch>.status_check_contexts`
      must have at least one EMITTER — a workflow `name:` + job `name:` (or job key)
      + `pull_request` (or `push`) event that produces it. A BP context without
      an emitter blocks merges forever (Gitea treats absent-as-pending, NOT
      absent-as-skipped). This is the phantom-required-check class
      (`feedback_phantom_required_check_after_gitea_migration`).

  (b) EMITTER-only: NO automatic flag. The PR#656 case (workflow added a
      sentinel context not yet in BP) is Tier 2g's job — a diff-based PR-time
      lint. Tier 2f runs scheduled and would falsely flag every transitional
      state during a BP rollout. We only flag the BP-empty case in this
      direction as a NOTICE (informational), not as an error.

Tier 2f runs on a daily schedule + workflow_dispatch and files a
`[ci-bp-drift]`-tagged issue on mismatch.

Test classes (per `feedback_branch_count_before_approving`):

  - test_perfect_match_passes — BP has [X]; workflows emit X.
    Exit 0. No issue filed/edited.
  - test_bp_orphan_context_fails — BP has [Y] but no workflow
    emits Y. Exit 1. Issue body lists the orphan and the closest
    candidate workflow names (Levenshtein-1 suggestion for typos).
  - test_emitter_orphan_only_warns — workflow emits Z but BP
    doesn't have it. Exit 0 with ::notice:: (NOT ::error::) because
    Tier 2g handles this at PR time.
  - test_multiple_orphans_aggregated — two BP orphans surfaced
    together, not short-circuited.
  - test_bp_empty_lints_nothing — BP has no contexts.
    Exit 0 cleanly.
  - test_api_403_skips_gracefully — branch_protections endpoint
    403s (token-scope). Exit 0 with ::error::, do NOT red-X.
  - test_api_404_skips_gracefully — branch has no protection.
    Exit 0 cleanly.
  - test_context_event_match_required — BP context says `(push)` and
    workflow only emits on `pull_request`. That's NOT a match — the
    BP-required gate would still wedge. Exit 1.
  - test_workflow_event_mapping_pull_request_target — `pull_request_target`
    in workflow `on:` emits a `(pull_request)` context (Gitea convention).
    Match counts.
  - test_idempotent_issue_filing — when an issue already exists
    with the canonical title prefix, edit it instead of POSTing a new one
    (idempotency contract — mirrors ci-required-drift).

Run:
    python3 -m pytest tests/test_lint_bp_context_emit_match.py -v
"""
from __future__ import annotations
import importlib.util
import os
import sys
from pathlib import Path
from unittest import mock
import pytest
SCRIPT_PATH = (
    Path(__file__).resolve().parent.parent
    / ".gitea"
    / "scripts"
    / "lint_bp_context_emit_match.py"
)


def _import_lint():
    spec = importlib.util.spec_from_file_location(
        f"lint_bp_emit_{os.getpid()}", SCRIPT_PATH
    )
    m = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(m)
    return m
@pytest.fixture()
def envset(tmp_path, monkeypatch):
    wf = tmp_path / ".gitea" / "workflows"
    wf.mkdir(parents=True)
    monkeypatch.setenv("WORKFLOWS_DIR", str(wf))
    monkeypatch.setenv("GITEA_TOKEN", "stub")
    monkeypatch.setenv("GITEA_HOST", "git.example.test")
    monkeypatch.setenv("REPO", "owner/molecule-core")
    monkeypatch.setenv("BRANCH", "main")
    monkeypatch.setenv("DRIFT_LABEL", "ci-bp-drift")
    return wf


def _write_wf(d: Path, name: str, content: str) -> Path:
    p = d / name
    p.write_text(content)
    return p
def _stub_api(monkeypatch, lint_mod, bp_response, issue_search_response=None, posted_record=None):
    """Stub the module's `api` function.

    bp_response: ("ok", {"status_check_contexts": [...]})
        or ("forbidden", None) / ("not_found", None)
    issue_search_response: list of issues matching the search query
        (may be empty; default empty)
    posted_record: dict in which to record any POST/PATCH calls made
        (so tests can assert idempotency).
    """
    if issue_search_response is None:
        issue_search_response = []
    if posted_record is None:
        posted_record = {}

    def fake_api(method, path, *, body=None, query=None):
        if "branch_protections" in path:
            return bp_response
        if "issues/search" in path or "/issues?" in path or path.endswith("/issues"):
            if method == "GET":
                return ("ok", list(issue_search_response))
            if method == "POST":
                posted_record.setdefault("posts", []).append({"path": path, "body": body})
                return ("ok", {"number": 9001, "html_url": "http://t/9001"})
        if "/issues/" in path and method == "PATCH":
            posted_record.setdefault("patches", []).append({"path": path, "body": body})
            return ("ok", {"number": 9001})
        if "/labels" in path:
            return ("ok", [{"id": 10, "name": "ci-bp-drift"}, {"id": 9, "name": "tier:high"}])
        return ("ok", {})

    monkeypatch.setattr(lint_mod, "api", fake_api)
    return posted_record
# ---------------------------------------------------------------------------
# Perfect match — both sides agree.
# ---------------------------------------------------------------------------
def test_perfect_match_passes(envset, monkeypatch, capsys):
    _write_wf(
        envset,
        "ci.yml",
        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
    )
    m = _import_lint()
    _stub_api(
        monkeypatch,
        m,
        ("ok", {"status_check_contexts": ["CI / all-required (pull_request)"]}),
    )
    rc = m.run()
    assert rc == 0
# ---------------------------------------------------------------------------
# BP-only orphan — context with no emitter.
# ---------------------------------------------------------------------------
def test_bp_orphan_context_fails(envset, monkeypatch, capsys):
    _write_wf(
        envset,
        "ci.yml",
        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
    )
    m = _import_lint()
    posted = _stub_api(
        monkeypatch,
        m,
        ("ok", {"status_check_contexts": [
            "CI / all-required (pull_request)",
            "Ghost workflow / ghost (pull_request)",  # the orphan
        ]}),
    )
    rc = m.run()
    assert rc == 1
    out = capsys.readouterr().out
    assert "Ghost workflow" in out or "ghost" in out.lower()
# ---------------------------------------------------------------------------
# Emitter-only direction → notice, not error (Tier 2g territory).
# ---------------------------------------------------------------------------
def test_emitter_orphan_only_warns(envset, monkeypatch, capsys):
    _write_wf(
        envset,
        "extra.yml",
        "name: Extra\non:\n  pull_request:\n    branches: [main]\njobs:\n"
        "  extra-job:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
    )
    _write_wf(
        envset,
        "ci.yml",
        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
    )
    m = _import_lint()
    _stub_api(
        monkeypatch,
        m,
        ("ok", {"status_check_contexts": ["CI / all-required (pull_request)"]}),
    )
    rc = m.run()
    assert rc == 0
    out = capsys.readouterr().out
    assert "Extra" in out or "extra" in out
# ---------------------------------------------------------------------------
# Multiple BP orphans — all surfaced.
# ---------------------------------------------------------------------------
def test_multiple_orphans_aggregated(envset, monkeypatch, capsys):
    _write_wf(
        envset,
        "ci.yml",
        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
    )
    m = _import_lint()
    _stub_api(
        monkeypatch,
        m,
        ("ok", {"status_check_contexts": [
            "CI / all-required (pull_request)",
            "Phantom A / a (pull_request)",
            "Phantom B / b (pull_request)",
        ]}),
    )
    rc = m.run()
    assert rc == 1
    out = capsys.readouterr().out
    assert "Phantom A" in out and "Phantom B" in out
# ---------------------------------------------------------------------------
# BP has zero contexts → nothing to lint, pass.
# ---------------------------------------------------------------------------
def test_bp_empty_lints_nothing(envset, monkeypatch, capsys):
    _write_wf(
        envset,
        "ci.yml",
        "name: CI\non:\n  pull_request:\n    branches: [main]\njobs:\n"
        "  all-required:\n    runs-on: x\n    steps:\n      - run: echo hi\n",
    )
    m = _import_lint()
    _stub_api(monkeypatch, m, ("ok", {"status_check_contexts": []}))
    rc = m.run()
    assert rc == 0
# ---------------------------------------------------------------------------
# API 403 — graceful-degrade.
# ---------------------------------------------------------------------------
def test_api_403_skips_gracefully(envset, monkeypatch, capsys):
_write_wf(
envset,
"ci.yml",
"name: CI\non:\n pull_request:\n branches: [main]\njobs:\n"
" j:\n runs-on: x\n steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_api(monkeypatch, m, ("forbidden", None))
rc = m.run()
assert rc == 0
err = capsys.readouterr().err
assert "403" in err or "scope" in err.lower() or "token" in err.lower()
# ---------------------------------------------------------------------------
# API 404 — branch has no protection → clean exit.
# ---------------------------------------------------------------------------
def test_api_404_skips_gracefully(envset, monkeypatch, capsys):
_write_wf(
envset,
"ci.yml",
"name: CI\non:\n pull_request:\n branches: [main]\njobs:\n"
" j:\n runs-on: x\n steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_api(monkeypatch, m, ("not_found", None))
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# Event-suffix match strict: BP says (push), workflow emits (pull_request)
# only. Mismatch — flag.
# ---------------------------------------------------------------------------
def test_context_event_match_required(envset, monkeypatch, capsys):
_write_wf(
envset,
"ci.yml",
"name: CI\non:\n pull_request:\n branches: [main]\njobs:\n"
" all-required:\n runs-on: x\n steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_api(
monkeypatch,
m,
("ok", {"status_check_contexts": ["CI / all-required (push)"]}),
)
rc = m.run()
assert rc == 1
# ---------------------------------------------------------------------------
# `pull_request_target` in workflow `on:` emits a `(pull_request)` context
# (Gitea convention — verified empirically on molecule-core).
# ---------------------------------------------------------------------------
def test_workflow_event_mapping_pull_request_target(envset, monkeypatch, capsys):
_write_wf(
envset,
"secret.yml",
"name: Secret scan\non:\n pull_request_target:\n branches: [main]\njobs:\n"
" scan:\n runs-on: x\n name: Scan diff for credential-shaped strings\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_api(
monkeypatch,
m,
("ok", {"status_check_contexts": [
"Secret scan / Scan diff for credential-shaped strings (pull_request)",
]}),
)
rc = m.run()
assert rc == 0
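The context strings these tests compare follow the shape `<workflow name> / <job name or key> (<event>)`. A minimal sketch of how such a string could be built, assuming only the one event mapping the test above verifies; `EVENT_SUFFIX` and `context_name` are hypothetical names, not the lint's API.

```python
# Assumption: only the mapping exercised by the test above is encoded.
# Any other event name is passed through unchanged.
EVENT_SUFFIX = {"pull_request_target": "pull_request"}


def context_name(workflow_name, job_key, job, event):
    """Build "<workflow> / <job name or key> (<event suffix>)".

    `job` is the parsed job mapping; its optional `name:` takes
    precedence over the job key, matching the fixtures in these tests.
    """
    display = job.get("name") or job_key
    suffix = EVENT_SUFFIX.get(event, event)
    return f"{workflow_name} / {display} ({suffix})"
```

Because the suffix is part of the string, a BP entry ending in `(push)` never equals a context emitted for `pull_request`, which is the strict match the earlier event-suffix test relies on.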
# ---------------------------------------------------------------------------
# Idempotency — existing open issue is PATCHed, not duplicated.
# ---------------------------------------------------------------------------
def test_idempotent_issue_filing(envset, monkeypatch, capsys):
_write_wf(
envset,
"ci.yml",
"name: CI\non:\n pull_request:\n branches: [main]\njobs:\n"
" all-required:\n runs-on: x\n steps:\n - run: echo hi\n",
)
m = _import_lint()
posted = _stub_api(
monkeypatch,
m,
("ok", {"status_check_contexts": [
"CI / all-required (pull_request)",
"Ghost / g (pull_request)",
]}),
issue_search_response=[
{
"number": 4242,
"title": "[ci-bp-drift] owner/molecule-core/main: BP→emitter mismatch",
"state": "open",
"html_url": "http://t/4242",
}
],
)
rc = m.run()
assert rc == 1
# Should have PATCHed, not POSTed a new one.
assert posted.get("patches"), f"expected PATCH on existing issue; got {posted!r}"
assert not posted.get("posts"), f"expected no POSTs; got {posted!r}"
@@ -1,440 +0,0 @@
"""Tests for `.gitea/scripts/lint_continue_on_error_tracking.py` — Tier 2e lint.
Structural enforcement of internal#350 Tier 2e: every
`continue-on-error: true` directive in `.gitea/workflows/*.yml` must be
accompanied by a `# mc#NNNN` or `# internal#NNNN` comment within 2 lines
(above OR below), the referenced issue must be OPEN, and ≤14 days old
counted from `created_at`. Older than 14 days → fail, forces close-or-renew.
The class this lint exists to prevent: Phase-3-masked failures.
`continue-on-error: true` on platform-build had been hiding mc#664-class
regressions for ~3 weeks before #656 surfaced them. A 14-day cap forces
a tracker review cycle, preventing indefinite-mask drift.
Test classes (per `feedback_branch_count_before_approving`):
- test_coe_false_is_ignored — `continue-on-error: false`
has no tracker requirement. Exit 0.
- test_coe_true_with_open_recent_mc_passes — coe true + adjacent
`# mc#1234` comment, issue open and 5 days old. Exit 0.
- test_coe_true_with_open_recent_internal — adjacent `# internal#42`,
open, 1 day old. Exit 0.
- test_coe_true_no_comment_fails — coe true with no
nearby tracker comment. Exit 1, names the file+line and the
required tracker shape.
- test_coe_true_comment_too_far_away_fails — `# mc#1234` 5 lines
above the coe directive — outside the 2-line window. Exit 1.
- test_coe_true_closed_issue_fails — issue exists but is
`state=closed`. Exit 1, names the issue.
- test_coe_true_too_old_issue_fails — issue open but
`created_at` is 20 days ago. Exit 1, mentions the age cap.
- test_coe_true_at_14d_passes — boundary: exactly 14d
old. Inclusive. Exit 0.
- test_coe_true_at_15d_fails — boundary: 15d old.
Exclusive. Exit 1.
- test_coe_true_api_404_fails — referenced issue
doesn't exist (deleted or typo). Exit 1.
- test_coe_true_api_403_skips — token-scope issue,
graceful-degrade per Tier 2a contract: exit 0 with ::error::,
do NOT red-X every PR over auth.
- test_two_coe_true_one_violating — multi-violation
aggregation: one passes, one fails → exit 1, all violations
surfaced (not short-circuited).
- test_coe_true_with_comment_AFTER_directive — comment on the line
of the directive itself, trailing the value (inside the 2-line window), still satisfies. Exit 0.
- test_coe_value_quoted_string_true_caught — `continue-on-error: "true"`
is loaded by PyYAML as the string "true", which is truthy but is NOT the
boolean `True`. The lint flags the boolean `True` produced by
`continue-on-error: true` and also the string `"true"`, because Gitea's
evaluator coerces it. With no tracker comment present, exit 1.
Stubs:
- `subprocess.run` is NOT used (this lint reads only files +
HTTP). Issue lookups are stubbed by monkeypatching the module-level
`fetch_issue()`, so no HTTP request is ever sent.
Run:
python3 -m pytest tests/test_lint_continue_on_error_tracking.py -v
"""
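The "within 2 lines (above OR below)" rule from the docstring can be sketched as a small window search. This is a sketch under stated assumptions, not the lint's implementation: `TRACKER_RE`, `WINDOW`, and `find_tracker` are hypothetical names, and the regex encodes only the `# mc#NNNN` / `# internal#NNNN` shapes the docstring names.

```python
import re

# Assumed tracker shape, taken from the docstring above.
TRACKER_RE = re.compile(r"#\s*(mc|internal)#(\d+)")
WINDOW = 2  # lines above or below the directive


def find_tracker(lines, directive_idx, window=WINDOW):
    """Return (kind, number) for the nearest tracker comment, or None.

    `lines` is the workflow file split into lines and `directive_idx`
    is the 0-based index of the `continue-on-error` line.
    """
    lo = max(0, directive_idx - window)
    hi = min(len(lines), directive_idx + window + 1)
    # Visit the directive's own line first, then neighbours by distance.
    for idx in sorted(range(lo, hi), key=lambda i: abs(i - directive_idx)):
        match = TRACKER_RE.search(lines[idx])
        if match:
            return match.group(1), int(match.group(2))
    return None
```

Searching the directive's own line as distance zero is what makes a trailing comment such as `continue-on-error: true # mc#3` count.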
from __future__ import annotations
import importlib.util
import os
import sys
from datetime import datetime, timedelta, timezone
from pathlib import Path
from unittest import mock
import pytest
SCRIPT_PATH = (
Path(__file__).resolve().parent.parent
/ ".gitea"
/ "scripts"
/ "lint_continue_on_error_tracking.py"
)
def _now_iso() -> str:
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def _iso_days_ago(days: int) -> str:
    dt = datetime.now(timezone.utc) - timedelta(days=days)
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")


def _import_lint():
    spec = importlib.util.spec_from_file_location(
        f"lint_coe_tracking_{os.getpid()}",
        SCRIPT_PATH,
    )
    m = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(m)
    return m


@pytest.fixture()
def envset(tmp_path, monkeypatch):
    wf_dir = tmp_path / ".gitea" / "workflows"
    wf_dir.mkdir(parents=True)
    monkeypatch.setenv("WORKFLOWS_DIR", str(wf_dir))
    monkeypatch.setenv("GITEA_TOKEN", "fake-token")
    monkeypatch.setenv("GITEA_HOST", "git.example.test")
    monkeypatch.setenv("REPO", "owner/molecule-core")
    monkeypatch.setenv("INTERNAL_REPO", "owner/internal")
    monkeypatch.setenv("MAX_AGE_DAYS", "14")
    return wf_dir


def _write_wf(wf_dir: Path, name: str, content: str) -> Path:
    p = wf_dir / name
    p.write_text(content)
    return p
def _stub_issue_api(monkeypatch, lint_mod, responses: dict[str, dict | str]):
    """Stub the module's `fetch_issue` to drive issue lookups.

    responses is keyed by `"<repo-suffix>#NNN"` (e.g. `"mc#1234"`, `"internal#42"`).
    Each value is either:
      - a dict {"state": "open"|"closed", "created_at": "..."}: normal hit
      - the string "404": issue not found
      - the string "403": auth denied (token scope)
      - the string "500": server error
    """

    def fake_fetch(slug_kind: str, num: int):
        key = f"{slug_kind}#{num}"
        r = responses.get(key)
        if r is None:
            # Tests must declare every issue they reference.
            raise AssertionError(f"no test stub for {key}")
        if r == "404":
            return ("not_found", None)
        if r == "403":
            return ("forbidden", None)
        if r == "500":
            return ("error", None)
        return ("ok", r)

    monkeypatch.setattr(lint_mod, "fetch_issue", fake_fetch)
# ---------------------------------------------------------------------------
# continue-on-error: false → no tracker required
# ---------------------------------------------------------------------------
def test_coe_false_is_ignored(envset, monkeypatch, capsys):
_write_wf(
envset,
"ok.yml",
"name: ok\non: [push]\njobs:\n a:\n runs-on: x\n continue-on-error: false\n steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {})
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# coe true + adjacent OPEN recent mc# tracker → pass
# ---------------------------------------------------------------------------
def test_coe_true_with_open_recent_mc_passes(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#1234 — surfacing flaky test, fix-or-renew\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#1234": {"state": "open", "created_at": _iso_days_ago(5)}},
)
rc = m.run()
assert rc == 0
def test_coe_true_with_open_recent_internal(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" continue-on-error: true\n"
" # internal#42 — phase-3 ladder soak\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"internal#42": {"state": "open", "created_at": _iso_days_ago(1)}},
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# coe true + no nearby tracker comment → fail
# ---------------------------------------------------------------------------
def test_coe_true_no_comment_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"bad.yml",
"name: b\non: [push]\njobs:\n a:\n runs-on: x\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {})
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "bad.yml" in out
assert "mc#" in out.lower() or "internal#" in out.lower()
# ---------------------------------------------------------------------------
# Comment too far away — outside the 2-line window → fail
# ---------------------------------------------------------------------------
def test_coe_true_comment_too_far_away_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"far.yml",
"name: f\non: [push]\n"
"# mc#1234 — referenced too far above\n"
"jobs:\n"
" a:\n"
" runs-on: x\n"
" name: stage\n"
" timeout-minutes: 5\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#1234": {"state": "open", "created_at": _iso_days_ago(1)}},
)
rc = m.run()
assert rc == 1
# ---------------------------------------------------------------------------
# Closed issue → fail
# ---------------------------------------------------------------------------
def test_coe_true_closed_issue_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#999\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#999": {"state": "closed", "created_at": _iso_days_ago(1)}},
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "999" in out
assert "closed" in out.lower()
# ---------------------------------------------------------------------------
# Issue is too old (>14d) → fail
# ---------------------------------------------------------------------------
def test_coe_true_too_old_issue_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#7\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#7": {"state": "open", "created_at": _iso_days_ago(20)}},
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "20" in out or "14" in out
def test_coe_true_at_14d_passes(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#7\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#7": {"state": "open", "created_at": _iso_days_ago(14)}},
)
rc = m.run()
assert rc == 0
def test_coe_true_at_15d_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#7\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#7": {"state": "open", "created_at": _iso_days_ago(15)}},
)
rc = m.run()
assert rc == 1
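The 14-day and 15-day boundary tests imply an inclusive cap measured in whole days. A minimal sketch of that comparison follows; `issue_age_days` and `within_age_cap` are hypothetical names, and flooring to whole days is an assumption, chosen because a fractional comparison would reject a timestamp created exactly 14 days before the check runs.

```python
from datetime import datetime, timezone

MAX_AGE_DAYS = 14  # mirrors the MAX_AGE_DAYS value set by the envset fixture


def issue_age_days(created_at, now=None):
    """Whole days elapsed since an ISO-8601 UTC timestamp ending in 'Z'."""
    created = datetime.strptime(created_at, "%Y-%m-%dT%H:%M:%SZ").replace(
        tzinfo=timezone.utc
    )
    now = now or datetime.now(timezone.utc)
    # timedelta.days floors toward whole days for positive durations.
    return (now - created).days


def within_age_cap(created_at, now=None, max_age_days=MAX_AGE_DAYS):
    """Inclusive cap: an issue exactly max_age_days old still passes."""
    return issue_age_days(created_at, now) <= max_age_days
```

Passing `now` explicitly keeps the boundary deterministic in a test, instead of depending on the wall clock.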
# ---------------------------------------------------------------------------
# 404 (deleted/typo) → fail
# ---------------------------------------------------------------------------
def test_coe_true_api_404_fails(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#9999\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {"mc#9999": "404"})
rc = m.run()
assert rc == 1
# ---------------------------------------------------------------------------
# 403 (token-scope, not lint's fault) → exit 0 with ::error:: per
# Tier 2a graceful-degrade contract.
# ---------------------------------------------------------------------------
def test_coe_true_api_403_skips(envset, monkeypatch, capsys):
_write_wf(
envset,
"wf.yml",
"name: w\non: [push]\njobs:\n a:\n runs-on: x\n"
" # mc#1\n"
" continue-on-error: true\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {"mc#1": "403"})
rc = m.run()
assert rc == 0
err = capsys.readouterr().err
assert "403" in err or "scope" in err.lower() or "token" in err.lower()
# ---------------------------------------------------------------------------
# Multi-violation aggregation — all surfaced, not short-circuited
# ---------------------------------------------------------------------------
def test_two_coe_true_one_violating(envset, monkeypatch, capsys):
_write_wf(
envset,
"two.yml",
"name: t\non: [push]\njobs:\n"
" good:\n"
" runs-on: x\n"
" # mc#100\n"
" continue-on-error: true\n"
" steps:\n - run: echo a\n"
" bad:\n"
" runs-on: x\n"
" continue-on-error: true\n"
" steps:\n - run: echo b\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#100": {"state": "open", "created_at": _iso_days_ago(2)}},
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "bad" in out.lower() or "no tracker" in out.lower()
# ---------------------------------------------------------------------------
# Trailing comment AFTER the directive, on the same line (inside the 2-line window) → pass
# ---------------------------------------------------------------------------
def test_coe_true_with_comment_AFTER_directive(envset, monkeypatch, capsys):
_write_wf(
envset,
"after.yml",
"name: a\non: [push]\njobs:\n a:\n runs-on: x\n"
" continue-on-error: true # mc#3\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(
monkeypatch,
m,
{"mc#3": {"state": "open", "created_at": _iso_days_ago(0)}},
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# Quoted string `"true"` — coerced by Gitea evaluator; should be caught
# ---------------------------------------------------------------------------
def test_coe_value_quoted_string_true_caught(envset, monkeypatch, capsys):
_write_wf(
envset,
"quoted.yml",
"name: q\non: [push]\njobs:\n a:\n runs-on: x\n"
" continue-on-error: \"true\"\n"
" steps:\n - run: echo hi\n",
)
m = _import_lint()
_stub_issue_api(monkeypatch, m, {})
rc = m.run()
# No tracker → fail
assert rc == 1
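The quoted-string case above comes down to how the parsed value is classified. A minimal sketch, assuming the value has already been loaded from YAML (so it arrives as a `bool`, a `str`, or something else); `coe_is_enabled` is a hypothetical name, and treating the string case-insensitively is an assumption rather than something the tests confirm.

```python
def coe_is_enabled(value):
    """Decide whether a parsed `continue-on-error` value counts as enabled.

    Accepts the YAML boolean true and the string "true". Matching the
    string case-insensitively is an assumption made for this sketch.
    """
    if isinstance(value, bool):
        return value
    if isinstance(value, str):
        return value.strip().lower() == "true"
    # Missing key (None) or any other type is treated as not enabled.
    return False
```

A plain truthiness check would not be enough here, because the string `"false"` is also truthy in Python.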
@@ -1,88 +0,0 @@
"""Tests for `.gitea/scripts/lint-curl-status-capture.py`.
Run:
python3 -m pytest tests/test_lint_curl_status_capture.py -v
"""
from __future__ import annotations
import importlib.util
from pathlib import Path
SCRIPT_PATH = (
Path(__file__).resolve().parent.parent
/ ".gitea"
/ "scripts"
/ "lint-curl-status-capture.py"
)
def _load_module():
    spec = importlib.util.spec_from_file_location("lint_curl_status_capture", SCRIPT_PATH)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
def test_finds_quoted_echo_fallback_pollution():
lint = _load_module()
content = """
HTTP_CODE=$(curl -sS -o /tmp/body -w "%{http_code}" https://example.test || echo "000")
"""
findings = lint.scan_content("workflow.yml", content)
assert len(findings) == 1
assert "echo" in findings[0].snippet
def test_finds_unquoted_echo_fallback_pollution():
lint = _load_module()
content = """
HTTP_CODE=$(curl -sS -o /tmp/body -w '%{http_code}' https://example.test || echo 000)
"""
findings = lint.scan_content("workflow.yml", content)
assert len(findings) == 1
assert "echo" in findings[0].snippet
def test_finds_printf_fallback_pollution():
lint = _load_module()
content = """
HTTP_CODE=$(curl -sS -o /tmp/body -w '%{http_code}' https://example.test || printf '000')
"""
findings = lint.scan_content("workflow.yml", content)
assert len(findings) == 1
assert "printf" in findings[0].snippet
def test_ignores_tempfile_fallback_after_curl():
lint = _load_module()
content = """
set +e
curl -sS -o /tmp/body -w '%{http_code}' https://example.test >/tmp/code
rc=$?
set -e
HTTP_CODE=$(cat /tmp/code 2>/dev/null || echo "000")
[ -z "$HTTP_CODE" ] && HTTP_CODE="000"
"""
assert lint.scan_content("workflow.yml", content) == []
def test_collapses_bash_line_continuations():
lint = _load_module()
content = """
HTTP_CODE=$(curl -sS -o /tmp/body \\
-w "%{http_code}" \\
https://example.test \\
|| echo "000")
"""
findings = lint.scan_content("workflow.yml", content)
assert len(findings) == 1
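The pattern these tests flag is a fallback placed inside the same command substitution as `curl`, so the fallback's output is appended to whatever `curl` already wrote. One way such a scan could work is sketched below; it is not the script's real `scan_content`, and `POLLUTED` and `scan` are hypothetical names.

```python
import re

# Hypothetical pattern: a $(...) that starts with curl and falls back to
# echo or printf before the closing parenthesis.
POLLUTED = re.compile(r"\$\(\s*curl\b[^)]*\|\|\s*(?:echo|printf)\b[^)]*\)")


def scan(content):
    """Return every polluted command substitution found in `content`."""
    # Join backslash-newline continuations so multi-line commands match.
    joined = re.sub(r"\\\n\s*", " ", content)
    return [match.group(0) for match in POLLUTED.finditer(joined)]
```

The tempfile variant is not matched because its substitution starts with `cat`, not `curl`, so the fallback there cannot mix with `curl` output.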
@@ -1,357 +0,0 @@
"""Tests for `.gitea/scripts/lint_mask_pr_atomicity.py` — Tier 2d lint.
Structural enforcement of internal#350 Tier 2d: a PR that touches
`.gitea/workflows/ci.yml` and modifies `continue-on-error` OR the
`all-required` sentinel's `needs:` block must EITHER:
- Touch both atomically in the same PR (preferred), OR
- Cross-link to the paired PR via `Paired: #NNN` in body OR a commit
message.
The class this lint exists to prevent: PR#665 (interim
continue-on-error: true on platform-build) + PR#668 (sentinel-exempt)
were designed-as-a-pair but merged solo — #665 landed at 04:47Z, #668
still open at 05:07Z when the watchdog fired. ~20 min of main red.
Test classes (per `feedback_branch_count_before_approving`, every
prod branch enumerated):
- test_diff_touches_neither_passes — diff is in ci.yml
but neither continue-on-error nor all-required.needs is touched.
PR is exempt. Exit 0.
- test_diff_touches_both_atomically_passes — both touched in
the same PR. Atomic. Exit 0.
- test_diff_touches_coe_only_no_pair_fails — continue-on-error
flipped without sentinel-needs change AND no `Paired: #NNN`
reference anywhere. Exit 1.
- test_diff_touches_needs_only_no_pair_fails — sentinel `needs:`
changed without `continue-on-error` change AND no pair reference.
Exit 1.
- test_diff_touches_coe_only_pair_in_body — coe changed, no
needs change, body has `Paired: #668`. Exit 0.
- test_diff_touches_needs_only_pair_in_commit — needs changed, no
coe change, commit message includes `Paired: #665`. Exit 0.
- test_paired_reference_must_be_numeric — `Paired: #abc` or
`Paired: NNNN` (missing `#`) doesn't satisfy the rule. Exit 1.
- test_ci_yml_unchanged_skips — no ci.yml in the
diff at all (defensive — workflow paths-filter already prevents,
but the lint should not crash). Exit 0.
- test_ci_yml_newly_added_passes: ci.yml is added on the head side
with no base copy. A new file is not a flip (Tier 2e covers the
tracking issue for new coe=true). Exit 0.
The lint receives base SHA + head SHA via env (set by the workflow
from the pull_request payload) and uses `git show` to read both
sides without a separate clone. Tests stub `subprocess.run` to drive
the diff content; the actual git is never invoked.
Run:
python3 -m pytest tests/test_lint_mask_pr_atomicity.py -v
Dependencies: stdlib + PyYAML (the script reads ci.yml via PyYAML AST
per `feedback_behavior_based_ast_gates`). No network. No live git.
"""
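The rule described in the docstring reduces to a small decision over three inputs: whether `continue-on-error` changed, whether the sentinel's `needs:` changed, and whether any `Paired: #NNN` reference was found. The sketch below is a hedged illustration; `atomicity_verdict` and its message wording are hypothetical, not the lint's output.

```python
def atomicity_verdict(coe_changed, needs_changed, paired_refs):
    """Return (exit_code, reason) for the cases listed in the docstring.

    `paired_refs` is a list of PR/issue numbers found in the PR body or
    in commit messages (empty when none were found).
    """
    if not coe_changed and not needs_changed:
        return 0, "no atomicity risk"
    if coe_changed and needs_changed:
        return 0, "atomic"
    if paired_refs:
        return 0, "paired: " + ", ".join(f"#{n}" for n in paired_refs)
    # Exactly one side changed and nothing links to the other half.
    changed = "continue-on-error" if coe_changed else "all-required needs"
    return 1, f"{changed} changed alone with no Paired: #NNN reference"
```

Naming the side that changed alone keeps the failure actionable, which is what the failing-case tests assert on.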
from __future__ import annotations
import importlib.util
import os
import subprocess
import sys
import textwrap
from pathlib import Path
from unittest import mock
import pytest
SCRIPT_PATH = (
Path(__file__).resolve().parent.parent
/ ".gitea"
/ "scripts"
/ "lint_mask_pr_atomicity.py"
)
# Minimal ci.yml fixture: only the bits the lint actually parses
# (a job with continue-on-error + the all-required aggregator).
CI_YML_BASE = """
name: CI
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  platform-build:
    runs-on: ubuntu-latest
    continue-on-error: false
    steps:
      - run: echo build
  canvas-build:
    runs-on: ubuntu-latest
    continue-on-error: false
    steps:
      - run: echo build
  all-required:
    runs-on: ubuntu-latest
    needs:
      - platform-build
      - canvas-build
    if: always()
    steps:
      - run: echo agg
"""

# Same as base but with continue-on-error flipped on platform-build.
CI_YML_COE_FLIPPED = CI_YML_BASE.replace(
    "  platform-build:\n    runs-on: ubuntu-latest\n    continue-on-error: false",
    "  platform-build:\n    runs-on: ubuntu-latest\n    continue-on-error: true",
)

# Same as base but with canvas-build dropped from all-required.needs.
CI_YML_NEEDS_CHANGED = CI_YML_BASE.replace(
    "    needs:\n      - platform-build\n      - canvas-build",
    "    needs:\n      - platform-build",
)

# Both changed at once.
CI_YML_BOTH = CI_YML_COE_FLIPPED.replace(
    "    needs:\n      - platform-build\n      - canvas-build",
    "    needs:\n      - platform-build",
)
def _import_lint(monkeypatch):
"""Import the lint module under a fresh name per test."""
spec = importlib.util.spec_from_file_location(
f"lint_mask_pr_atomicity_{os.getpid()}_{id(monkeypatch)}",
SCRIPT_PATH,
)
m = importlib.util.module_from_spec(spec)
spec.loader.exec_module(m)
return m
def _stub_git(base_yml: str | None, head_yml: str | None, commits: list[str]):
    """Build a fake `subprocess.run` that emulates git show, log, and diff.

    base_yml / head_yml: contents the lint sees at base/head SHA.
        Pass `None` to simulate "path didn't exist on that side" (git
        show returns exit code 128, file not in tree).
    commits: list of commit messages on the PR (head's ancestry up to
        the base merge-base). The lint runs
        `git log --format=%B base..head` to find Paired: refs.
    """

    def fake_run(cmd, *args, **kwargs):
        if not isinstance(cmd, list):
            raise AssertionError(f"unexpected non-list cmd: {cmd!r}")
        # `git show <sha>:<path>`
        if cmd[:2] == ["git", "show"] and len(cmd) >= 3 and ":" in cmd[2]:
            sha, path = cmd[2].split(":", 1)
            if "base" in sha or "BASE" in sha:
                content = base_yml
            else:
                content = head_yml
            if content is None:
                return subprocess.CompletedProcess(
                    cmd, returncode=128, stdout="", stderr="fatal: path not in tree"
                )
            return subprocess.CompletedProcess(
                cmd, returncode=0, stdout=content, stderr=""
            )
        # `git log --format=%B base..head -- .`
        if cmd[:2] == ["git", "log"]:
            body = "\n\n--commit-boundary--\n\n".join(commits)
            return subprocess.CompletedProcess(
                cmd, returncode=0, stdout=body, stderr=""
            )
        # `git diff --name-only base..head`
        if cmd[:2] == ["git", "diff"]:
            # ci.yml is listed in the diff only when base and head contents differ.
            paths = []
            if (base_yml or "") != (head_yml or ""):
                paths.append(".gitea/workflows/ci.yml")
            return subprocess.CompletedProcess(
                cmd, returncode=0, stdout="\n".join(paths) + "\n", stderr=""
            )
        raise AssertionError(f"unexpected git invocation: {cmd!r}")

    return fake_run
@pytest.fixture()
def env(monkeypatch):
monkeypatch.setenv("BASE_SHA", "base-sha-1")
monkeypatch.setenv("HEAD_SHA", "head-sha-1")
monkeypatch.setenv("PR_BODY", "")
monkeypatch.setenv("CI_WORKFLOW_PATH", ".gitea/workflows/ci.yml")
monkeypatch.setenv("SENTINEL_JOB_KEY", "all-required")
return monkeypatch
# ---------------------------------------------------------------------------
# Diff in ci.yml but neither rule predicate triggered → pass
# ---------------------------------------------------------------------------
def test_diff_touches_neither_passes(env, monkeypatch, capsys):
# Add a comment-only change (no coe flip, no needs change).
base = CI_YML_BASE
head = "# a harmless comment\n" + CI_YML_BASE
monkeypatch.setattr(
subprocess, "run", _stub_git(base, head, ["chore: comment"])
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 0
out = capsys.readouterr().out
assert "no atomicity risk" in out.lower() or "ok" in out.lower()
# ---------------------------------------------------------------------------
# Diff touches BOTH coe and sentinel.needs in the same PR → atomic, pass
# ---------------------------------------------------------------------------
def test_diff_touches_both_atomically_passes(env, monkeypatch, capsys):
monkeypatch.setattr(
subprocess,
"run",
_stub_git(CI_YML_BASE, CI_YML_BOTH, ["fix(ci): atomic flip"]),
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 0
out = capsys.readouterr().out
assert "atomic" in out.lower()
# ---------------------------------------------------------------------------
# Diff touches ONLY continue-on-error, no pair reference → fail
# ---------------------------------------------------------------------------
def test_diff_touches_coe_only_no_pair_fails(env, monkeypatch, capsys):
monkeypatch.setattr(
subprocess,
"run",
_stub_git(
CI_YML_BASE,
CI_YML_COE_FLIPPED,
["fix(ci): flip coe on platform-build"],
),
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "paired" in out.lower() or "atomicity" in out.lower()
# Actionable failure: must name what is missing.
assert "continue-on-error" in out.lower()
# ---------------------------------------------------------------------------
# Diff touches ONLY sentinel.needs, no pair reference → fail
# ---------------------------------------------------------------------------
def test_diff_touches_needs_only_no_pair_fails(env, monkeypatch, capsys):
monkeypatch.setattr(
subprocess,
"run",
_stub_git(
CI_YML_BASE,
CI_YML_NEEDS_CHANGED,
["fix(ci): drop canvas-build from sentinel"],
),
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "paired" in out.lower() or "atomicity" in out.lower()
assert "needs" in out.lower() or "sentinel" in out.lower()
# ---------------------------------------------------------------------------
# COE-only flip with `Paired: #668` in PR body → pass
# ---------------------------------------------------------------------------
def test_diff_touches_coe_only_pair_in_body(env, monkeypatch, capsys):
monkeypatch.setenv("PR_BODY", "Interim coe flip. Paired: #668")
monkeypatch.setattr(
subprocess,
"run",
_stub_git(
CI_YML_BASE,
CI_YML_COE_FLIPPED,
["fix(ci): flip coe on platform-build"],
),
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 0
out = capsys.readouterr().out
assert "paired" in out.lower()
assert "668" in out
# ---------------------------------------------------------------------------
# Needs-only flip with `Paired: #665` in a commit message → pass
# ---------------------------------------------------------------------------
def test_diff_touches_needs_only_pair_in_commit(env, monkeypatch, capsys):
monkeypatch.setattr(
subprocess,
"run",
_stub_git(
CI_YML_BASE,
CI_YML_NEEDS_CHANGED,
[
"fix(ci): drop canvas-build from sentinel\n\nPaired: #665",
],
),
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 0
out = capsys.readouterr().out
assert "paired" in out.lower()
assert "665" in out
# ---------------------------------------------------------------------------
# `Paired: #abc` is not a valid issue/PR ref — fail
# ---------------------------------------------------------------------------
def test_paired_reference_must_be_numeric(env, monkeypatch, capsys):
monkeypatch.setenv("PR_BODY", "Paired: #abc")
monkeypatch.setattr(
subprocess,
"run",
_stub_git(
CI_YML_BASE,
CI_YML_COE_FLIPPED,
["fix(ci): flip coe"],
),
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 1
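The numeric requirement on `Paired: #NNN` can be expressed as a single regex over the PR body and the commit messages. A minimal sketch, assuming that exact spelling of the keyword; `PAIRED_RE` and `paired_refs` are hypothetical names.

```python
import re

# Requires the literal "#" followed by digits only; the trailing \b
# rejects mixed tokens such as "#12abc".
PAIRED_RE = re.compile(r"Paired:\s*#(\d+)\b")


def paired_refs(pr_body, commit_messages):
    """Collect every paired PR/issue number from the body and commits."""
    texts = [pr_body, *commit_messages]
    return [int(num) for text in texts for num in PAIRED_RE.findall(text)]
```

An empty result means no valid reference was found, so `Paired: #abc` and a bare `Paired: 668` are both treated as missing.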
# ---------------------------------------------------------------------------
# Defensive: ci.yml not in diff at all → skip cleanly
# ---------------------------------------------------------------------------
def test_ci_yml_unchanged_skips(env, monkeypatch, capsys):
monkeypatch.setattr(
subprocess, "run", _stub_git(CI_YML_BASE, CI_YML_BASE, ["chore: noop"])
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 0
out = capsys.readouterr().out
assert "ci.yml" in out.lower() or "not in" in out.lower() or "skip" in out.lower()
# ---------------------------------------------------------------------------
# Cross-cutting: file ADDED on head side (no base) — coe inferred as
# "newly added with coe=true". Should NOT trigger the lint (it's a new
# file, not a flip — Tier 2e covers tracking-issue for new coe=true).
# ---------------------------------------------------------------------------
def test_ci_yml_newly_added_passes(env, monkeypatch, capsys):
monkeypatch.setattr(
subprocess,
"run",
_stub_git(None, CI_YML_COE_FLIPPED, ["feat(ci): add ci.yml"]),
)
m = _import_lint(monkeypatch)
rc = m.run()
assert rc == 0
@@ -1,430 +0,0 @@
"""Tests for `.gitea/scripts/lint_required_context_exists_in_bp.py` — Tier 2g lint.
Structural enforcement of internal#350 Tier 2g: when a PR adds a NEW
commit-status emission (a workflow's `name:` + a new job-key/name pair
that didn't exist on the base side), the PR must EITHER:
(a) Include a `# bp-required: yes` directive comment on the workflow
AND the new context must already be in
`branch_protections/<branch>.status_check_contexts`, OR
(b) Include a `# bp-required: pending #NNN` directive (acknowledged
asymmetry with a tracking issue), OR
(c) Include a `# bp-exempt: <reason>` directive (informational job,
not intended to be a required gate).
Default (no directive on a new emitter) = FAIL.
The class this prevents
-----------------------
PR#656 added `CI / all-required (pull_request)` as a sentinel context
that workflows emit, but BP did NOT list it — so when `platform-build`
failed, `all-required` failed, but BP let the PR merge anyway. Cascade
to mc#664. With Tier 2g, PR#656 would have been blocked until either
the BP PATCH ran alongside OR the author marked the emission with a
`bp-required: pending #NNN` directive.
Test classes (per `feedback_branch_count_before_approving`):
- test_no_new_emissions_skips — diff doesn't add any
new emitter; pass.
- test_new_emission_with_bp_required_yes_in_bp — directive set AND
BP lists the context; pass.
- test_new_emission_with_bp_required_yes_not_in_bp — directive set
BUT BP doesn't list; fail.
- test_new_emission_with_bp_required_pending — `# bp-required:
pending #800` directive references an open tracker; pass.
- test_new_emission_with_bp_exempt — `# bp-exempt:
informational` directive; pass.
- test_new_emission_no_directive_fails — no directive on a
new emission; fail with the 3-option fix-hint.
- test_modified_workflow_with_new_job_is_new — pre-existing
workflow gains a new job with a new name → counted as new
emission. Apply rule.
- test_modified_workflow_job_renamed_is_new — same workflow,
same job-key, but job `name:` changed → counted as new emission
(the OLD context name disappears; the NEW one needs validation).
- test_unrelated_workflow_edit_is_not_new — edit a comment in
an existing emitter; no new context introduced; pass.
- test_api_403_skips_gracefully — BP read 403; exit 0
with stderr ::error::.
- test_directive_must_be_in_workflow_yml — directive in PR
body alone is NOT sufficient; the comment must live in the
workflow file so future scheduled Tier 2f runs can see it.
Run:
python3 -m pytest tests/test_lint_required_context_exists_in_bp.py -v
"""
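The three directive forms in the docstring (`bp-required: yes`, `bp-required: pending #NNN`, `bp-exempt: <reason>`) can be parsed from a single comment line and turned into a pass/fail decision. The sketch below is illustrative: `DIRECTIVE_RE`, `parse_directive`, and `verdict` are hypothetical names, and the check that a pending tracker is still open is deliberately left out.

```python
import re

# One alternative per directive form named in the docstring.
DIRECTIVE_RE = re.compile(
    r"#\s*(?:bp-required:\s*(?P<yes>yes)\b"
    r"|bp-required:\s*pending\s+#(?P<pending>\d+)"
    r"|bp-exempt:\s*(?P<exempt>\S.*))"
)


def parse_directive(line):
    """Return ("required", None), ("pending", N), ("exempt", reason) or None."""
    match = DIRECTIVE_RE.search(line)
    if not match:
        return None
    if match.group("yes"):
        return ("required", None)
    if match.group("pending"):
        return ("pending", int(match.group("pending")))
    return ("exempt", match.group("exempt").strip())


def verdict(directive, context, bp_contexts):
    """Exit code for one new emission; a missing directive fails by default."""
    if directive is None:
        return 1
    kind, _ = directive
    if kind == "required":
        # "yes" only passes when BP already lists the new context.
        return 0 if context in bp_contexts else 1
    # Assumption: pending and exempt pass here; verifying that the pending
    # tracker is open would need an API lookup not shown in this sketch.
    return 0
```

Requiring a non-empty reason after `bp-exempt:` means a bare `# bp-exempt:` does not parse, which is stricter than the docstring states and is a choice made for this sketch.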
from __future__ import annotations
import importlib.util
import os
import subprocess
import sys
from pathlib import Path
from unittest import mock
import pytest
SCRIPT_PATH = (
Path(__file__).resolve().parent.parent
/ ".gitea"
/ "scripts"
/ "lint_required_context_exists_in_bp.py"
)
def _import_lint():
    """Load the lint script as a fresh module object for each test."""
    spec = importlib.util.spec_from_file_location(
        f"lint_required_ctx_in_bp_{os.getpid()}", SCRIPT_PATH
    )
    m = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(m)
    return m
# Sample workflows used across multiple tests.
WF_CI_BASE = """name: CI
on:
  pull_request:
    branches: [main]
jobs:
  all-required:
    runs-on: x
    steps:
      - run: echo hi
"""
# CI with a new job added.
WF_CI_NEW_JOB = """name: CI
on:
  pull_request:
    branches: [main]
jobs:
  all-required:
    runs-on: x
    steps:
      - run: echo hi
  brand-new:
    runs-on: x
    steps:
      - run: echo new
"""
WF_CI_NEW_JOB_BP_YES = """name: CI
on:
  pull_request:
    branches: [main]
jobs:
  all-required:
    runs-on: x
    steps:
      - run: echo hi
  # bp-required: yes
  brand-new:
    runs-on: x
    steps:
      - run: echo new
"""
WF_CI_NEW_JOB_BP_PENDING = """name: CI
on:
  pull_request:
    branches: [main]
jobs:
  all-required:
    runs-on: x
    steps:
      - run: echo hi
  # bp-required: pending #800
  brand-new:
    runs-on: x
    steps:
      - run: echo new
"""
WF_CI_NEW_JOB_BP_EXEMPT = """name: CI
on:
  pull_request:
    branches: [main]
jobs:
  all-required:
    runs-on: x
    steps:
      - run: echo hi
  # bp-exempt: informational sticker, not a gate
  brand-new:
    runs-on: x
    steps:
      - run: echo new
"""
# Same WF, job rename only (CI/all-required → CI/sentinel).
WF_CI_JOB_RENAMED = """name: CI
on:
  pull_request:
    branches: [main]
jobs:
  all-required:
    runs-on: x
    name: sentinel
    steps:
      - run: echo hi
"""
# Comment-only edit: should NOT count as a new emission.
WF_CI_COMMENT_ONLY = """# a fresh comment line
name: CI
on:
  pull_request:
    branches: [main]
jobs:
  all-required:
    runs-on: x
    steps:
      - run: echo hi
"""
def _stub_git_and_api(
    monkeypatch,
    lint_mod,
    base_files: dict[str, str | None],
    head_files: dict[str, str | None],
    bp_response,
):
    """Stub `subprocess.run` for git, and `lint_mod.api` for HTTP."""

    def fake_run(cmd, *args, **kwargs):
        if not isinstance(cmd, list):
            raise AssertionError(f"unexpected cmd: {cmd!r}")
        if cmd[:2] == ["git", "show"] and ":" in cmd[2]:
            sha, path = cmd[2].split(":", 1)
            side = base_files if "base" in sha else head_files
            content = side.get(path)
            if content is None:
                return subprocess.CompletedProcess(cmd, 128, "", "fatal: path not in tree")
            return subprocess.CompletedProcess(cmd, 0, content, "")
        if cmd[:2] == ["git", "diff"]:
            # Names of files that changed (any side has differing contents
            # from the other, or only appears on one side).
            all_paths = set(base_files) | set(head_files)
            changed = sorted(p for p in all_paths if base_files.get(p) != head_files.get(p))
            return subprocess.CompletedProcess(cmd, 0, "\n".join(changed) + "\n", "")
        raise AssertionError(f"unexpected cmd: {cmd!r}")

    monkeypatch.setattr(subprocess, "run", fake_run)

    def fake_api(method, path, *, body=None, query=None):
        if "branch_protections" in path:
            return bp_response
        return ("ok", {})

    monkeypatch.setattr(lint_mod, "api", fake_api)
@pytest.fixture()
def env(monkeypatch):
    monkeypatch.setenv("BASE_SHA", "base-x")
    monkeypatch.setenv("HEAD_SHA", "head-x")
    monkeypatch.setenv("GITEA_TOKEN", "stub")
    monkeypatch.setenv("GITEA_HOST", "git.example.test")
    monkeypatch.setenv("REPO", "owner/molecule-core")
    monkeypatch.setenv("BRANCH", "main")
    monkeypatch.setenv("WORKFLOWS_DIR", ".gitea/workflows")
    return monkeypatch
# ---------------------------------------------------------------------------
# No new emissions — pass.
# ---------------------------------------------------------------------------
def test_no_new_emissions_skips(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_BASE},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# New emission + bp-required: yes + in BP → pass.
# ---------------------------------------------------------------------------
def test_new_emission_with_bp_required_yes_in_bp(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB_BP_YES},
bp_response=(
"ok",
{"status_check_contexts": ["CI / brand-new (pull_request)"]},
),
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# bp-required: yes but NOT in BP → fail.
# ---------------------------------------------------------------------------
def test_new_emission_with_bp_required_yes_not_in_bp(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB_BP_YES},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "brand-new" in out
# ---------------------------------------------------------------------------
# bp-required: pending #NNN → pass.
# ---------------------------------------------------------------------------
def test_new_emission_with_bp_required_pending(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB_BP_PENDING},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# bp-exempt → pass.
# ---------------------------------------------------------------------------
def test_new_emission_with_bp_exempt(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB_BP_EXEMPT},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# New emission, no directive → fail with 3-option fix hint.
# ---------------------------------------------------------------------------
def test_new_emission_no_directive_fails(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "brand-new" in out
assert "bp-required" in out
assert "bp-exempt" in out
# ---------------------------------------------------------------------------
# Pre-existing workflow gains a new job → counted as new emission.
# ---------------------------------------------------------------------------
def test_modified_workflow_with_new_job_is_new(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
# No directive → fail
assert rc == 1
# ---------------------------------------------------------------------------
# Same workflow, same job-key, but job `name:` changed → new context.
# ---------------------------------------------------------------------------
def test_modified_workflow_job_renamed_is_new(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_JOB_RENAMED},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
assert rc == 1
out = capsys.readouterr().out
assert "sentinel" in out
# ---------------------------------------------------------------------------
# Comment-only edit → no new emission.
# ---------------------------------------------------------------------------
def test_unrelated_workflow_edit_is_not_new(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_COMMENT_ONLY},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
assert rc == 0
# ---------------------------------------------------------------------------
# BP API 403 → exit 0 with ::error::.
# ---------------------------------------------------------------------------
def test_api_403_skips_gracefully(env, monkeypatch, capsys):
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB},
bp_response=("forbidden", None),
)
rc = m.run()
assert rc == 0
err = capsys.readouterr().err
assert "403" in err or "scope" in err.lower() or "token" in err.lower()
# ---------------------------------------------------------------------------
# Directive must be in the workflow YML, not PR body.
# ---------------------------------------------------------------------------
def test_directive_must_be_in_workflow_yml(env, monkeypatch, capsys):
monkeypatch = env
monkeypatch.setenv("PR_BODY", "bp-required: yes — see comment above")
m = _import_lint()
_stub_git_and_api(
monkeypatch,
m,
base_files={".gitea/workflows/ci.yml": WF_CI_BASE},
head_files={".gitea/workflows/ci.yml": WF_CI_NEW_JOB},
bp_response=("ok", {"status_check_contexts": []}),
)
rc = m.run()
# Even though PR body claims, the workflow itself lacks the directive.
assert rc == 1
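The directive rules exercised by the tests above can be summarised in a small sketch. This is not the lint script itself, which is not shown in this diff; the function name `check_new_emission`, the regular expressions, and the idea that the directive is the comment line directly above the job key are all assumptions inferred from the fixtures. It also does not verify that a `pending #NNN` tracker is actually open.

```python
import re

# Hypothetical directive patterns, inferred from the WF_CI_* fixtures.
_REQUIRED_YES = re.compile(r"#\s*bp-required:\s*yes\b")
_REQUIRED_PENDING = re.compile(r"#\s*bp-required:\s*pending\s+#(\d+)")
_EXEMPT = re.compile(r"#\s*bp-exempt:\s*(\S.*)")


def check_new_emission(directive_line, context, bp_contexts):
    """Return (passed, reason) for one newly emitted status context.

    directive_line is the comment assumed to sit directly above the job
    key, or None when the job carries no directive comment.
    """
    line = directive_line or ""
    if _REQUIRED_YES.search(line):
        # "yes" only passes when branch protection already lists the context.
        if context in bp_contexts:
            return True, "listed in branch protection"
        return False, "bp-required: yes, but context is not in branch protection"
    pending = _REQUIRED_PENDING.search(line)
    if pending:
        return True, f"pending tracker #{pending.group(1)}"
    if _EXEMPT.search(line):
        return True, "exempt"
    # Default: a new emitter without any directive fails.
    return False, (
        "no directive; add `bp-required: yes`, "
        "`bp-required: pending #NNN`, or `bp-exempt: <reason>`"
    )
```

Called with `"# bp-required: yes"` and an empty set of branch-protection contexts, the sketch fails, which mirrors `test_new_emission_with_bp_required_yes_not_in_bp`.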
-554
@@ -1,554 +0,0 @@
"""Tests for `.gitea/scripts/lint-required-no-paths.py`.
Structural enforcement of `feedback_path_filtered_workflow_cant_be_required`:
no workflow whose status-check context is in `branch_protections/main`
`status_check_contexts` may use `paths:` or `paths-ignore:` filters in its
`on:` block. A path-filtered workflow silently does not fire on a PR whose
diff doesn't touch the filter — Gitea treats that as `pending` forever,
not `skipped`-as-`success`, so the gate degrades to an indefinite block.
Worse, a docs-only PR could never satisfy a required check whose filter
excludes docs paths, and the protected branch becomes unreachable.
Five test classes:
  - test_no_required_workflows_succeeds: empty status_check_contexts → exit 0
  - test_required_workflow_no_paths_passes: required workflow with no
    paths filter → exit 0
  - test_required_workflow_with_paths_filter_fails: required workflow
    with `paths: ['**.go']` → exit 1, error names workflow
  - test_required_workflow_with_paths_ignore_fails: same shape for
    `paths-ignore`
  - test_unknown_required_context_warns_not_fails: context whose
    workflow file is missing → warn, do NOT fail (graceful: it could be a
    cross-repo context name or a workflow renamed mid-PR; the lint is for
    paths-filter detection, not orphaned-context detection, which is
    ci-required-drift's job)
Also covers the workflow-name → file-path mapping (parses the
`<workflow_name> / <job_name> (<event>)` context format) and the
multi-event `on:` block edge cases (paths under `on.push` vs `on.pull_request`
vs top-level `on.paths`).
Run:
python3 -m pytest tests/test_lint_required_no_paths.py -v
Dependencies: stdlib + PyYAML (already required by the script itself).
No network. No live Gitea calls — `api()` is stubbed.
"""
from __future__ import annotations
import importlib.util
import os
import sys
from pathlib import Path
from unittest import mock
import pytest
# --------------------------------------------------------------------------
# Module import fixture — mirror of tests/test_ci_required_drift.py shape
# --------------------------------------------------------------------------
SCRIPT_PATH = (
Path(__file__).resolve().parent.parent
/ ".gitea"
/ "scripts"
/ "lint-required-no-paths.py"
)
@pytest.fixture()
def lint_module(tmp_path, monkeypatch):
    """Import the script as a module with a clean env per test.

    Tests need a per-test workflows directory under tmp_path; the module
    reads `WORKFLOWS_DIR` from env. Fresh import per test means tests
    cannot leak global state into each other.
    """
    env = {
        "GITEA_TOKEN": "test-token",
        "GITEA_HOST": "git.example.test",
        "REPO": "owner/repo",
        "BRANCH": "main",
        "WORKFLOWS_DIR": str(tmp_path / ".gitea" / "workflows"),
    }
    (tmp_path / ".gitea" / "workflows").mkdir(parents=True)
    monkeypatch.setattr(os, "environ", {**os.environ, **env})
    spec = importlib.util.spec_from_file_location(
        f"lint_required_no_paths_{id(tmp_path)}", SCRIPT_PATH
    )
    m = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(m)
    # Force-set the globals from env (they were captured at import time;
    # we mutate them so the per-test tmp_path is what the script reads).
    m.GITEA_TOKEN = env["GITEA_TOKEN"]
    m.GITEA_HOST = env["GITEA_HOST"]
    m.REPO = env["REPO"]
    m.BRANCH = env["BRANCH"]
    m.WORKFLOWS_DIR = env["WORKFLOWS_DIR"]
    m.OWNER, m.NAME = "owner", "repo"
    m.API = f"https://{env['GITEA_HOST']}/api/v1"
    return m
def _write_workflow(workflows_dir: str, filename: str, content: str) -> Path:
    p = Path(workflows_dir) / filename
    p.write_text(content, encoding="utf-8")
    return p


def _make_stub_api(responses: dict):
    """Build a fake `api()` callable.

    `responses` maps (method, path) tuples to either:
      - (status_int, body) → returned as-is
      - Exception instance → raised
    Calls are recorded in `.calls` for later assertion.
    """

    class StubApi:
        def __init__(self):
            self.calls: list[tuple] = []

        def __call__(self, method, path, *, body=None, query=None, expect_json=True):
            self.calls.append((method, path, body, query))
            key = (method, path)
            if key not in responses:
                raise AssertionError(
                    f"unexpected api call: {method} {path} (no stub registered)"
                )
            r = responses[key]
            if isinstance(r, Exception):
                raise r
            return r

    return StubApi()
# --------------------------------------------------------------------------
# context → (workflow_name, job_name, event) parser
# --------------------------------------------------------------------------
def test_parse_context_standard_shape(lint_module):
"""`<workflow_name> / <job_name> (<event>)` round-trips cleanly."""
parsed = lint_module.parse_context(
"Secret scan / Scan diff for credential-shaped strings (pull_request)"
)
assert parsed == (
"Secret scan",
"Scan diff for credential-shaped strings",
"pull_request",
)
def test_parse_context_with_slash_in_job_name(lint_module):
    """Job names CAN contain ' / ' literally in Gitea; the parser must
    split on the FIRST ' / ' and keep everything up to the trailing
    ' (event)' suffix as the job name."""
    parsed = lint_module.parse_context(
        "ci / setup / install-deps (pull_request)"
    )
    # Workflow = first segment; job = everything between the first ' / '
    # and the trailing ' (event)'. Pragmatic split: the workflow name is
    # `name:` from the YAML, so multi-slash workflow names are unlikely;
    # treat the first ' / ' as the divider.
    assert parsed[0] == "ci"
    assert parsed[1] == "setup / install-deps"
    assert parsed[2] == "pull_request"
def test_parse_context_unparseable_returns_none(lint_module):
"""Malformed context string → None so the caller can warn-and-skip."""
assert lint_module.parse_context("garbage no event marker") is None
assert lint_module.parse_context("") is None
# --------------------------------------------------------------------------
# workflow-name → file resolution
# --------------------------------------------------------------------------
def test_resolve_workflow_file_matches_name_attr(lint_module):
"""Resolution scans workflows/*.yml for a `name:` matching the
context's workflow_name. Filename is NOT the source of truth — the
`name:` attribute is, because Gitea's context format uses
`name:` (not the filename).
"""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"some-file.yml",
"name: Secret scan\non:\n pull_request:\n types: [opened]\njobs:\n scan:\n runs-on: ubuntu-latest\n",
)
p = lint_module.resolve_workflow_file("Secret scan")
assert p is not None
assert p.name == "some-file.yml"
def test_resolve_workflow_file_returns_none_when_missing(lint_module):
"""No matching `name:` found → None."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"other.yml",
"name: Other\non:\n pull_request: {}\njobs:\n x:\n runs-on: ubuntu-latest\n",
)
assert lint_module.resolve_workflow_file("Secret scan") is None
# --------------------------------------------------------------------------
# paths-filter detection
# --------------------------------------------------------------------------
def test_workflow_has_no_paths_filter_clean(lint_module):
"""No paths/paths-ignore → returns empty list (no findings)."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"clean.yml",
"name: Clean\n"
"on:\n"
" pull_request:\n"
" types: [opened, synchronize]\n"
"jobs:\n"
" x:\n"
" runs-on: ubuntu-latest\n",
)
findings = lint_module.detect_paths_filters(
Path(lint_module.WORKFLOWS_DIR) / "clean.yml"
)
assert findings == []
def test_workflow_with_pull_request_paths_filter_detected(lint_module):
"""`on.pull_request.paths` → ONE finding naming pull_request + paths."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"bad.yml",
"name: Bad\n"
"on:\n"
" pull_request:\n"
" paths: ['**.go', 'workspace/**']\n"
"jobs:\n"
" x:\n"
" runs-on: ubuntu-latest\n",
)
findings = lint_module.detect_paths_filters(
Path(lint_module.WORKFLOWS_DIR) / "bad.yml"
)
assert len(findings) == 1
f = findings[0]
assert "pull_request" in f
assert "paths" in f
assert "**.go" in f or "workspace/**" in f # filter content surfaced
def test_workflow_with_paths_ignore_filter_detected(lint_module):
"""`on.pull_request.paths-ignore` → finding naming paths-ignore.
paths-ignore is the SAME class of defect: a docs-only PR (that
matches the ignore pattern) silently won't fire the workflow, and the
required context stays pending.
"""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"bad.yml",
"name: Bad\n"
"on:\n"
" pull_request:\n"
" paths-ignore: ['docs/**']\n"
"jobs:\n"
" x:\n"
" runs-on: ubuntu-latest\n",
)
findings = lint_module.detect_paths_filters(
Path(lint_module.WORKFLOWS_DIR) / "bad.yml"
)
assert len(findings) == 1
assert "paths-ignore" in findings[0]
def test_workflow_with_push_paths_filter_detected(lint_module):
"""`on.push.paths` → also a finding. A required check on a PR is
typically `(pull_request)`-event, but a workflow may ALSO have a
push trigger; a paths filter on the push side affects the same
workflow file, and a future PR might add `paths:` to the wrong
event-branch and trip the gate. Surface all paths-filter sites.
"""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"bad.yml",
"name: Bad\n"
"on:\n"
" pull_request:\n"
" types: [opened]\n"
" push:\n"
" branches: [main]\n"
" paths: ['**.py']\n"
"jobs:\n"
" x:\n"
" runs-on: ubuntu-latest\n",
)
findings = lint_module.detect_paths_filters(
Path(lint_module.WORKFLOWS_DIR) / "bad.yml"
)
assert len(findings) == 1
assert "push" in findings[0]
assert "paths" in findings[0]
def test_workflow_with_both_paths_and_paths_ignore_two_findings(lint_module):
"""Both filters under one event → two findings (one per offending
key). Test ensures the detector doesn't short-circuit after the
first."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"bad.yml",
"name: Bad\n"
"on:\n"
" pull_request:\n"
" paths: ['**.go']\n"
" paths-ignore: ['docs/**']\n"
"jobs:\n"
" x:\n"
" runs-on: ubuntu-latest\n",
)
findings = lint_module.detect_paths_filters(
Path(lint_module.WORKFLOWS_DIR) / "bad.yml"
)
assert len(findings) == 2
def test_workflow_with_on_shorthand_string_passes(lint_module):
"""`on: pull_request` (string shorthand, no sub-keys) cannot have a
paths filter — detector treats it as clean."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"clean.yml",
"name: Clean\non: pull_request\njobs:\n x:\n runs-on: ubuntu-latest\n",
)
findings = lint_module.detect_paths_filters(
Path(lint_module.WORKFLOWS_DIR) / "clean.yml"
)
assert findings == []
def test_workflow_with_on_list_shorthand_passes(lint_module):
"""`on: [pull_request, push]` (list shorthand) cannot carry filters
either — clean."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"clean.yml",
"name: Clean\non: [pull_request, push]\njobs:\n x:\n runs-on: ubuntu-latest\n",
)
findings = lint_module.detect_paths_filters(
Path(lint_module.WORKFLOWS_DIR) / "clean.yml"
)
assert findings == []
def test_workflow_on_event_with_null_value_passes(lint_module):
"""`pull_request:` with no body (None / null) is event-shorthand —
no filter possible."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"clean.yml",
"name: Clean\non:\n pull_request:\n push:\n branches: [main]\njobs:\n x:\n runs-on: ubuntu-latest\n",
)
findings = lint_module.detect_paths_filters(
Path(lint_module.WORKFLOWS_DIR) / "clean.yml"
)
assert findings == []
# --------------------------------------------------------------------------
# End-to-end lint (main) — required-checks fan-out
# --------------------------------------------------------------------------
def test_no_required_workflows_succeeds(lint_module, monkeypatch, capsys):
"""Empty status_check_contexts → exit 0, no findings reported."""
stub = _make_stub_api({
("GET", "/repos/owner/repo/branch_protections/main"): (
200,
{"status_check_contexts": []},
),
})
monkeypatch.setattr(lint_module, "api", stub)
rc = lint_module.run()
assert rc == 0
out = capsys.readouterr().out
assert "no required contexts" in out.lower() or "0 required" in out.lower()
def test_required_workflow_no_paths_passes(lint_module, monkeypatch, capsys):
"""A required workflow with no paths filter → exit 0."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"secret-scan.yml",
"name: Secret scan\non:\n pull_request:\n types: [opened]\njobs:\n scan:\n runs-on: ubuntu-latest\n",
)
stub = _make_stub_api({
("GET", "/repos/owner/repo/branch_protections/main"): (
200,
{
"status_check_contexts": [
"Secret scan / scan (pull_request)",
]
},
),
})
monkeypatch.setattr(lint_module, "api", stub)
rc = lint_module.run()
assert rc == 0
def test_required_workflow_with_paths_filter_fails(
lint_module, monkeypatch, capsys
):
"""A required workflow that has `paths:` filter → exit 1 + error
names the offending workflow + the filter."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"secret-scan.yml",
"name: Secret scan\n"
"on:\n"
" pull_request:\n"
" paths: ['**.go']\n"
"jobs:\n"
" scan:\n"
" runs-on: ubuntu-latest\n",
)
stub = _make_stub_api({
("GET", "/repos/owner/repo/branch_protections/main"): (
200,
{"status_check_contexts": ["Secret scan / scan (pull_request)"]},
),
})
monkeypatch.setattr(lint_module, "api", stub)
rc = lint_module.run()
assert rc == 1
out = capsys.readouterr().out
assert "secret-scan.yml" in out
assert "Secret scan" in out
assert "paths" in out
assert "::error::" in out
def test_required_workflow_with_paths_ignore_fails(
lint_module, monkeypatch, capsys
):
"""Same defect class for `paths-ignore` — exit 1, named."""
_write_workflow(
lint_module.WORKFLOWS_DIR,
"sop-tier-check.yml",
"name: sop-tier-check\n"
"on:\n"
" pull_request_target:\n"
" paths-ignore: ['docs/**']\n"
"jobs:\n"
" tier-check:\n"
" runs-on: ubuntu-latest\n",
)
stub = _make_stub_api({
("GET", "/repos/owner/repo/branch_protections/main"): (
200,
{
"status_check_contexts": [
"sop-tier-check / tier-check (pull_request_target)"
]
},
),
})
monkeypatch.setattr(lint_module, "api", stub)
rc = lint_module.run()
assert rc == 1
out = capsys.readouterr().out
assert "sop-tier-check.yml" in out
assert "paths-ignore" in out
def test_unknown_required_context_warns_not_fails(
lint_module, monkeypatch, capsys
):
"""Required context with no matching workflow file → warn, don't
fail. This is gracefully bounded — the lint's mandate is paths-filter
detection, not orphaned-context detection (`ci-required-drift` is the
canonical detector for that).
"""
# No workflows written → all required contexts will be unresolved.
stub = _make_stub_api({
("GET", "/repos/owner/repo/branch_protections/main"): (
200,
{
"status_check_contexts": [
"Mystery / job (pull_request)",
]
},
),
})
monkeypatch.setattr(lint_module, "api", stub)
rc = lint_module.run()
assert rc == 0 # warn-not-fail
out = capsys.readouterr().out
assert "::warning::" in out
assert "Mystery" in out
def test_multi_required_one_bad_one_good_fails(
    lint_module, monkeypatch, capsys
):
    """Two required contexts; one workflow is bad. Lint still fails
    (one defect is enough) and the error names ONLY the bad workflow."""
    _write_workflow(
        lint_module.WORKFLOWS_DIR,
        "good.yml",
        "name: Good\non:\n  pull_request:\n    types: [opened]\njobs:\n  x:\n    runs-on: ubuntu-latest\n",
    )
    _write_workflow(
        lint_module.WORKFLOWS_DIR,
        "bad.yml",
        "name: Bad\n"
        "on:\n"
        "  pull_request:\n"
        "    paths: ['src/**']\n"
        "jobs:\n  x:\n    runs-on: ubuntu-latest\n",
    )
    stub = _make_stub_api({
        ("GET", "/repos/owner/repo/branch_protections/main"): (
            200,
            {
                "status_check_contexts": [
                    "Good / x (pull_request)",
                    "Bad / x (pull_request)",
                ]
            },
        ),
    })
    monkeypatch.setattr(lint_module, "api", stub)
    rc = lint_module.run()
    assert rc == 1
    out = capsys.readouterr().out
    assert "bad.yml" in out
    assert "::error::" in out
    # `good.yml` may appear in a "checked" notice, but it must not be
    # named as offending on any ::error:: line.
    error_lines = [ln for ln in out.split("\n") if ln.startswith("::error::")]
    for ln in error_lines:
        assert "good.yml" not in ln
def test_protection_403_treated_as_skip(lint_module, monkeypatch, capsys):
"""If the token can't read branch_protections (HTTP 403), exit 0
with a clear ::error::-but-non-fatal note. Same scope-fallback shape
as ci-required-drift.py per the precedent.
Rationale: if the lint workflow itself can't read protection, the PR
can't make THIS state worse (a paths-filter PR was already addable
without the lint). Better to surface a token-scope problem loudly
than to red-X every PR until the token is fixed.
"""
stub = _make_stub_api({
("GET", "/repos/owner/repo/branch_protections/main"): (
lint_module.ApiError(
"GET /repos/owner/repo/branch_protections/main → HTTP 403: forbidden"
)
),
})
monkeypatch.setattr(lint_module, "api", stub)
rc = lint_module.run()
assert rc == 0
err = capsys.readouterr().err
assert "::error::" in err
assert "403" in err
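The two behaviours these tests pin down, context parsing and paths-filter detection, can be illustrated with a short sketch. It is not the deleted `lint-required-no-paths.py`; the regular expression and the finding strings are assumptions chosen to satisfy the assertions above. To stay stdlib-only, this `detect_paths_filters` takes the already-parsed value of the workflow's `on:` block rather than a file path, which differs from the signature the tests call. A caller loading YAML with PyYAML would also need to account for an unquoted `on` key being read as the boolean `True`.

```python
import re

# Lazy workflow group + greedy job group => split on the FIRST ' / '.
_CONTEXT_RE = re.compile(
    r"^(?P<workflow>.+?) / (?P<job>.+) \((?P<event>[^()]+)\)$"
)


def parse_context(context):
    """Split '<workflow> / <job> (<event>)' into a 3-tuple, or None."""
    m = _CONTEXT_RE.match(context)
    if m is None:
        return None
    return m.group("workflow"), m.group("job"), m.group("event")


def detect_paths_filters(on_value):
    """List every paths / paths-ignore filter under any event.

    on_value is the parsed `on:` block: a string, a list, or a mapping
    of event name to config (hypothetical input shape for this sketch).
    """
    findings = []
    if not isinstance(on_value, dict):
        # String or list shorthand cannot carry filters.
        return findings
    for event, config in on_value.items():
        if not isinstance(config, dict):
            # e.g. `pull_request:` with a null body.
            continue
        # Check both keys so one finding does not hide the other.
        for key in ("paths", "paths-ignore"):
            if key in config:
                findings.append(f"on.{event}.{key}: {config[key]!r}")
    return findings
```

Reporting one finding per offending key, rather than stopping at the first, matches `test_workflow_with_both_paths_and_paths_ignore_two_findings`.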
-413
@@ -1,413 +0,0 @@
"""Tests for `.gitea/scripts/lint-workflow-yaml.py` — Gitea-1.22.6-hostile shape lint.
Hard-gate (Tier-2) lint that catches workflow YAML shapes Gitea 1.22.6
silently rejects, so they never reach `main`. The six anti-patterns are
documented in saved memory; this test suite is the structural enforcement.
Per-rule positive (anti-pattern present -> exit 1) + negative (clean -> exit 0)
cases, plus a multi-file collision case and an aggregation case.
Run:
python3 -m pytest tests/test_lint_workflow_yaml.py -v
Dependencies: stdlib + PyYAML. No network.
Cross-links:
- feedback_gitea_workflow_dispatch_inputs_unsupported (rule 1)
- internal task #81 (rule 2 — workflow_run unsupported)
- feedback_workflow_name_with_slash_breaks_parsing (rule 3, if filed)
- feedback_gitea_cross_repo_uses_blocked (rule 5)
- feedback_act_runner_github_server_url (rule 6)
- feedback_smoke_test_vendor_truth_not_shape_match (test-shape rule)
"""
from __future__ import annotations
import subprocess
import sys
import textwrap
from pathlib import Path
import pytest # noqa: F401 (declares the dep)
REPO_ROOT = Path(__file__).resolve().parents[1]
SCRIPT = REPO_ROOT / ".gitea" / "scripts" / "lint-workflow-yaml.py"
def _run_lint(workflow_dir: Path) -> subprocess.CompletedProcess:
"""Invoke the lint as a subprocess against an isolated workflow dir."""
return subprocess.run(
[sys.executable, str(SCRIPT), "--workflow-dir", str(workflow_dir)],
capture_output=True,
text=True,
)
def _write(workflow_dir: Path, name: str, content: str) -> Path:
"""Write a workflow YAML fixture and return its path."""
workflow_dir.mkdir(parents=True, exist_ok=True)
p = workflow_dir / name
p.write_text(textwrap.dedent(content).lstrip())
return p
# ---------------------------------------------------------------------------
# Rule 1 — workflow_dispatch.inputs (Gitea 1.22.6 parser rejects)
# ---------------------------------------------------------------------------
WD_INPUTS_BAD = """
name: bad-wd-inputs
on:
workflow_dispatch:
inputs:
version:
description: "version"
required: true
type: string
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo hi
"""
WD_INPUTS_OK = """
name: ok-wd-no-inputs
on:
workflow_dispatch:
push:
branches: [main]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo hi
"""
def test_rule1_workflow_dispatch_inputs_detects_violation(tmp_path):
_write(tmp_path, "bad.yml", WD_INPUTS_BAD)
r = _run_lint(tmp_path)
assert r.returncode == 1
assert "workflow_dispatch.inputs" in r.stdout
assert "bad.yml" in r.stdout
def test_rule1_workflow_dispatch_inputs_passes_when_absent(tmp_path):
_write(tmp_path, "ok.yml", WD_INPUTS_OK)
r = _run_lint(tmp_path)
assert r.returncode == 0, f"stdout={r.stdout}\nstderr={r.stderr}"
# ---------------------------------------------------------------------------
# Rule 2 — workflow_run event (not supported on Gitea 1.22.6)
# ---------------------------------------------------------------------------
WF_RUN_BAD = """
name: bad-workflow-run
on:
workflow_run:
workflows: ["upstream"]
types: [completed]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo hi
"""
WF_RUN_OK = """
name: ok-no-workflow-run
on:
push:
branches: [main]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo hi
"""
def test_rule2_workflow_run_event_detects_violation(tmp_path):
_write(tmp_path, "bad.yml", WF_RUN_BAD)
r = _run_lint(tmp_path)
assert r.returncode == 1
assert "workflow_run" in r.stdout
assert "bad.yml" in r.stdout
def test_rule2_workflow_run_event_passes_when_absent(tmp_path):
_write(tmp_path, "ok.yml", WF_RUN_OK)
r = _run_lint(tmp_path)
assert r.returncode == 0, f"stdout={r.stdout}\nstderr={r.stderr}"
# ---------------------------------------------------------------------------
# Rule 3 — name: contains "/" (breaks "<workflow> / <job> (<event>)" parsing)
# ---------------------------------------------------------------------------
NAME_SLASH_BAD = """
name: ci / build
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo hi
"""
NAME_SLASH_OK = """
name: ci-build
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo hi
"""
def test_rule3_name_with_slash_detects_violation(tmp_path):
_write(tmp_path, "bad.yml", NAME_SLASH_BAD)
r = _run_lint(tmp_path)
assert r.returncode == 1
assert "name" in r.stdout.lower()
assert "/" in r.stdout
assert "bad.yml" in r.stdout
def test_rule3_name_with_slash_passes_when_absent(tmp_path):
_write(tmp_path, "ok.yml", NAME_SLASH_OK)
r = _run_lint(tmp_path)
assert r.returncode == 0, f"stdout={r.stdout}\nstderr={r.stderr}"
# ---------------------------------------------------------------------------
# Rule 4 — name collision across files (cross-file)
# ---------------------------------------------------------------------------
COLLISION_A = """
name: shared-name
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo a
"""
COLLISION_B = """
name: shared-name
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo b
"""
DISTINCT_A = """
name: name-a
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo a
"""
DISTINCT_B = """
name: name-b
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo b
"""
def test_rule4_name_collision_across_two_files_detects_violation(tmp_path):
_write(tmp_path, "a.yml", COLLISION_A)
_write(tmp_path, "b.yml", COLLISION_B)
r = _run_lint(tmp_path)
assert r.returncode == 1
assert ("collision" in r.stdout.lower()) or ("duplicate" in r.stdout.lower())
assert "shared-name" in r.stdout
def test_rule4_name_collision_passes_when_names_distinct(tmp_path):
_write(tmp_path, "a.yml", DISTINCT_A)
_write(tmp_path, "b.yml", DISTINCT_B)
r = _run_lint(tmp_path)
assert r.returncode == 0, f"stdout={r.stdout}\nstderr={r.stderr}"
# ---------------------------------------------------------------------------
# Rule 5 — cross-repo `uses: org/repo/...@ref` (blocked on 1.22.6)
# ---------------------------------------------------------------------------
CROSS_REPO_BAD = """
name: bad-cross-repo
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- uses: molecule-ai/molecule-ci/.gitea/actions/audit-force-merge@main
"""
# actions/checkout — bare `org/repo@ref` form — allowed. Rule 5 targets
# `org/repo/SUBPATH@ref` cross-repo composite/reusable references because
# only those resolve through `[actions].DEFAULT_ACTIONS_URL`+org-suspended-host.
CROSS_REPO_OK = """
name: ok-no-cross-repo
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd
- run: echo hi
"""
def test_rule5_cross_repo_uses_detects_violation(tmp_path):
_write(tmp_path, "bad.yml", CROSS_REPO_BAD)
r = _run_lint(tmp_path)
assert r.returncode == 1
assert ("cross-repo" in r.stdout.lower()) or ("uses" in r.stdout.lower())
assert "bad.yml" in r.stdout
def test_rule5_cross_repo_uses_passes_when_only_actions_org(tmp_path):
_write(tmp_path, "ok.yml", CROSS_REPO_OK)
r = _run_lint(tmp_path)
assert r.returncode == 0, f"stdout={r.stdout}\nstderr={r.stderr}"
# ---------------------------------------------------------------------------
# Rule 6 — GITHUB_SERVER_URL heuristic (warn-not-fail per halt-condition 3)
# ---------------------------------------------------------------------------
GH_API_REF_NO_SERVER = """
name: warn-server-url
on: [push]
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: curl https://api.github.com/repos/foo/bar
"""
GH_API_REF_WITH_SERVER = """
name: ok-server-url-set
on: [push]
env:
GITHUB_SERVER_URL: https://git.moleculesai.app
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: curl https://api.github.com/repos/foo/bar
"""
def test_rule6_github_server_url_missing_is_warning_not_fatal(tmp_path):
"""Heuristic rule — emits warning but does NOT exit 1.
Per halt-condition 3: heuristic may false-positive (current main has 3:
OCI label + jq-release URL refs). Downgrade to warn-not-fail.
"""
_write(tmp_path, "warn.yml", GH_API_REF_NO_SERVER)
r = _run_lint(tmp_path)
assert r.returncode == 0
combined = (r.stdout + r.stderr).lower()
assert ("github_server_url" in combined) or ("::warning" in combined)
def test_rule6_github_server_url_present_no_warning(tmp_path):
_write(tmp_path, "ok.yml", GH_API_REF_WITH_SERVER)
r = _run_lint(tmp_path)
assert r.returncode == 0
# No warning emitted on either stream (server URL is set)
assert "::warning" not in (r.stdout + r.stderr)
# ---------------------------------------------------------------------------
# Aggregation — single file with multiple anti-patterns
# ---------------------------------------------------------------------------
MULTI_VIOLATIONS = """
name: ci / multi
on:
workflow_dispatch:
inputs:
v:
type: string
workflow_run:
workflows: [up]
types: [completed]
jobs:
x:
runs-on: ubuntu-latest
steps:
- uses: molecule-ai/molecule-ci/.gitea/actions/x@main
"""
def test_all_violations_aggregated_single_file(tmp_path):
_write(tmp_path, "multi.yml", MULTI_VIOLATIONS)
r = _run_lint(tmp_path)
assert r.returncode == 1
out = r.stdout
# All four FATAL rules should be reported (1, 2, 3, 5)
assert "workflow_dispatch.inputs" in out
assert "workflow_run" in out
assert "/" in out # rule 3 surfaces the slash
assert ("cross-repo" in out.lower()) or ("uses" in out.lower())
# ---------------------------------------------------------------------------
# Empty-dir / no-workflows edge case
# ---------------------------------------------------------------------------
def test_no_workflows_exits_zero(tmp_path):
r = _run_lint(tmp_path)
assert r.returncode == 0
# ---------------------------------------------------------------------------
# Vendor-truth: rule 1 catches the exact 2026-05-11 publish-runtime.yml shape
# ---------------------------------------------------------------------------
# The exact YAML shape from feedback_gitea_workflow_dispatch_inputs_unsupported
# that caused publish-runtime-v1.0.0 to silently freeze PyPI at 0.1.129 for ~24h.
PUBLISH_RUNTIME_VENDOR_TRUTH = """
name: publish-runtime
on:
push:
tags: ['runtime-v*']
workflow_dispatch:
inputs:
version:
description: "Version to publish (e.g. 0.1.6). Required for manual dispatch."
required: true
type: string
jobs:
x:
runs-on: ubuntu-latest
steps:
- run: echo hi
"""
def test_rule1_catches_2026_05_11_publish_runtime_regression(tmp_path):
"""Vendor-truth fixture: the exact YAML shape that froze PyPI for 24h."""
_write(tmp_path, "publish-runtime.yml", PUBLISH_RUNTIME_VENDOR_TRUTH)
r = _run_lint(tmp_path)
assert r.returncode == 1, (
"Lint must catch the 2026-05-11 publish-runtime regression "
"(memory: feedback_gitea_workflow_dispatch_inputs_unsupported)."
f"\nstdout={r.stdout}"
)
+1 -16
@@ -35,22 +35,7 @@ RUN CGO_ENABLED=0 GOOS=linux go build \
-o /memory-plugin ./cmd/memory-plugin-postgres
FROM alpine:3.20@sha256:c64c687cbea9300178b30c95835354e34c4e4febc4badfe27102879de0483b5e
# docker-cli is required by internal/provisioner/localbuild.go which
# shells out via exec.Command("docker", "image", "inspect"/"build"/"tag", ...)
# whenever Resolve().Mode == RegistryModeLocal — which is the permanent
# mode post-2026-05-06 (Molecule-AI GitHub org suspended → GHCR
# unreachable → MOLECULE_IMAGE_REGISTRY unset → registry_mode.go falls
# through to RegistryModeLocal). Without docker-cli here the platform
# fails every workspace re-provision with `local-build: image inspect
# for molecule-local/workspace-template-<runtime>:<sha> failed
# (exec: "docker": executable file not found in $PATH)` and the
# workspace stays status=failed. The Docker SOCKET is already mounted
# (entrypoint.sh adds the platform user to the docker group) — only
# the CLI binary was missing. Caught after sdk-lead + CP-QA went down
# this way during the MiniMax-switch attempt + after-Class-A audit.
# Related: Task #194 / Issue #63 (local-build path added);
# `feedback_workspace_image_ghcr_dead`.
RUN apk add --no-cache ca-certificates docker-cli git tzdata wget
RUN apk add --no-cache ca-certificates git tzdata wget
COPY --from=builder /platform /platform
COPY --from=builder /memory-plugin /memory-plugin
COPY workspace-server/migrations /migrations
@@ -501,18 +501,8 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
// to correctly route delivery-confirmed responses (where the agent completed
// the work but the TCP connection dropped before the full body was received)
// to success instead of failure (#159).
//
// For non-2xx responses (server explicitly rejected with 3xx+), preserve
// resp.StatusCode in the proxyA2AError.Status so isTransientProxyError
// returns false — a server-authored rejection is not a transient transport
// error and must not be retried. Only 2xx body-read errors keep Status=502
// (the agent completed work but the TCP layer dropped the response).
errStatus := http.StatusBadGateway
if resp.StatusCode >= 300 {
errStatus = resp.StatusCode
}
return resp.StatusCode, respBody, &proxyA2AError{
Status: errStatus,
Status: http.StatusBadGateway,
Response: gin.H{
"error": "failed to read agent response",
"delivery_confirmed": deliveryConfirmed,
@@ -520,21 +510,6 @@ func (h *WorkspaceHandler) proxyA2ARequest(ctx context.Context, workspaceID stri
}
}
// 2xx with empty body: the agent completed the request but returned no content.
// An A2A agent must always return a JSON body; empty means the agent is
// broken or the connection closed before any body bytes were written.
// Return a proxyA2AError so executeDelegation routes this to failure rather
// than silently marking it as completed with a nil body.
// logA2ASuccess is intentionally NOT called here — delivery was not confirmed.
if resp.StatusCode >= 200 && resp.StatusCode < 300 && len(respBody) == 0 {
log.Printf("ProxyA2A: agent %s returned %d with empty body — treating as failure",
workspaceID, resp.StatusCode)
return resp.StatusCode, respBody, &proxyA2AError{
Status: resp.StatusCode,
Response: gin.H{"error": "agent returned empty response body"},
}
}
if logActivity {
h.logA2ASuccess(ctx, workspaceID, callerID, body, respBody, a2aMethod, resp.StatusCode, durationMs)
}
@@ -410,7 +410,7 @@ func extractToolTrace(respBody []byte) json.RawMessage {
return nil
}
trace, ok := meta["tool_trace"]
if !ok || len(trace) == 0 || string(trace) == "null" || string(trace) == "[]" {
if !ok || len(trace) == 0 {
return nil
}
return trace
@@ -1,243 +0,0 @@
package handlers
import (
"encoding/json"
"testing"
)
// ─────────────────────────────────────────────────────────────────────────────
// nilIfEmpty tests
// ─────────────────────────────────────────────────────────────────────────────
func TestNilIfEmpty_EmptyString(t *testing.T) {
got := nilIfEmpty("")
if got != nil {
t.Errorf("empty string: got %p, want nil", got)
}
}
func TestNilIfEmpty_NonEmptyString(t *testing.T) {
s := "hello"
got := nilIfEmpty(s)
if got == nil {
t.Fatal("non-empty string: got nil, want pointer")
}
if *got != "hello" {
t.Errorf("non-empty string: got %q, want %q", *got, "hello")
}
}
// ─────────────────────────────────────────────────────────────────────────────
// extractToolTrace tests
// ─────────────────────────────────────────────────────────────────────────────
func TestExtractToolTrace_EmptyBody(t *testing.T) {
got := extractToolTrace(nil)
if got != nil {
t.Errorf("nil body: got %v, want nil", got)
}
got = extractToolTrace([]byte{})
if got != nil {
t.Errorf("empty body: got %v, want nil", got)
}
}
func TestExtractToolTrace_InvalidJSON(t *testing.T) {
got := extractToolTrace([]byte("not json"))
if got != nil {
t.Errorf("invalid JSON: got %v, want nil", got)
}
}
func TestExtractToolTrace_NoResultKey(t *testing.T) {
got := extractToolTrace([]byte(`{"error": "oops"}`))
if got != nil {
t.Errorf("no result key: got %v, want nil", got)
}
}
func TestExtractToolTrace_NoMetadataKey(t *testing.T) {
got := extractToolTrace([]byte(`{"result": {"data": {}}}`))
if got != nil {
t.Errorf("no metadata key: got %v, want nil", got)
}
}
func TestExtractToolTrace_NoToolTraceKey(t *testing.T) {
got := extractToolTrace([]byte(`{"result": {"metadata": {}}}`))
if got != nil {
t.Errorf("no tool_trace key: got %v, want nil", got)
}
}
// extractToolTrace decodes tool_trace as a json.RawMessage. A JSON null is
// stored as the literal bytes "null" (len 4), not as a nil slice, so a bare
// len(trace)==0 check lets null through as if it were a trace. (len of a nil
// slice is 0 in Go and never panics.) The fix for mc#669 adds explicit
// string(trace)=="null" and string(trace)=="[]" checks.
func TestExtractToolTrace_NullValue(t *testing.T) {
// JSON null in tool_trace → RawMessage holds "null" → must be treated as absent.
body := []byte(`{"result": {"metadata": {"tool_trace": null}}}`)
got := extractToolTrace(body)
if got != nil {
t.Errorf("null tool_trace: got %v, want nil", got)
}
}
// "[]" unmarshaled into RawMessage is []byte("[]") — not nil, len=2.
// The fix returns nil for [] so empty tool_trace arrays don't surface as traces.
func TestExtractToolTrace_EmptyArray(t *testing.T) {
body := []byte(`{"result": {"metadata": {"tool_trace": []}}}`)
got := extractToolTrace(body)
if got != nil {
t.Errorf("empty array tool_trace: got %v, want nil", got)
}
}
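The null and empty-array cases above hinge on how encoding/json fills a json.RawMessage. The following is a minimal standalone sketch of that behaviour, not the handler itself: `rawTrace` is a hypothetical helper, and the flat body shape is a simplification (the handler under test nests the key under result.metadata).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rawTrace is a hypothetical helper for illustration only. It decodes body
// into a map of raw JSON values and returns the bytes stored under
// "tool_trace", plus whether the key was present.
func rawTrace(body string) (json.RawMessage, bool) {
	var m map[string]json.RawMessage
	if err := json.Unmarshal([]byte(body), &m); err != nil {
		return nil, false
	}
	v, ok := m["tool_trace"]
	return v, ok
}

func main() {
	for _, body := range []string{
		`{"tool_trace": null}`, // key present, value is JSON null
		`{"tool_trace": []}`,   // key present, value is an empty array
		`{}`,                   // key absent
	} {
		v, ok := rawTrace(body)
		// Print presence, byte length, and the raw bytes for each case.
		fmt.Printf("%-22s present=%v len=%d raw=%q\n", body, ok, len(v), string(v))
	}
}
```

Both the null and the empty-array values arrive as non-empty byte slices, which is why a length check alone cannot filter them out.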
func TestExtractToolTrace_ValidNonEmpty(t *testing.T) {
trace := []byte(`[{"name":"search","result":"done"}]`)
body, _ := json.Marshal(map[string]interface{}{
"result": map[string]interface{}{
"metadata": map[string]interface{}{
"tool_trace": json.RawMessage(trace),
},
},
})
got := extractToolTrace(body)
if got == nil {
t.Fatal("valid non-empty trace: got nil, want the trace")
}
if string(got) != string(trace) {
t.Errorf("valid trace: got %s, want %s", got, trace)
}
}
// Document that the CURRENT code (len check) panics on null tool_trace.
// This test exists to signal when PR #669's fix lands: after the fix,
// the defer-recover will NOT trigger (panic goes away) and the
// post-recover assertion runs. While unfixed: the panic fires and
// ─────────────────────────────────────────────────────────────────────────────
// readUsageMap tests
// ─────────────────────────────────────────────────────────────────────────────
func TestReadUsageMap_NoUsageKey(t *testing.T) {
m := map[string]json.RawMessage{}
_, _, ok := readUsageMap(m)
if ok {
t.Error("no usage key: ok should be false")
}
}
func TestReadUsageMap_InvalidUsageJSON(t *testing.T) {
m := map[string]json.RawMessage{"usage": json.RawMessage(`"not an object"`)}
_, _, ok := readUsageMap(m)
if ok {
t.Error("invalid usage JSON: ok should be false")
}
}
func TestReadUsageMap_ZeroUsage(t *testing.T) {
m := map[string]json.RawMessage{"usage": json.RawMessage(`{"input_tokens": 0, "output_tokens": 0}`)}
_, _, ok := readUsageMap(m)
if ok {
t.Error("zero usage: ok should be false")
}
}
func TestReadUsageMap_InputOnly(t *testing.T) {
m := map[string]json.RawMessage{"usage": json.RawMessage(`{"input_tokens": 100, "output_tokens": 0}`)}
in, out, ok := readUsageMap(m)
if !ok {
t.Fatal("input-only usage: ok should be true")
}
if in != 100 {
t.Errorf("input tokens: got %d, want 100", in)
}
if out != 0 {
t.Errorf("output tokens: got %d, want 0", out)
}
}
func TestReadUsageMap_BothTokens(t *testing.T) {
m := map[string]json.RawMessage{"usage": json.RawMessage(`{"input_tokens": 500, "output_tokens": 200}`)}
in, out, ok := readUsageMap(m)
if !ok {
t.Fatal("both tokens: ok should be true")
}
if in != 500 || out != 200 {
t.Errorf("tokens: got (%d, %d), want (500, 200)", in, out)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// parseUsageFromA2AResponse tests
// ─────────────────────────────────────────────────────────────────────────────
func TestParseUsageFromA2AResponse_Empty(t *testing.T) {
in, out := parseUsageFromA2AResponse(nil)
if in != 0 || out != 0 {
t.Errorf("nil: got (%d, %d), want (0, 0)", in, out)
}
in, out = parseUsageFromA2AResponse([]byte{})
if in != 0 || out != 0 {
t.Errorf("empty: got (%d, %d), want (0, 0)", in, out)
}
}
func TestParseUsageFromA2AResponse_InvalidJSON(t *testing.T) {
in, out := parseUsageFromA2AResponse([]byte("not json"))
if in != 0 || out != 0 {
t.Errorf("invalid JSON: got (%d, %d), want (0, 0)", in, out)
}
}
func TestParseUsageFromA2AResponse_NoResultNoUsage(t *testing.T) {
in, out := parseUsageFromA2AResponse([]byte(`{"id": 1}`))
if in != 0 || out != 0 {
t.Errorf("no result/usage: got (%d, %d), want (0, 0)", in, out)
}
}
func TestParseUsageFromA2AResponse_ResultUsage(t *testing.T) {
body := []byte(`{"result": {"usage": {"input_tokens": 42, "output_tokens": 7}}}`)
in, out := parseUsageFromA2AResponse(body)
if in != 42 || out != 7 {
t.Errorf("result usage: got (%d, %d), want (42, 7)", in, out)
}
}
func TestParseUsageFromA2AResponse_ResultUsageWinsOverTopLevel(t *testing.T) {
// JSON-RPC result.usage takes precedence over top-level usage.
body := []byte(`{"result": {"usage": {"input_tokens": 42, "output_tokens": 7}}, "usage": {"input_tokens": 99, "output_tokens": 99}}`)
in, out := parseUsageFromA2AResponse(body)
if in != 42 || out != 7 {
t.Errorf("result usage should win: got (%d, %d), want (42, 7)", in, out)
}
}
func TestParseUsageFromA2AResponse_TopLevelFallback(t *testing.T) {
// Direct (non-JSON-RPC) response: usage at top level.
body := []byte(`{"usage": {"input_tokens": 11, "output_tokens": 13}}`)
in, out := parseUsageFromA2AResponse(body)
if in != 11 || out != 13 {
t.Errorf("top-level usage: got (%d, %d), want (11, 13)", in, out)
}
}
func TestParseUsageFromA2AResponse_ZeroValuesInResult(t *testing.T) {
// Zero usage in result.usage: returns (0, 0) — no panic.
body := []byte(`{"result": {"usage": {"input_tokens": 0, "output_tokens": 0}}}`)
in, out := parseUsageFromA2AResponse(body)
if in != 0 || out != 0 {
t.Errorf("zero usage: got (%d, %d), want (0, 0)", in, out)
}
}
func TestParseUsageFromA2AResponse_MissingTokensInUsageObject(t *testing.T) {
// usage object exists but tokens are absent — returns (0, 0).
body := []byte(`{"result": {"usage": {"other_field": 5}}}`)
in, out := parseUsageFromA2AResponse(body)
if in != 0 || out != 0 {
t.Errorf("missing tokens: got (%d, %d), want (0, 0)", in, out)
}
}
@@ -7,7 +7,6 @@ import (
"go/parser"
"go/token"
"testing"
"time"
"github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/models"
@@ -72,8 +71,6 @@ func TestPreflight_ContainerRunning_ReturnsNil(t *testing.T) {
// triggers the offline-flip + WORKSPACE_OFFLINE broadcast + async restart.
// This is the load-bearing case — saves the caller 2-30s of network timeout.
func TestPreflight_ContainerNotRunning_StructuredFastFail(t *testing.T) {
const wsID = "ws-dead-456"
resetRestartStatesFor(wsID)
mock := setupTestDB(t)
_ = setupTestRedis(t)
stub := &preflightLocalProv{running: false, err: nil}
@@ -82,14 +79,14 @@ func TestPreflight_ContainerNotRunning_StructuredFastFail(t *testing.T) {
// Expect the offline-flip UPDATE.
mock.ExpectExec(`UPDATE workspaces SET status =`).
WithArgs(models.StatusOffline, wsID).
WithArgs(models.StatusOffline, "ws-dead-456").
WillReturnResult(sqlmock.NewResult(0, 1))
// Broadcaster's INSERT INTO structure_events fires too — best-effort
// log entry for the WORKSPACE_OFFLINE event. Match permissively.
mock.ExpectExec(`INSERT INTO structure_events`).
WillReturnResult(sqlmock.NewResult(0, 1))
proxyErr := h.preflightContainerHealth(context.Background(), wsID)
proxyErr := h.preflightContainerHealth(context.Background(), "ws-dead-456")
if proxyErr == nil {
t.Fatal("preflight should return *proxyA2AError when container not running")
}
@@ -110,32 +107,6 @@ func TestPreflight_ContainerNotRunning_StructuredFastFail(t *testing.T) {
// h.broadcaster.RecordAndBroadcast call but not asserted here — the
// real *events.Broadcaster doesn't expose received events for inspection.
// The DB UPDATE expectation is sufficient to pin the offline-flip path.
waitRestartByIDGoroutineIdle(t, wsID)
}
func waitRestartByIDGoroutineIdle(t *testing.T, wsID string) {
t.Helper()
deadline := time.Now().Add(2 * time.Second)
sawState := false
for time.Now().Before(deadline) {
sv, ok := restartStates.Load(wsID)
if ok {
sawState = true
st := sv.(*restartState)
st.mu.Lock()
running := st.running
st.mu.Unlock()
if !running {
resetRestartStatesFor(wsID)
return
}
}
time.Sleep(time.Millisecond)
}
if !sawState {
t.Fatalf("preflight did not start RestartByID goroutine for %s", wsID)
}
t.Fatalf("RestartByID goroutine for %s did not drain before test cleanup", wsID)
}
// TestPreflight_TransientError_FailsSoftAsAlive — IsRunning(true,err): the
@@ -6,7 +6,6 @@ import (
"log"
"net/http"
"os"
"runtime"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
@@ -163,7 +162,7 @@ func (h *DelegationHandler) Delegate(c *gin.Context) {
})
// Fire-and-forget: send A2A in background goroutine
go h.executeDelegation(ctx, sourceID, body.TargetID, delegationID, a2aBody)
go h.executeDelegation(sourceID, body.TargetID, delegationID, a2aBody)
// Broadcast event so canvas shows delegation in real-time
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationSent), sourceID, map[string]interface{}{
@@ -309,50 +308,21 @@ func insertDelegationRow(ctx context.Context, c *gin.Context, sourceID string, b
// to land a fresh URL in the cache before we try again. Fixes #74 —
// bulk restarts used to produce spurious "failed to reach workspace
// agent" errors when delegations fired within the warm-up window.
var delegationRetryDelay = 8 * time.Second
const delegationRetryDelay = 8 * time.Second
// NB: the log.Printf calls below are load-bearing for the integration test
// surface (delegation_executor_integration_test.go). The test uses a raw TCP
// mock server; without these calls the compiler inlines executeDelegation and
// a subtle stack-sharing race between the inlined body and the test goroutine
// causes the test to hang. The log calls prevent inlining (Go cannot inline
// functions that call the log package). This is a known Go compiler behaviour.
// runtime.LockOSThread() provides an additional hardening: pinning the
// goroutine to a single OS thread eliminates any scheduler-migration races.
// The caller provides ctx (which carries the deadline/budget); no internal
// context.WithTimeout is created here.
// executeDelegation runs the A2A dispatch for a delegation. ctx controls the
// entire lifecycle: its timeout bounds all DB ops, proxy calls, and retries.
// Pass context.Background() when no external deadline applies (e.g. tests).
func (h *DelegationHandler) executeDelegation(ctx context.Context, sourceID, targetID, delegationID string, a2aBody []byte) {
runtime.LockOSThread() // pin to thread; prevents scheduler-migration races in integration tests
func (h *DelegationHandler) executeDelegation(sourceID, targetID, delegationID string, a2aBody []byte) {
ctx, cancel := context.WithTimeout(context.Background(), 30*time.Minute)
defer cancel()
log.Printf("Delegation %s: %s → %s (dispatched)", delegationID, sourceID, targetID)
log.Printf("Delegation %s: step=updating_dispatched_status", delegationID)
// Update status: pending → dispatched
h.updateDelegationStatus(ctx, sourceID, delegationID, "dispatched", "")
log.Printf("Delegation %s: step=broadcasting_dispatched", delegationID)
h.updateDelegationStatus(sourceID, delegationID, "dispatched", "")
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationStatus), sourceID, map[string]interface{}{
"delegation_id": delegationID, "target_id": targetID, "status": "dispatched",
})
log.Printf("Delegation %s: step=proxying_a2a_request", delegationID)
status, respBody, proxyErr := h.workspace.proxyA2ARequest(ctx, targetID, a2aBody, sourceID, true)
log.Printf("Delegation %s: step=proxy_done status=%d bodyLen=%d err=%v", delegationID, status, len(respBody), proxyErr)
// When proxyA2ARequest returns an error but we have a non-empty response body
// with a 2xx status code, the agent completed the work successfully — the error
// is a delivery/transport error (e.g., connection reset after response was
// received). Treat as success: the response body is valid and the work is done.
// This check MUST run before the transient-retry gate so a delivery-confirmed
// partial-body 2xx response is never retried.
if isDeliveryConfirmedSuccess(proxyErr, status, respBody) {
log.Printf("Delegation %s: completed with delivery error (status=%d, respBody=%d bytes, proxyErr=%v) — treating as success",
delegationID, status, len(respBody), proxyErr.Error())
goto handleSuccess
}
// #74: one retry after the reactive URL refresh has had a chance to
// run. The proxyA2ARequest's health-check path on a connection error
@@ -372,10 +342,21 @@ func (h *DelegationHandler) executeDelegation(ctx context.Context, sourceID, tar
}
}
// When proxyA2ARequest returns an error but we have a non-empty response body
// with a 2xx status code, the agent completed the work successfully — the error
// is a delivery/transport error (e.g., connection reset after response was
// received). Treat as success: the response body is valid and the work is done.
// This prevents "retry storms" where the canvas sees error + Restart-workspace
// suggestion even though the delegation actually completed.
if isDeliveryConfirmedSuccess(proxyErr, status, respBody) {
log.Printf("Delegation %s: completed with delivery error (status=%d, respBody=%d bytes, proxyErr=%v) — treating as success",
delegationID, status, len(respBody), proxyErr.Error())
goto handleSuccess
}
if proxyErr != nil {
log.Printf("Delegation %s: step=handling_failure err=%v", delegationID, proxyErr)
log.Printf("Delegation %s: failed — %s", delegationID, proxyErr.Error())
h.updateDelegationStatus(ctx, sourceID, delegationID, "failed", proxyErr.Error())
h.updateDelegationStatus(sourceID, delegationID, "failed", proxyErr.Error())
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, status, error_detail)
@@ -392,27 +373,7 @@ func (h *DelegationHandler) executeDelegation(ctx context.Context, sourceID, tar
return
}
if status >= 200 && status < 300 && len(respBody) == 0 {
errMsg := "workspace agent returned empty response"
log.Printf("Delegation %s: step=handling_failure err=%s", delegationID, errMsg)
h.updateDelegationStatus(ctx, sourceID, delegationID, "failed", errMsg)
if _, err := db.DB.ExecContext(ctx, `
INSERT INTO activity_logs (workspace_id, activity_type, method, source_id, target_id, summary, status, error_detail)
VALUES ($1, 'delegation', 'delegate_result', $2, $3, $4, 'failed', $5)
`, sourceID, sourceID, targetID, "Delegation failed", errMsg); err != nil {
log.Printf("Delegation %s: failed to insert empty-response error log: %v", delegationID, err)
}
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationFailed), sourceID, map[string]interface{}{
"delegation_id": delegationID, "target_id": targetID, "error": errMsg,
})
pushDelegationResultToInbox(ctx, sourceID, delegationID, "failed", "", errMsg)
return
}
handleSuccess:
log.Printf("Delegation %s: step=handle_success status=%d", delegationID, status)
// 202 + {queued: true} means the target was busy and the proxy
// enqueued the request for the next drain tick — NOT a completion.
@@ -426,7 +387,7 @@ handleSuccess:
// the user.
if status == http.StatusAccepted && isQueuedProxyResponse(respBody) {
log.Printf("Delegation %s: target %s busy — queued for drain", delegationID, targetID)
h.updateDelegationStatus(ctx, sourceID, delegationID, "queued", "")
h.updateDelegationStatus(sourceID, delegationID, "queued", "")
// Store delegation_id in response_body so DrainQueueForWorkspace's
// stitch step can find this row by JSON-path key after the queued
// dispatch eventually succeeds. Without the key, the drain finds
@@ -453,7 +414,6 @@ handleSuccess:
responseText := extractResponseText(respBody)
log.Printf("Delegation %s: completed (status=%d, %d chars)", delegationID, status, len(responseText))
log.Printf("Delegation %s: step=inserting_success_log", delegationID)
// Store success (response_body must be JSONB, include delegation_id)
respJSON, _ := json.Marshal(map[string]interface{}{
"text": responseText,
@@ -465,7 +425,6 @@ handleSuccess:
`, sourceID, sourceID, targetID, "Delegation completed ("+textutil.TruncateBytes(responseText, 80)+")", string(respJSON)); err != nil {
log.Printf("Delegation %s: failed to insert success log: %v", delegationID, err)
}
log.Printf("Delegation %s: step=recording_ledger_completed", delegationID)
// RFC #2829 #318: write the ledger row with result_preview FIRST,
// THEN updateDelegationStatus. Order matters: SetStatus has a
@@ -475,9 +434,7 @@ handleSuccess:
// Caught by the local-Postgres integration test in
// delegation_ledger_integration_test.go.
recordLedgerStatus(ctx, delegationID, "completed", "", responseText)
log.Printf("Delegation %s: step=updating_completed_status", delegationID)
h.updateDelegationStatus(ctx, sourceID, delegationID, "completed", "")
log.Printf("Delegation %s: step=broadcasting_complete", delegationID)
h.updateDelegationStatus(sourceID, delegationID, "completed", "")
h.broadcaster.RecordAndBroadcast(ctx, string(events.EventDelegationComplete), sourceID, map[string]interface{}{
"delegation_id": delegationID,
"target_id": targetID,
@@ -485,12 +442,11 @@ handleSuccess:
})
// RFC #2829 PR-2 result-push (see UpdateStatus for rationale).
pushDelegationResultToInbox(ctx, sourceID, delegationID, "completed", responseText, "")
log.Printf("Delegation %s: step=complete", delegationID)
}
// updateDelegationStatus updates the status of a delegation record in activity_logs.
// ctx is used for DB operations; caller controls the timeout/retry budget.
func (h *DelegationHandler) updateDelegationStatus(ctx context.Context, workspaceID, delegationID, status, errorDetail string) {
func (h *DelegationHandler) updateDelegationStatus(workspaceID, delegationID, status, errorDetail string) {
ctx := context.Background()
if _, err := db.DB.ExecContext(ctx, `
UPDATE activity_logs
SET status = $1, error_detail = CASE WHEN $2 = '' THEN error_detail ELSE $2 END
@@ -604,7 +560,7 @@ func (h *DelegationHandler) UpdateStatus(c *gin.Context) {
recordLedgerStatus(ctx, delegationID, "completed", "", body.ResponsePreview)
}
h.updateDelegationStatus(ctx, sourceID, delegationID, body.Status, body.Error)
h.updateDelegationStatus(sourceID, delegationID, body.Status, body.Error)
if body.Status == "completed" {
respJSON, _ := json.Marshal(map[string]interface{}{
@@ -816,3 +772,4 @@ func extractResponseText(body []byte) string {
}
return string(body)
}
@@ -1,535 +0,0 @@
//go:build integration
// +build integration
// delegation_executor_integration_test.go — REAL Postgres integration tests for
// executeDelegation HTTP proxy edge cases that sqlmock cannot cover.
//
// The sqlmock tests in delegation_test.go pin which SQL statements fire but
// cannot detect bugs that depend on the row state AFTER the SQL runs. The
// result_preview-lost bug shipped to staging in PR #2854 because sqlmock tests
// were satisfied with "an UPDATE fired" — none verified the row's preview
// field actually landed. These integration tests close that gap.
//
// How HTTP is mocked
// -----------------
// We use raw TCP listeners (net.Listener) instead of httptest.Server to avoid
// any HTTP-library-level goroutine complexity. The test opens a TCP port,
// serves one HTTP response, then closes the connection. The a2aClient transport
// is overridden with a DialContext that intercepts all dials and redirects to
// the test server's port. No DNS, no TCP handshake overhead, no HTTP library
// goroutines that could block on request-body reads.
//
// Run with:
//
// docker run --rm -d --name pg-integration \
// -e POSTGRES_PASSWORD=test -e POSTGRES_DB=molecule \
// -p 55432:5432 postgres:15-alpine
// sleep 4
// psql ... < workspace-server/migrations/049_delegations.up.sql
// cd workspace-server
// INTEGRATION_DB_URL="postgres://postgres:test@localhost:55432/molecule?sslmode=disable" \
// go test -tags=integration ./internal/handlers/ -run Integration_ExecuteDelegation
//
// CI (.gitea/workflows/handlers-postgres-integration.yml) runs this on
// every PR that touches workspace-server/internal/handlers/**.
package handlers
import (
"context"
"database/sql"
"encoding/json"
"net"
"net/http"
"runtime"
"strconv"
"testing"
"time"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
)
// integrationDB is imported from delegation_ledger_integration_test.go.
// Each test gets a fresh table state.
const testDelegationID = "del-159-test-integration"
const testSourceID = "aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaa"
const testTargetID = "bbbbbbbb-bbbb-bbbb-bbbb-bbbbbbbbbbbb"
// rawHTTPServer starts a TCP listener, serves one HTTP response, and closes.
// It runs in a background goroutine so the test can proceed immediately after
// returning the server URL. The server URL (e.g. "http://127.0.0.1:<port>/")
// is suitable for caching in Redis and passing to executeDelegation.
//
// The server reads HTTP headers using a deadline, then immediately sends the
// response. This prevents the classic TCP deadlock: server blocked reading
// body while client blocked waiting for response.
func rawHTTPServer(t *testing.T, statusCode int, body string) (serverURL string, closeFn func()) {
t.Helper()
// Use ListenTCP with explicit IPv4 to avoid IPv6 mismatch on macOS
// (Listen("tcp", "127.0.0.1:0") might bind ::1 on some systems).
ln, err := net.ListenTCP("tcp4", &net.TCPAddr{IP: net.ParseIP("127.0.0.1"), Port: 0})
if err != nil {
t.Fatalf("rawHTTPServer listen: %v", err)
}
port := ln.Addr().(*net.TCPAddr).Port
serverURL = "http://127.0.0.1:" + strconv.Itoa(port) + "/"
connCh := make(chan net.Conn, 1)
go func() {
conn, err := ln.Accept()
if err != nil {
return
}
connCh <- conn
}()
closeFn = func() {
ln.Close()
}
// Handle in background so we don't block test execution.
// Strategy: read available bytes with a deadline (enough for headers).
// After deadline fires, send the response immediately.
// The kernel discards any unread buffered body bytes when the
// connection closes — harmless.
go func() {
conn := <-connCh
if conn == nil {
return
}
// Read what we can with a 2s deadline. Headers always arrive first.
conn.SetReadDeadline(time.Now().Add(2 * time.Second))
headerBuf := make([]byte, 4096)
for {
n, err := conn.Read(headerBuf)
if n > 0 {
_ = headerBuf[:n]
}
if err != nil {
break
}
}
// Send response and IMMEDIATELY close the connection.
// If we keep it open, the client's request-body writer goroutine
// might block on the socket (waiting for the server to drain the
// body). Closing immediately unblocks it. The client already
// received the response, so the write error is harmless.
resp := buildHTTPResponse(statusCode, body)
conn.Write(resp) //nolint:errcheck
conn.Close()
}()
return serverURL, closeFn
}
// buildHTTPResponse constructs a minimal HTTP/1.1 response.
func buildHTTPResponse(statusCode int, body string) []byte {
statusText := http.StatusText(statusCode)
if statusText == "" {
statusText = "Unknown"
}
header := "HTTP/1.1 " + strconv.Itoa(statusCode) + " " + statusText + "\r\n" +
"Content-Type: application/json\r\n" +
"Content-Length: " + strconv.Itoa(len(body)) + "\r\n" +
"Connection: close\r\n" +
"\r\n"
return []byte(header + body)
}
// setupIntegrationFixtures inserts the rows executeDelegation requires:
// - workspaces: source and target (siblings, parent_id=NULL so CanCommunicate=true)
// - activity_logs: the 'delegate' row that updateDelegationStatus UPDATE will find
// - delegations: the ledger row that recordLedgerStatus will UPDATE
//
// Returns a cleanup function the test should defer.
func setupIntegrationFixtures(t *testing.T, conn *sql.DB) func() {
t.Helper()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
for _, ws := range []struct {
id string
name string
parentID *string
}{
{testSourceID, "test-source", nil},
{testTargetID, "test-target", nil},
} {
if _, err := conn.ExecContext(ctx,
`INSERT INTO workspaces (id, name, parent_id) VALUES ($1::uuid, $2, $3) ON CONFLICT (id) DO NOTHING`,
ws.id, ws.name, ws.parentID,
); err != nil {
cancel()
t.Fatalf("seed workspace %s: %v", ws.id, err)
}
}
reqBody, _ := json.Marshal(map[string]any{
"delegation_id": testDelegationID,
"task": "do work",
})
if _, err := conn.ExecContext(ctx, `
INSERT INTO activity_logs
(workspace_id, activity_type, method, source_id, target_id, request_body, status)
VALUES ($1, 'delegate', 'delegate', $1, $2, $3::jsonb, 'pending')
ON CONFLICT DO NOTHING
`, testSourceID, testTargetID, string(reqBody)); err != nil {
cancel()
t.Fatalf("seed activity_logs: %v", err)
}
if _, err := conn.ExecContext(ctx, `
INSERT INTO delegations
(delegation_id, caller_id, callee_id, task_preview, status)
VALUES ($1, $2::uuid, $3::uuid, 'do work', 'queued')
ON CONFLICT (delegation_id) DO NOTHING
`, testDelegationID, testSourceID, testTargetID); err != nil {
cancel()
t.Fatalf("seed delegations: %v", err)
}
cancel()
return func() {
ctx2, cancel2 := context.WithTimeout(context.Background(), 5*time.Second)
defer cancel2()
conn.ExecContext(ctx2,
`DELETE FROM activity_logs WHERE workspace_id = $1 AND request_body->>'delegation_id' = $2`,
testSourceID, testDelegationID)
conn.ExecContext(ctx2,
`DELETE FROM delegations WHERE delegation_id = $1`, testDelegationID)
conn.ExecContext(ctx2,
`DELETE FROM workspaces WHERE id IN ($1, $2)`, testSourceID, testTargetID)
}
}
// readDelegationRow returns (status, result_preview, error_detail) for the test
// delegation, or fails the test if the row is not found.
func readDelegationRow(t *testing.T, conn *sql.DB) (status, preview, errorDetail string) {
t.Helper()
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
var prev, errDet sql.NullString
err := conn.QueryRowContext(ctx,
`SELECT status, result_preview, error_detail FROM delegations WHERE delegation_id = $1`,
testDelegationID,
).Scan(&status, &prev, &errDet)
if err != nil {
t.Fatalf("readDelegationRow: %v", err)
}
return status, prev.String, errDet.String
}
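// stack returns a stack dump of all goroutines. Used by runWithTimeout to
// pinpoint the blocking call site when a test times out. The blocked call runs
// in a different goroutine than the caller, so a current-goroutine trace
// (runtime.Stack with all=false) would not show it.
func stack() string {
buf := make([]byte, 64<<10)
n := runtime.Stack(buf, true)
return string(buf[:n])
}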
// runWithTimeout calls fn in a goroutine and fails t if it doesn't return within
// timeout. ctx is passed to fn so it can propagate cancellation to
// executeDelegation's DB and network operations — without this, the goroutine
// leaks indefinitely when the test times out (context.Background() never cancels).
func runWithTimeout(t *testing.T, timeout time.Duration, fn func(context.Context)) {
t.Helper()
ctx, cancel := context.WithTimeout(context.Background(), timeout)
defer cancel()
done := make(chan struct{})
var panicErr interface{}
go func() {
defer func() {
if p := recover(); p != nil {
panicErr = p
}
close(done)
}()
fn(ctx)
}()
select {
case <-done:
if panicErr != nil {
t.Fatalf("executeDelegation panicked: %v\n%s", panicErr, stack())
}
case <-ctx.Done():
cancel()
t.Fatalf("executeDelegation timed out after %s\n%s", timeout, stack())
}
}
// TestIntegration_ExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess
// is the integration regression gate for issue #159.
//
// Scenario: proxyA2ARequest returns a 200 status code with a non-empty body.
// isDeliveryConfirmedSuccess guard (status>=200 && <300 && len(body)>0 && err!=nil)
// routes to handleSuccess. The integration test verifies the DB row lands at
// 'completed' with the response body as result_preview.
func TestIntegration_ExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess(t *testing.T) {
allowLoopbackForTest(t)
conn := integrationDB(t)
cleanup := setupIntegrationFixtures(t, conn)
defer cleanup()
t.Setenv("DELEGATION_LEDGER_WRITE", "1")
agentURL, closeServer := rawHTTPServer(t, 200, `{"result":{"parts":[{"text":"work completed successfully"}]}}`)
defer closeServer()
mr := setupTestRedis(t)
defer mr.Close()
db.CacheURL(context.Background(), testTargetID, agentURL)
prevClient := a2aClient
defer func() { a2aClient = prevClient }()
a2aClient = newA2AClientForHost(extractHostPort(agentURL))
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0",
"id": "1",
"method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
start := time.Now()
runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
})
t.Logf("executeDelegation took %v", time.Since(start))
status, preview, errDet := readDelegationRow(t, conn)
if status != "completed" {
t.Errorf("status: want completed, got %q", status)
}
if preview == "" {
t.Errorf("result_preview should be non-empty, got %q", preview)
}
if errDet != "" {
t.Errorf("error_detail should be empty on success: got %q", errDet)
}
}
// TestIntegration_ExecuteDelegation_ProxyErrorNon2xx_RemainsFailed verifies that
// a 500 response routes to failure, not success. isDeliveryConfirmedSuccess
// requires status>=200 && <300, so 500 always fails the guard.
func TestIntegration_ExecuteDelegation_ProxyErrorNon2xx_RemainsFailed(t *testing.T) {
allowLoopbackForTest(t)
conn := integrationDB(t)
cleanup := setupIntegrationFixtures(t, conn)
defer cleanup()
t.Setenv("DELEGATION_LEDGER_WRITE", "1")
agentURL, closeServer := rawHTTPServer(t, 500, `{"error":"agent crashed"}`)
defer closeServer()
mr := setupTestRedis(t)
defer mr.Close()
db.CacheURL(context.Background(), testTargetID, agentURL)
prevClient := a2aClient
defer func() { a2aClient = prevClient }()
a2aClient = newA2AClientForHost(extractHostPort(agentURL))
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0", "id": "1", "method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
start := time.Now()
runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
})
t.Logf("executeDelegation took %v", time.Since(start))
status, _, errDet := readDelegationRow(t, conn)
if status != "failed" {
t.Errorf("status: want failed, got %q", status)
}
if errDet == "" {
t.Error("error_detail should be non-empty on failure")
}
}
// TestIntegration_ExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed verifies that
// a 200 response with an empty body routes to failure. isDeliveryConfirmedSuccess
// requires len(body) > 0, so an empty body fails the guard.
func TestIntegration_ExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed(t *testing.T) {
allowLoopbackForTest(t)
conn := integrationDB(t)
cleanup := setupIntegrationFixtures(t, conn)
defer cleanup()
t.Setenv("DELEGATION_LEDGER_WRITE", "1")
agentURL, closeServer := rawHTTPServer(t, 200, "")
defer closeServer()
mr := setupTestRedis(t)
defer mr.Close()
db.CacheURL(context.Background(), testTargetID, agentURL)
prevClient := a2aClient
defer func() { a2aClient = prevClient }()
a2aClient = newA2AClientForHost(extractHostPort(agentURL))
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0", "id": "1", "method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
start := time.Now()
runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
})
t.Logf("executeDelegation took %v", time.Since(start))
status, _, errDet := readDelegationRow(t, conn)
if status != "failed" {
t.Errorf("status: want failed, got %q", status)
}
if errDet == "" {
t.Error("error_detail should be non-empty on failure")
}
}
// TestIntegration_ExecuteDelegation_CleanProxyResponse_Unchanged is the baseline:
// a clean 200 response with a valid body and no error routes to success.
func TestIntegration_ExecuteDelegation_CleanProxyResponse_Unchanged(t *testing.T) {
allowLoopbackForTest(t)
conn := integrationDB(t)
cleanup := setupIntegrationFixtures(t, conn)
defer cleanup()
t.Setenv("DELEGATION_LEDGER_WRITE", "1")
agentURL, closeServer := rawHTTPServer(t, 200, `{"result":{"parts":[{"text":"all good"}]}}`)
defer closeServer()
mr := setupTestRedis(t)
defer mr.Close()
db.CacheURL(context.Background(), testTargetID, agentURL)
prevClient := a2aClient
defer func() { a2aClient = prevClient }()
a2aClient = newA2AClientForHost(extractHostPort(agentURL))
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0", "id": "1", "method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
start := time.Now()
runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
})
t.Logf("executeDelegation took %v", time.Since(start))
status, preview, errDet := readDelegationRow(t, conn)
if status != "completed" {
t.Errorf("status: want completed, got %q", status)
}
if preview == "" {
t.Errorf("result_preview should be non-empty, got %q", preview)
}
if errDet != "" {
t.Errorf("error_detail should be empty on success: got %q", errDet)
}
}
// Test that a delegation with no resolvable agent URL routes to failure (not a
// panic). Redis is reachable here but holds no cached URL for the target, so
// resolveAgentURL falls back to the DB lookup, which also has no URL.
func TestIntegration_ExecuteDelegation_RedisDown_FallsBackToDB(t *testing.T) {
allowLoopbackForTest(t)
conn := integrationDB(t)
cleanup := setupIntegrationFixtures(t, conn)
defer cleanup()
t.Setenv("DELEGATION_LEDGER_WRITE", "1")
// Set up miniredis so db.RDB is non-nil, but do NOT cache any URL.
// resolveAgentURL skips Redis and falls back to DB, which also has no URL.
mr := setupTestRedis(t)
defer mr.Close()
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0", "id": "1", "method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
start := time.Now()
runWithTimeout(t, 30*time.Second, func(ctx context.Context) {
dh.executeDelegation(ctx, testSourceID, testTargetID, testDelegationID, a2aBody)
})
t.Logf("executeDelegation took %v", time.Since(start))
status, _, errDet := readDelegationRow(t, conn)
if status != "failed" {
t.Errorf("status: want failed (no target URL), got %q", status)
}
if errDet == "" {
t.Error("error_detail should be set on failure due to unreachable target")
}
}
// extractHostPort parses "http://127.0.0.1:PORT/" and returns "127.0.0.1:PORT".
func extractHostPort(rawURL string) string {
// Simple parse: strip "http://" prefix and trailing slash.
// The URL format is always "http://127.0.0.1:PORT/" in our usage.
if len(rawURL) > 7 {
return rawURL[7 : len(rawURL)-1]
}
return rawURL
}
// newA2AClientForHost creates an http.Client that redirects all connections
// to the given host:port, regardless of the address in the request URL. This
// lets the tests point the A2A client at the rawHTTPServer listener.
func newA2AClientForHost(targetHost string) *http.Client {
return &http.Client{
Transport: &http.Transport{
DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
// Dial with ctx so a cancelled or timed-out test also aborts the connect.
return (&net.Dialer{}).DialContext(ctx, "tcp", targetHost)
},
ResponseHeaderTimeout: 180 * time.Second,
},
}
}
@@ -154,28 +154,10 @@ func (l *DelegationLedger) SetStatus(ctx context.Context,
return err
}
// Same-status replay (e.g. duplicate completion notification): usually a
// no-op. If the replay carries terminal detail that the first write lacked,
// fill the missing nullable column once. This keeps duplicate notifications
// idempotent while preserving the first observed result/error when a legacy
// path wrote the terminal status before it had the detail payload.
// Same-status replay (e.g. duplicate completion notification): no-op,
// don't bump updated_at, no error.
if current == status {
if errorDetail == "" && resultPreview == "" {
return nil
}
_, err = l.db.ExecContext(ctx, `
UPDATE delegations
SET error_detail = COALESCE(error_detail, NULLIF($2, '')),
result_preview = COALESCE(result_preview, NULLIF($3, '')),
updated_at = CASE
WHEN (error_detail IS NULL AND NULLIF($2, '') IS NOT NULL)
OR (result_preview IS NULL AND NULLIF($3, '') IS NOT NULL)
THEN now()
ELSE updated_at
END
WHERE delegation_id = $1
`, delegationID, errorDetail, textutil.TruncateBytesNoMarker(resultPreview, previewCap))
return err
return nil
}
// Forward-only on terminal states.
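One of the two SetStatus variants in this hunk fills a missing nullable column once, using COALESCE(col, NULLIF($n, '')). The sketch below models that expression in plain Go, with a nil pointer standing in for SQL NULL. `fillOnce` is a hypothetical helper for illustration only and does not exist in the ledger code.

```go
package main

import "fmt"

// fillOnce models COALESCE(existing, NULLIF(incoming, '')) for a nullable
// text column. A nil pointer represents SQL NULL.
func fillOnce(existing *string, incoming string) *string {
	if existing != nil {
		return existing // first observed value wins
	}
	if incoming == "" {
		return nil // NULLIF('', '') is NULL, so the column stays NULL
	}
	return &incoming
}

func main() {
	var preview *string // column starts as NULL

	preview = fillOnce(preview, "") // completed, no preview
	fmt.Println(preview == nil)

	preview = fillOnce(preview, "the answer") // same-status replay with preview
	fmt.Println(*preview)

	preview = fillOnce(preview, "a later replay") // further replays change nothing
	fmt.Println(*preview)
}
```

Under the other variant (plain `return nil` on same status), the second call would leave the column NULL, which is the behavior TestIntegration_ResultPreviewBuggyOrderIsLost asserts.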
@@ -39,7 +39,6 @@ import (
"os"
"strings"
"testing"
"time"
mdb "github.com/Molecule-AI/molecule-monorepo/platform/internal/db"
_ "github.com/lib/pq"
@@ -65,16 +64,12 @@ func integrationDB(t *testing.T) *sql.DB {
if err != nil {
t.Fatalf("open: %v", err)
}
ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
if err := conn.PingContext(ctx); err != nil {
if err := conn.Ping(); err != nil {
t.Fatalf("ping: %v", err)
}
// Each test gets a fresh table state — fail loud if cleanup fails so
// a bad test doesn't pollute the next one.
ctx2, cancel2 := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel2()
if _, err := conn.ExecContext(ctx2, `DELETE FROM delegations`); err != nil {
if _, err := conn.ExecContext(context.Background(), `DELETE FROM delegations`); err != nil {
t.Fatalf("cleanup: %v", err)
}
// Wire the package-level db.DB so production helpers (recordLedgerInsert,
@@ -150,11 +145,16 @@ func TestIntegration_ResultPreviewPreservedThroughCompletion(t *testing.T) {
}
}
// Same-status terminal replays remain idempotent, but if the first terminal
// write lacked result_preview, a later same-status replay carrying the preview
// should fill that missing field once. This protects legacy call ordering and
// mirrors the failure-path error_detail repair.
func TestIntegration_ResultPreviewSameStatusReplayFillsMissingPreview(t *testing.T) {
// TestIntegration_ResultPreviewBuggyOrderIsLost — DIAGNOSTIC test that
// confirms the ORIGINAL buggy order does lose the preview. Useful when
// auditing similar wiring elsewhere.
//
// This is documented behavior: it asserts the same-status replay no-op
// works as designed in DelegationLedger.SetStatus. The fix in
// delegation.go is to AVOID this order, not to change SetStatus's
// same-status semantics (which the operator dashboard relies on for
// idempotent completion notifications).
func TestIntegration_ResultPreviewBuggyOrderIsLost(t *testing.T) {
conn := integrationDB(t)
t.Setenv("DELEGATION_LEDGER_WRITE", "1")
@@ -162,17 +162,16 @@ func TestIntegration_ResultPreviewSameStatusReplayFillsMissingPreview(t *testing
caller := "11111111-1111-1111-1111-111111111111"
callee := "22222222-2222-2222-2222-222222222222"
// Legacy sequence: queued → dispatched → completed (no preview)
// completed (preview). The second completed replay should repair the
// missing preview without changing status.
// BUGGY sequence in production-shape order: queued → dispatched →
// completed (no preview) → completed (preview ignored as same-status).
recordLedgerInsert(context.Background(), caller, callee, id, "the question", "")
recordLedgerStatus(context.Background(), id, "dispatched", "", "")
recordLedgerStatus(context.Background(), id, "completed", "", "")
recordLedgerStatus(context.Background(), id, "completed", "", "the answer")
recordLedgerStatus(context.Background(), id, "dispatched", "", "") // pre-completion stage
recordLedgerStatus(context.Background(), id, "completed", "", "") // inner first
recordLedgerStatus(context.Background(), id, "completed", "", "the answer") // outer same-status no-op
_, preview, _ := readLedgerRow(t, conn, id)
if preview != "the answer" {
t.Errorf("same-status replay should fill missing preview; got %q", preview)
if preview != "" {
t.Errorf("buggy-order preview was unexpectedly non-empty: %q (SetStatus same-status no-op contract may have changed)", preview)
}
}
@@ -226,25 +226,6 @@ func TestLedgerSetStatus_SameStatusReplay_NoUpdate(t *testing.T) {
}
}
func TestLedgerSetStatus_SameStatusReplay_FillsMissingDetail(t *testing.T) {
mock := setupTestDB(t)
l := NewDelegationLedger(nil)
mock.ExpectQuery(`SELECT status FROM delegations WHERE delegation_id = \$1`).
WithArgs("d-1").
WillReturnRows(sqlmock.NewRows([]string{"status"}).AddRow("failed"))
mock.ExpectExec(`UPDATE delegations\s+SET error_detail = COALESCE\(error_detail, NULLIF\(\$2, ''\)\),\s+result_preview = COALESCE\(result_preview, NULLIF\(\$3, ''\)\),\s+updated_at = CASE`).
WithArgs("d-1", "agent returned empty response", "").
WillReturnResult(sqlmock.NewResult(0, 1))
if err := l.SetStatus(context.Background(), "d-1", "failed", "agent returned empty response", ""); err != nil {
t.Errorf("same-status detail fill should succeed, got err: %v", err)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet: %v", err)
}
}
func TestLedgerSetStatus_MissingRowIsNoOp(t *testing.T) {
// A SetStatus call that arrives before Insert (lost INSERT, race, etc.)
// must NOT error — it's a transient inconsistency the next agent retry
@@ -5,8 +5,10 @@ import (
"context"
"encoding/json"
"fmt"
"net"
"net/http"
"net/http/httptest"
"sync"
"testing"
"time"
@@ -956,3 +958,316 @@ func TestInsertDelegationOutcome_ZeroValueIsUnknown(t *testing.T) {
t.Errorf("insertOutcomeUnknown must not collide with insertOK")
}
}
// ==================== executeDelegation — delivery-confirmed proxy error regression tests ====================
//
// These test the fix for issue #159: when proxyA2ARequest returns an error but we have a
// non-empty response body with a 2xx status code, executeDelegation must treat it as success.
// The error is a delivery/transport error (e.g., connection reset after response was received).
// Previously, executeDelegation marked these as "failed" even though the work was done,
// causing retry storms and "error" rendering in canvas despite the response being available.
//
// Test strategy: spin up a mock A2A agent server, set up the source/target DB rows, call
// executeDelegation directly, and verify the activity_logs status and delegation status.
const testDelegationID = "del-159-test"
const testSourceID = "ws-source-159"
const testTargetID = "ws-target-159"
// expectExecuteDelegationBase sets up sqlmock expectations for the DB queries that
// executeDelegation always makes, regardless of outcome.
func expectExecuteDelegationBase(mock sqlmock.Sqlmock) {
// updateDelegationStatus: dispatched
// Uses a prefix pattern: sqlmock's default regexp matcher is unanchored, so the
// expectation matches any query containing this text.
mock.ExpectExec("UPDATE activity_logs SET status").
WithArgs("dispatched", "", testSourceID, testDelegationID).
WillReturnResult(sqlmock.NewResult(0, 1))
// CanCommunicate: source != target → fires two getWorkspaceRef lookups.
// Both test fixtures have parent_id = NULL (root-level siblings) → allowed.
// Order matches call order: source first, then target.
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id").
WithArgs(testSourceID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testSourceID, nil))
mock.ExpectQuery("SELECT id, parent_id FROM workspaces WHERE id").
WithArgs(testTargetID).
WillReturnRows(sqlmock.NewRows([]string{"id", "parent_id"}).AddRow(testTargetID, nil))
// resolveAgentURL: reads ws:{id}:url from Redis, falls back to DB for target
mock.ExpectQuery("SELECT url, status FROM workspaces WHERE id = ").
WithArgs(testTargetID).
WillReturnRows(sqlmock.NewRows([]string{"url", "status"}).AddRow("", "online"))
}
// expectExecuteDelegationSuccess sets up expectations for a completed delegation.
func expectExecuteDelegationSuccess(mock sqlmock.Sqlmock, respBody string) {
// INSERT activity_logs for delegation completion (response_body status = 'completed')
mock.ExpectExec("INSERT INTO activity_logs").
WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), "completed").
WillReturnResult(sqlmock.NewResult(0, 1))
// updateDelegationStatus: completed
mock.ExpectExec("UPDATE activity_logs SET status").
WithArgs("completed", "", testSourceID, testDelegationID).
WillReturnResult(sqlmock.NewResult(0, 1))
}
// expectExecuteDelegationFailed sets up expectations for a failed delegation.
func expectExecuteDelegationFailed(mock sqlmock.Sqlmock) {
// INSERT activity_logs for delegation failure (response_body status = 'failed')
mock.ExpectExec("INSERT INTO activity_logs").
WithArgs(sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), sqlmock.AnyArg(), "failed").
WillReturnResult(sqlmock.NewResult(0, 1))
// updateDelegationStatus: failed
mock.ExpectExec("UPDATE activity_logs SET status").
WithArgs("failed", sqlmock.AnyArg(), testSourceID, testDelegationID).
WillReturnResult(sqlmock.NewResult(0, 1))
}
// TestExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess is the primary regression
// test for issue #159. The scenario:
// - The server sends 200 OK headers plus a body shorter than the declared Content-Length,
// then closes the connection.
// - proxyA2ARequest hits an error on the body read and returns (200, <partial>, BadGateway).
// - Pre-fix: executeDelegation received status=0 (hardcoded) even when the server sent 200,
// so the delivery-confirmed condition always failed. isTransientProxyError(BadGateway) = TRUE,
// so the call was retried and the delegation ended up "failed".
// - Post-fix: proxyA2ARequest returns resp.StatusCode and respBody on the error path, so
// executeDelegation sees status=200, body=<partial>, err!=nil and routes to handleSuccess
// immediately.
//
// The critical assertion is that a 2xx partial-body delivery-confirmed response is never
// classified as "failed"; it always routes to success.
func TestExecuteDelegation_DeliveryConfirmedProxyError_TreatsAsSuccess(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
allowLoopbackForTest(t)
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
// Server that sends a 200 response with declared Content-Length but closes
// the connection before sending all bytes. Go's http.Client reports
// io.ErrUnexpectedEOF on the body read. proxyA2ARequest captures the partial
// body + status=200 and returns (200, <partial>, error). executeDelegation's
// new condition sees status=200 + len(body) > 0 + error != nil → routes to handleSuccess.
var wg sync.WaitGroup
wg.Add(1)
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatalf("failed to listen: %v", err)
}
defer ln.Close()
go func() {
defer wg.Done()
conn, err := ln.Accept()
if err != nil {
return
}
defer conn.Close()
// Consume the HTTP request
buf := make([]byte, 2048)
conn.Read(buf)
// Send 200 OK with Content-Length: 100 but only 61 bytes of body
// (fewer than declared, so the client's body read ends with io.ErrUnexpectedEOF).
resp := "HTTP/1.1 200 OK\r\nContent-Type: application/json\r\nContent-Length: 100\r\n\r\n"
resp += `{"result":{"parts":[{"text":"work completed successfully"}]}}` // 61 bytes
conn.Write([]byte(resp))
// Close immediately; the client's body read fails with io.ErrUnexpectedEOF.
}()
agentURL := "http://" + ln.Addr().String()
mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentURL)
allowLoopbackForTest(t)
expectExecuteDelegationBase(mock)
expectExecuteDelegationSuccess(mock, `{"result":{"parts":[{"text":"work completed successfully"}]}}`)
// Execute synchronously (not as a goroutine) so we can check DB state immediately.
// The handler fires it as goroutine; we call it directly for deterministic testing.
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0",
"id": "1",
"method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
dh.executeDelegation(testSourceID, testTargetID, testDelegationID, a2aBody)
time.Sleep(100 * time.Millisecond) // let DB writes settle
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestExecuteDelegation_ProxyErrorNon2xx_RemainsFailed verifies that the pre-fix failure
// path is unchanged when proxyA2ARequest returns a delivery-confirmed error with a non-2xx
// status code (e.g., 500 Internal Server Error with partial body read before connection drop).
// The new condition requires status >= 200 && status < 300, so non-2xx always routes to failure.
func TestExecuteDelegation_ProxyErrorNon2xx_RemainsFailed(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
allowLoopbackForTest(t)
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
// Server returns 500 with declared Content-Length but closes connection early.
// proxyA2ARequest: reads 500 headers, partial body, then connection drop → body read error.
// Returns (500, <partial_body>, BadGateway).
// New condition: status=500 is NOT >= 200 && < 300 → routes to failure.
// isTransientProxyError(500) = false → no retry.
var wg sync.WaitGroup
wg.Add(1)
ln, err := net.Listen("tcp", "127.0.0.1:0")
if err != nil {
t.Fatalf("failed to listen: %v", err)
}
defer ln.Close()
go func() {
defer wg.Done()
conn, err := ln.Accept()
if err != nil {
return
}
defer conn.Close()
buf := make([]byte, 2048)
conn.Read(buf)
// 500 with Content-Length: 100 but only 25 bytes of body
resp := "HTTP/1.1 500 Internal Server Error\r\nContent-Type: application/json\r\nContent-Length: 100\r\n\r\n"
resp += `{"error":"agent crashed"}` // 25 bytes, fewer than declared
conn.Write([]byte(resp))
// Close immediately; the client's body read fails with io.ErrUnexpectedEOF.
}()
agentURL := "http://" + ln.Addr().String()
mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentURL)
allowLoopbackForTest(t)
expectExecuteDelegationBase(mock)
expectExecuteDelegationFailed(mock)
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0", "id": "1", "method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
dh.executeDelegation(testSourceID, testTargetID, testDelegationID, a2aBody)
time.Sleep(100 * time.Millisecond)
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed verifies that the pre-fix failure
// path is unchanged when proxyA2ARequest returns an error with an empty body (here the
// agent answers 502 with no body). The new condition requires len(respBody) > 0, so an
// empty body routes to failure.
func TestExecuteDelegation_ProxyErrorEmptyBody_RemainsFailed(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
allowLoopbackForTest(t)
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
// Server returns 502 Bad Gateway — proxyA2ARequest returns 502, body="" (empty), error != nil.
// New condition: proxyErr != nil && len(respBody) > 0 && status >= 200 && status < 300
// → len(respBody) == 0 → condition FALSE → falls through to failure.
// isTransientProxyError(502) is TRUE → retry → same result → failure.
agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusBadGateway)
// No body — connection closes normally
}))
defer agentServer.Close()
mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentServer.URL)
allowLoopbackForTest(t)
// First attempt: updateDelegationStatus(dispatched) — from expectExecuteDelegationBase
expectExecuteDelegationBase(mock)
// Second attempt (retry): updateDelegationStatus(dispatched) again
mock.ExpectExec("UPDATE activity_logs SET status").
WithArgs("dispatched", "", testSourceID, testDelegationID).
WillReturnResult(sqlmock.NewResult(0, 1))
// Failure: INSERT + UPDATE (failed)
expectExecuteDelegationFailed(mock)
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0", "id": "1", "method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
dh.executeDelegation(testSourceID, testTargetID, testDelegationID, a2aBody)
time.Sleep(100 * time.Millisecond)
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
// TestExecuteDelegation_CleanProxyResponse_Unchanged verifies that a clean proxy response
// (no error, 200 with body) is unaffected by the new condition. This is the baseline:
// proxyErr == nil so the new condition never fires.
func TestExecuteDelegation_CleanProxyResponse_Unchanged(t *testing.T) {
mock := setupTestDB(t)
mr := setupTestRedis(t)
allowLoopbackForTest(t)
broadcaster := newTestBroadcaster()
wh := NewWorkspaceHandler(broadcaster, nil, "http://localhost:8080", t.TempDir())
dh := NewDelegationHandler(wh, broadcaster)
agentServer := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json") // headers must be set before WriteHeader
w.WriteHeader(http.StatusOK)
w.Write([]byte(`{"result":{"parts":[{"text":"all good"}]}}`))
}))
defer agentServer.Close()
mr.Set(fmt.Sprintf("ws:%s:url", testTargetID), agentServer.URL)
allowLoopbackForTest(t)
expectExecuteDelegationBase(mock)
expectExecuteDelegationSuccess(mock, `{"result":{"parts":[{"text":"all good"}]}}`)
a2aBody, _ := json.Marshal(map[string]interface{}{
"jsonrpc": "2.0", "id": "1", "method": "message/send",
"params": map[string]interface{}{
"message": map[string]interface{}{
"role": "user",
"parts": []map[string]string{{"type": "text", "text": "do work"}},
},
},
})
dh.executeDelegation(testSourceID, testTargetID, testDelegationID, a2aBody)
time.Sleep(100 * time.Millisecond)
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet sqlmock expectations: %v", err)
}
}
@@ -292,8 +292,8 @@ func filterPeersByQuery(peers []map[string]interface{}, q string) []map[string]i
needle := strings.ToLower(q)
out := make([]map[string]interface{}, 0, len(peers))
for _, p := range peers {
name, _ := p["name"].(string) // nil → "" — safe on empty-role rows
role, _ := p["role"].(string) // nil → "" — queryPeerMaps sets nil when DB role is empty
name := p["name"].(string)
role := p["role"].(string)
if strings.Contains(strings.ToLower(name), needle) ||
strings.Contains(strings.ToLower(role), needle) {
out = append(out, p)
@@ -394,80 +394,6 @@ func TestPeers_Q_NoMatches_RawBodyIsArrayNotNull(t *testing.T) {
}
}
// TestFilterPeersByQuery_NilRoleRegression is the regression gate for
// mc#730/#731: queryPeerMaps sets peer["role"] = nil when the DB role column
// is empty (discovery.go lines 337-341). filterPeersByQuery did a bare
// type assertion p["role"].(string) which panics on nil. The fix uses the
// comma-ok form so nil → "". The test passes a map with nil name and nil
// role and asserts no panic + correct filter behaviour.
func TestFilterPeersByQuery_NilRoleRegression(t *testing.T) {
cases := []struct {
name string
peers []map[string]interface{}
q string
wantLen int
wantIDs []string
}{
{
name: "nil role matches on name",
peers: []map[string]interface{}{
{"id": "ws-a", "name": nil, "role": nil},
{"id": "ws-b", "name": "Alpha Builder", "role": nil},
{"id": "ws-c", "name": "Beta Builder", "role": nil},
},
q: "alpha",
wantLen: 1,
wantIDs: []string{"ws-b"},
},
{
name: "empty query is a no-op even with nil name and role",
peers: []map[string]interface{}{
{"id": "ws-x", "name": nil, "role": nil},
{"id": "ws-y", "name": "Dev Workspace", "role": nil},
},
q: "",
wantLen: 2, // empty q is a no-op
wantIDs: []string{"ws-x", "ws-y"},
},
{
name: "all nil: no panic, no matches",
peers: []map[string]interface{}{
{"id": "ws-z", "name": nil, "role": nil},
},
q: "anything",
wantLen: 0,
wantIDs: nil,
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := filterPeersByQuery(tc.peers, tc.q)
if len(got) != tc.wantLen {
t.Fatalf("len: got %d, want %d", len(got), tc.wantLen)
}
gotIDs := make([]string, len(got))
for i, p := range got {
gotIDs[i] = p["id"].(string)
}
if tc.wantIDs != nil {
for _, id := range tc.wantIDs {
found := false
for _, g := range gotIDs {
if g == id {
found = true
break
}
}
if !found {
t.Errorf("missing id %q; got IDs: %v", id, gotIDs)
}
}
}
})
}
}
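The regression comment above contrasts a bare type assertion with the comma-ok form. A minimal, self-contained sketch of the difference follows; `stringField` and `matches` are hypothetical helpers, not functions from the repository.

```go
package main

import (
	"fmt"
	"strings"
)

// stringField returns m[key] as a string, or "" when the value is nil,
// missing, or not a string. The comma-ok form never panics.
// Hypothetical helper for illustration.
func stringField(m map[string]interface{}, key string) string {
	s, _ := m[key].(string)
	return s
}

// matches reports whether the lowercased query occurs in the peer's name
// or role, treating nil fields as empty strings.
func matches(p map[string]interface{}, q string) bool {
	needle := strings.ToLower(q)
	return strings.Contains(strings.ToLower(stringField(p, "name")), needle) ||
		strings.Contains(strings.ToLower(stringField(p, "role")), needle)
}

func main() {
	peer := map[string]interface{}{"id": "ws-b", "name": "Alpha Builder", "role": nil}
	fmt.Println(matches(peer, "alpha"))
	fmt.Println(matches(peer, "beta"))
	// By contrast, a bare assertion such as peer["role"].(string) panics
	// at run time because the interface value is nil.
}
```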
func keysOf(m map[string]struct{}) []string {
out := make([]string, 0, len(m))
for k := range m {
+1 -2
@@ -434,8 +434,7 @@ func (h *MCPHandler) dispatchRPC(ctx context.Context, workspaceID string, req mc
}
default:
// Per OFFSEC-001: error message must not include user-controlled req.Method.
base.Error = &mcpRPCError{Code: -32601, Message: "method not found"}
base.Error = &mcpRPCError{Code: -32601, Message: "method not found: " + req.Method}
}
return base
+11 -117
@@ -9,7 +9,6 @@ import (
"net/http"
"net/http/httptest"
"os"
"strings"
"testing"
"errors"
@@ -205,9 +204,6 @@ func TestMCPHandler_NotificationsInitialized_Returns200(t *testing.T) {
// Unknown method
// ─────────────────────────────────────────────────────────────────────────────
// TestMCPHandler_UnknownMethod_Returns32601 verifies dispatchRPC returns
// -32601 for an unknown method. Per OFFSEC-001: the error message must be
// constant — req.Method is user-controlled and must NOT appear in the response.
func TestMCPHandler_UnknownMethod_Returns32601(t *testing.T) {
h, _ := newMCPHandler(t)
@@ -228,14 +224,6 @@ func TestMCPHandler_UnknownMethod_Returns32601(t *testing.T) {
if resp.Error.Code != -32601 {
t.Errorf("expected code -32601, got %d", resp.Error.Code)
}
// Message must be constant — no user-controlled method name leak.
if resp.Error.Message != "method not found" {
t.Errorf("error message should be constant 'method not found', got: %q", resp.Error.Message)
}
// Double-check the method name never appears in the message (defence-in-depth).
if strings.Contains(resp.Error.Message, "not/a/real/method") {
t.Error("error message must not echo the user-controlled method name")
}
}
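The OFFSEC-001 comments describe a constant client-visible message with the user-controlled method name kept out of the response. One way to sketch that idea is shown below; `rpcError`, `methodNotFoundMsg`, and `unknownMethodError` are hypothetical names, and logging the method server-side is an assumption about where the detail would go.

```go
package main

import (
	"fmt"
	"log"
)

// rpcError is a hypothetical stand-in for the handler's JSON-RPC error type.
type rpcError struct {
	Code    int
	Message string
}

// methodNotFoundMsg is the constant returned to clients.
const methodNotFoundMsg = "method not found"

// unknownMethodError logs the requested method server-side (an assumption)
// and returns an error whose message never contains caller-supplied text.
func unknownMethodError(method string) *rpcError {
	log.Printf("mcp: unknown method %q", method)
	return &rpcError{Code: -32601, Message: methodNotFoundMsg}
}

func main() {
	e := unknownMethodError("not/a/real/method")
	fmt.Println(e.Code, e.Message)
}
```

An exact-equality assertion on the message, as in the test above, catches a partial re-leak that a substring check would miss.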
// ─────────────────────────────────────────────────────────────────────────────
@@ -417,32 +405,11 @@ func TestMCPHandler_CommitMemory_LocalScope_Success(t *testing.T) {
}
}
// TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError verifies
// two contracts at once on the GLOBAL-scope-blocked path:
//
// 1. C3 invariant (commit_memory with scope=GLOBAL aborts on the MCP bridge
// before touching the DB), AND
// 2. OFFSEC-001 / #259 scrub contract (commit 7d1a189f): the JSON-RPC error
// returned to the client is a CONSTANT — code=-32000, message="tool call
// failed" — with the production-internal err.Error() text logged
// server-side, never reflected back to the caller.
//
// Prior to this rename the test asserted that the client-visible message
// CONTAINED the substring "GLOBAL", which was the human-readable internal
// error from toolCommitMemory. mc#664 Class 2 flipped that assertion the
// right way around: now the test FAILS if the scrub regresses (i.e. if the
// internal string is ever reflected back to the wire), and PASSES iff the
// scrubbed constant reaches the client.
//
// Coupling note: the constant string "tool call failed" and the code -32000
// are the same values asserted by
// TestMCPHandler_dispatchRPC_UnknownTool_ReturnsConstantMessage — both are
// the OFFSEC-001 contract for the dispatch-failure branch in mcp.go (the
// third err.Error() leak that 7d1a189f scrubbed). If those constants ever
// change, both tests must move together.
func TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError(t *testing.T) {
// TestMCPHandler_CommitMemory_GlobalScope_Blocked verifies that C3 is enforced:
// GLOBAL scope is not permitted on the MCP bridge.
func TestMCPHandler_CommitMemory_GlobalScope_Blocked(t *testing.T) {
h, mock := newMCPHandler(t)
// No DB expectations — handler must abort before touching the DB (C3).
// No DB expectations — handler must abort before touching the DB.
w := mcpPost(t, h, "ws-1", map[string]interface{}{
"jsonrpc": "2.0",
@@ -457,53 +424,14 @@ func TestMCPHandler_CommitMemory_GlobalScope_Blocked_ScrubsInternalError(t *test
},
})
// JSON-RPC envelope returns 200 with the error in the body — only
// malformed-JSON-at-the-envelope-layer returns 400 (see Call() in mcp.go).
if w.Code != http.StatusOK {
t.Fatalf("expected 200 (JSON-RPC error in body), got %d: %s", w.Code, w.Body.String())
}
var resp mcpResponse
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("response is not valid JSON: %v", err)
}
// (1) C3: an error must be reported.
json.Unmarshal(w.Body.Bytes(), &resp)
if resp.Error == nil {
t.Fatal("expected JSON-RPC error for GLOBAL scope, got nil")
t.Error("expected JSON-RPC error for GLOBAL scope, got nil")
}
// (2) OFFSEC-001 positive assertions — exact equality on the scrubbed
// constants so any change (re-leak of err.Error(), code mutation) trips
// the test. Substring-match would not catch a partial re-leak.
if resp.Error.Code != -32000 {
t.Errorf("error code should be -32000 (Server error / dispatch-failure), got: %d", resp.Error.Code)
if resp.Error != nil && !bytes.Contains([]byte(resp.Error.Message), []byte("GLOBAL")) {
t.Errorf("error message should mention GLOBAL, got: %s", resp.Error.Message)
}
if resp.Error.Message != "tool call failed" {
t.Errorf("error message should be the OFFSEC-001 constant %q, got: %q", "tool call failed", resp.Error.Message)
}
// (3) OFFSEC-001 negative assertions — the internal err.Error() text
// from toolCommitMemory ("GLOBAL scope is not permitted via the MCP
// bridge — use LOCAL or TEAM") must NOT appear in the client-visible
// message. Each token below is a distinct substring of that internal
// string; if ANY leaks through, the scrub in mcp.go dispatchRPC has
// regressed and this assertion fires the canary.
leakedTokens := []string{
"GLOBAL", // scope name
"scope", // policy lexicon
"permitted", // policy verb
"bridge", // internal architecture term
"LOCAL", // alternative scope name
"TEAM", // alternative scope name
}
for _, tok := range leakedTokens {
if bytes.Contains([]byte(resp.Error.Message), []byte(tok)) {
t.Errorf("OFFSEC-001 scrub regression: client-visible error.message leaks internal token %q (got: %q)", tok, resp.Error.Message)
}
}
// (4) C3 invariant preserved: handler must short-circuit before any DB call.
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unexpected DB calls on GLOBAL scope block: %v", err)
}
@@ -608,13 +536,7 @@ func TestMCPHandler_CommitMemory_CleanContent_PassesThrough(t *testing.T) {
// tools/call — recall_memory
// ─────────────────────────────────────────────────────────────────────────────
// TestMCPHandler_RecallMemory_GlobalScope_Blocked_ScrubsInternalError verifies
// C3 (GLOBAL scope blocked on MCP bridge) is enforced and that the OFFSEC-001
// scrub contract applies: the client-visible error.message is the constant
// "tool call failed", NOT the descriptive internal reason. The internal reason
// ("GLOBAL scope is not permitted via the MCP bridge") is logged server-side
// but must never reach the wire.
func TestMCPHandler_RecallMemory_GlobalScope_Blocked_ScrubsInternalError(t *testing.T) {
func TestMCPHandler_RecallMemory_GlobalScope_Blocked(t *testing.T) {
h, mock := newMCPHandler(t)
// No DB expectations — handler must abort before touching the DB.
@@ -632,38 +554,10 @@ func TestMCPHandler_RecallMemory_GlobalScope_Blocked_ScrubsInternalError(t *test
})
var resp mcpResponse
if err := json.Unmarshal(w.Body.Bytes(), &resp); err != nil {
t.Fatalf("response is not valid JSON: %v", err)
}
// (1) C3: an error must be reported.
json.Unmarshal(w.Body.Bytes(), &resp)
if resp.Error == nil {
t.Fatal("expected JSON-RPC error for GLOBAL scope recall, got nil")
t.Error("expected JSON-RPC error for GLOBAL scope recall, got nil")
}
// (2) OFFSEC-001 positive assertions — exact equality on the scrubbed
// constants so any change (re-leak of err.Error(), code mutation) trips
// the test.
if resp.Error.Code != -32000 {
t.Errorf("error code should be -32000 (Server error / dispatch-failure), got: %d", resp.Error.Code)
}
if resp.Error.Message != "tool call failed" {
t.Errorf("error message should be the OFFSEC-001 constant %q, got: %q", "tool call failed", resp.Error.Message)
}
// (3) OFFSEC-001 negative assertions — the internal reason must NOT appear
// in the client-visible message.
leakedTokens := []string{
"GLOBAL", // scope name
"scope", // policy lexicon
"permitted", // policy verb
"bridge", // internal architecture term
"LOCAL", // alternative scope name
"TEAM", // alternative scope name
}
for _, tok := range leakedTokens {
if bytes.Contains([]byte(resp.Error.Message), []byte(tok)) {
t.Errorf("OFFSEC-001 scrub regression: client-visible error.message leaks internal token %q (got: %q)", tok, resp.Error.Message)
}
}
// (4) C3 invariant preserved: handler must short-circuit before any DB call.
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unexpected DB calls on GLOBAL scope block: %v", err)
}
@@ -1,539 +0,0 @@
package handlers
import (
"strings"
"testing"
)
// ─────────────────────────────────────────────────────────────────────────────
// countWorkspaces tests
// ─────────────────────────────────────────────────────────────────────────────
func TestCountWorkspaces_Empty(t *testing.T) {
got := countWorkspaces(nil)
if got != 0 {
t.Errorf("nil: got %d, want 0", got)
}
got = countWorkspaces([]OrgWorkspace{})
if got != 0 {
t.Errorf("empty: got %d, want 0", got)
}
}
func TestCountWorkspaces_Flat(t *testing.T) {
tree := []OrgWorkspace{
{Name: "a"},
{Name: "b"},
{Name: "c"},
}
got := countWorkspaces(tree)
if got != 3 {
t.Errorf("flat 3: got %d, want 3", got)
}
}
func TestCountWorkspaces_Nested(t *testing.T) {
//        root          (1)
//      /   |   \       (3 children)
//    c1    c2    c3
//    |           |
//    g1          g2    (2 grandchildren)
tree := []OrgWorkspace{
{
Name: "root",
Children: []OrgWorkspace{
{Name: "child1", Children: []OrgWorkspace{{Name: "grandchild1"}}},
{Name: "child2"},
{Name: "child3", Children: []OrgWorkspace{{Name: "grandchild2"}}},
},
},
}
got := countWorkspaces(tree)
if got != 6 {
t.Errorf("nested: got %d, want 6 (1 root + 3 children + 2 grandchildren)", got)
}
}
func TestCountWorkspaces_DeepNesting(t *testing.T) {
// chain of 5 levels
deep := []OrgWorkspace{
{Name: "L1", Children: []OrgWorkspace{
{Name: "L2", Children: []OrgWorkspace{
{Name: "L3", Children: []OrgWorkspace{
{Name: "L4", Children: []OrgWorkspace{
{Name: "L5"},
}},
}},
}},
}},
}
got := countWorkspaces(deep)
if got != 5 {
t.Errorf("deep chain: got %d, want 5", got)
}
}
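The counts asserted above (0 for nil, 3 flat, 6 nested, 5 for a chain) are consistent with a simple recursive walk. This is a sketch under that assumption; `workspace` and `countNodes` are hypothetical stand-ins, not the repository's `OrgWorkspace` or `countWorkspaces`.

```go
package main

import "fmt"

// workspace is a hypothetical stand-in for OrgWorkspace with only the
// fields the count needs.
type workspace struct {
	Name     string
	Children []workspace
}

// countNodes counts every node in the forest: each workspace plus all of
// its descendants. A nil or empty slice yields 0.
func countNodes(tree []workspace) int {
	n := 0
	for _, w := range tree {
		n += 1 + countNodes(w.Children)
	}
	return n
}

func main() {
	tree := []workspace{{
		Name: "root",
		Children: []workspace{
			{Name: "child1", Children: []workspace{{Name: "grandchild1"}}},
			{Name: "child2"},
			{Name: "child3", Children: []workspace{{Name: "grandchild2"}}},
		},
	}}
	fmt.Println(countNodes(tree)) // 1 root + 3 children + 2 grandchildren
}
```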
// ─────────────────────────────────────────────────────────────────────────────
// envRequirementKey tests
// ─────────────────────────────────────────────────────────────────────────────
func TestEnvRequirementKey_SingleMember(t *testing.T) {
got := envRequirementKey([]string{"API_KEY"})
if got != "API_KEY" {
t.Errorf("single: got %q, want %q", got, "API_KEY")
}
}
func TestEnvRequirementKey_TwoMembers_OrderInsensitive(t *testing.T) {
keyAB := envRequirementKey([]string{"A", "B"})
keyBA := envRequirementKey([]string{"B", "A"})
if keyAB != keyBA {
t.Errorf("order-insensitive: [A,B]=%q, [B,A]=%q — must match", keyAB, keyBA)
}
}
func TestEnvRequirementKey_ThreeMembers_Sorted(t *testing.T) {
key := envRequirementKey([]string{"Z", "A", "M"})
// Should be "A\x00M\x00Z"
want := "A\x00M\x00Z"
if key != want {
t.Errorf("three members sorted: got %q, want %q", key, want)
}
}
func TestEnvRequirementKey_EmptyMembers(t *testing.T) {
got := envRequirementKey(nil)
if got != "" {
t.Errorf("nil: got %q, want empty", got)
}
got = envRequirementKey([]string{})
if got != "" {
t.Errorf("empty: got %q, want empty", got)
}
}
func TestEnvRequirementKey_DuplicateMembers(t *testing.T) {
// Duplicates should be preserved in sort; join still works
key := envRequirementKey([]string{"A", "A", "B"})
want := "A\x00A\x00B"
if key != want {
t.Errorf("duplicates: got %q, want %q", key, want)
}
}
func TestEnvRequirementKey_UsedForDedup(t *testing.T) {
// Real dedup case: {A,B} and {B,A} produce same key → dedup-eligible
// {A,B,C} produces a different key
keyAB := envRequirementKey([]string{"A", "B"})
keyBA := envRequirementKey([]string{"B", "A"})
keyABC := envRequirementKey([]string{"A", "B", "C"})
if keyAB != keyBA {
t.Errorf("AB vs BA: keys must match for dedup")
}
if keyAB == keyABC {
t.Errorf("AB vs ABC: keys must differ")
}
}
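The expectations above (order-insensitive, NUL-separated, duplicates preserved, empty input gives an empty key) suggest a sort-then-join construction. The sketch below is inferred from those tests only; `requirementKey` is a hypothetical name and the actual `envRequirementKey` may differ.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// requirementKey builds an order-insensitive key by sorting a copy of the
// members and joining them with a NUL byte. Duplicates are kept as-is.
// Inferred from the tests; not the repository's implementation.
func requirementKey(members []string) string {
	sorted := append([]string(nil), members...) // copy so the caller's slice is untouched
	sort.Strings(sorted)
	return strings.Join(sorted, "\x00")
}

func main() {
	fmt.Printf("%q\n", requirementKey([]string{"Z", "A", "M"}))
	fmt.Println(requirementKey([]string{"A", "B"}) == requirementKey([]string{"B", "A"}))
}
```

NUL works as a separator here because it cannot appear in a name that passes the uppercase pattern quoted later in the file, so two different member sets cannot collide on the same key.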
// ─────────────────────────────────────────────────────────────────────────────
// sanitizeEnvMembers tests
// ─────────────────────────────────────────────────────────────────────────────
// envVarNamePattern = ^[A-Z][A-Z0-9_]{0,127}$
func TestSanitizeEnvMembers_AllValid(t *testing.T) {
members := []string{"API_KEY", "MY_VAR_2", "A"}
got, ok := sanitizeEnvMembers(members, "test")
if !ok {
t.Error("all valid: ok should be true")
}
if len(got) != len(members) {
t.Errorf("all valid: got %v, want %v", got, members)
}
}
func TestSanitizeEnvMembers_SomeInvalid(t *testing.T) {
// Lowercase first char — invalid
members := []string{"API_KEY", "lowercase", "MY_VAR"}
got, ok := sanitizeEnvMembers(members, "test")
if !ok {
t.Error("one invalid: ok should be true (valid members remain)")
}
want := []string{"API_KEY", "MY_VAR"}
if len(got) != len(want) {
t.Errorf("one invalid: got %v, want %v", got, want)
}
}
func TestSanitizeEnvMembers_AllInvalid_DropsAll(t *testing.T) {
members := []string{"lowercase", "123_START", ""}
got, ok := sanitizeEnvMembers(members, "test")
if ok {
t.Error("all invalid: ok should be false")
}
if len(got) != 0 {
t.Errorf("all invalid: got %v, want empty", got)
}
}
func TestSanitizeEnvMembers_EmptyString_Skipped(t *testing.T) {
// Empty string is filtered but doesn't make ok=false
members := []string{"API_KEY", "", "MY_VAR"}
got, ok := sanitizeEnvMembers(members, "test")
if !ok {
t.Error("empty string in valid list: ok should be true")
}
if len(got) != 2 {
t.Errorf("empty string filtered: got %v, want [API_KEY, MY_VAR]", got)
}
}
func TestSanitizeEnvMembers_MaxLength(t *testing.T) {
// 128 chars: valid (1 prefix + 127 more = 128, all uppercase)
valid := "A" + strings.Repeat("B", 127)
got, ok := sanitizeEnvMembers([]string{valid}, "test")
if !ok {
t.Errorf("128 char valid: ok should be true, got %v", got)
}
// 129 chars: invalid (exceeds {0,127} suffix in regex)
tooLong := "A" + strings.Repeat("B", 128)
got, ok = sanitizeEnvMembers([]string{tooLong}, "test")
if ok {
t.Error("129 char invalid: ok should be false")
}
}
func TestSanitizeEnvMembers_DigitsAndUnderscore(t *testing.T) {
// regex ^[A-Z][A-Z0-9_]{0,127}$ — first char must be A-Z, not underscore
valid := []string{"A1", "A_2", "HTTP_200_OK", "ABC123"}
for _, v := range valid {
got, ok := sanitizeEnvMembers([]string{v}, "test")
if !ok {
t.Errorf("should be valid: %q", v)
}
if len(got) != 1 || got[0] != v {
t.Errorf("got %v, want [%q]", got, v)
}
}
}
// ─────────────────────────────────────────────────────────────────────────────
// flattenAndSortRequirements tests
// ─────────────────────────────────────────────────────────────────────────────
func TestFlattenAndSortRequirements_Empty(t *testing.T) {
got := flattenAndSortRequirements(map[string]EnvRequirement{})
if len(got) != 0 {
t.Errorf("empty: got %d, want 0", len(got))
}
}
func TestFlattenAndSortRequirements_SingleFirst(t *testing.T) {
// Singles come before groups; within singles, alphabetical
reqs := map[string]EnvRequirement{
envRequirementKey([]string{"ZETA"}): {Name: "ZETA"},
envRequirementKey([]string{"ALPHA"}): {Name: "ALPHA"},
}
got := flattenAndSortRequirements(reqs)
if len(got) != 2 {
t.Fatalf("got %d, want 2", len(got))
}
if got[0].Name != "ALPHA" {
t.Errorf("first: got %q, want ALPHA", got[0].Name)
}
if got[1].Name != "ZETA" {
t.Errorf("second: got %q, want ZETA", got[1].Name)
}
}
func TestFlattenAndSortRequirements_GroupsAfterSingles(t *testing.T) {
reqs := map[string]EnvRequirement{
envRequirementKey([]string{"X"}): {Name: "X"}, // single
envRequirementKey([]string{"A", "B"}): {AnyOf: []string{"A", "B"}}, // group
}
got := flattenAndSortRequirements(reqs)
if len(got) != 2 {
t.Fatalf("got %d, want 2", len(got))
}
// Single X comes before any group
if got[0].Name != "X" {
t.Errorf("first should be single X: got %+v", got[0])
}
if len(got[1].AnyOf) != 2 {
t.Errorf("second should be group: got %+v", got[1])
}
}
func TestFlattenAndSortRequirements_GroupsSortedByMemberKey(t *testing.T) {
// Groups sorted by their member-key (envRequirementKey sorts AnyOf members).
// {Z,A} → key "A\x00Z"; {B,C} → key "B\x00C". "A..." < "B..." → A,Z group first.
reqs := map[string]EnvRequirement{
envRequirementKey([]string{"Z", "A"}): {AnyOf: []string{"Z", "A"}}, // key: A\x00Z
envRequirementKey([]string{"B", "C"}): {AnyOf: []string{"B", "C"}}, // key: B\x00C
}
got := flattenAndSortRequirements(reqs)
if len(got) != 2 {
t.Fatalf("got %d, want 2", len(got))
}
// A\x00Z < B\x00C alphabetically, so the A,Z group sorts first
if len(got[0].AnyOf) != 2 || got[0].AnyOf[0] != "Z" {
t.Errorf("first group: got %+v, want [Z,A] (key A\\x00Z sorts before B\\x00C)", got[0])
}
}
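The ordering these tests describe (singles first in alphabetical order, then groups ordered by their sorted member key, with each group's `AnyOf` left in declaration order) can be sketched as follows. `envReq`, `sortKey`, and `flattenSorted` are hypothetical names, and the comparison is inferred from the assertions rather than taken from the implementation.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// envReq is a hypothetical stand-in for EnvRequirement: either a single
// Name or an AnyOf group.
type envReq struct {
	Name  string
	AnyOf []string
}

// sortKey returns the name for a single, or the sorted members joined by
// NUL for a group. The group's own AnyOf slice is not reordered.
func sortKey(r envReq) string {
	if len(r.AnyOf) == 0 {
		return r.Name
	}
	members := append([]string(nil), r.AnyOf...)
	sort.Strings(members)
	return strings.Join(members, "\x00")
}

// flattenSorted turns the map into a slice with singles before groups,
// each class ordered by sortKey.
func flattenSorted(reqs map[string]envReq) []envReq {
	out := make([]envReq, 0, len(reqs))
	for _, r := range reqs {
		out = append(out, r)
	}
	sort.Slice(out, func(i, j int) bool {
		iSingle, jSingle := len(out[i].AnyOf) == 0, len(out[j].AnyOf) == 0
		if iSingle != jSingle {
			return iSingle // singles sort ahead of groups
		}
		return sortKey(out[i]) < sortKey(out[j])
	})
	return out
}

func main() {
	reqs := map[string]envReq{
		"X":      {Name: "X"},
		"A\x00Z": {AnyOf: []string{"Z", "A"}},
		"B\x00C": {AnyOf: []string{"B", "C"}},
	}
	for _, r := range flattenSorted(reqs) {
		fmt.Printf("%q %q\n", r.Name, r.AnyOf)
	}
}
```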
// ─────────────────────────────────────────────────────────────────────────────
// collectOrgEnv tests
// ─────────────────────────────────────────────────────────────────────────────
func TestCollectOrgEnv_SingleRequired(t *testing.T) {
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{{Name: "API_KEY"}},
}
req, rec := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Fatalf("got %d required, want 1", len(req))
}
if req[0].Name != "API_KEY" {
t.Errorf("name: got %q, want API_KEY", req[0].Name)
}
if len(rec) != 0 {
t.Errorf("recommended: got %d, want 0", len(rec))
}
}
func TestCollectOrgEnv_SingleRecommended(t *testing.T) {
tmpl := &OrgTemplate{
RecommendedEnv: []EnvRequirement{{Name: "DEBUG"}},
}
req, rec := collectOrgEnv(tmpl)
if len(req) != 0 {
t.Errorf("required: got %d, want 0", len(req))
}
if len(rec) != 1 {
t.Fatalf("got %d recommended, want 1", len(rec))
}
if rec[0].Name != "DEBUG" {
t.Errorf("name: got %q, want DEBUG", rec[0].Name)
}
}
func TestCollectOrgEnv_AnyOfGroup(t *testing.T) {
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{{AnyOf: []string{"AWS_KEY", "GCP_KEY", "AZURE_KEY"}}},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Fatalf("got %d, want 1", len(req))
}
if len(req[0].AnyOf) != 3 {
t.Errorf("any_of members: got %v, want [AWS_KEY, GCP_KEY, AZURE_KEY]", req[0].AnyOf)
}
}
func TestCollectOrgEnv_InvalidNamesFiltered(t *testing.T) {
// "lowercase" and "" fail envVarNamePattern → silently dropped
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{{AnyOf: []string{"VALID_KEY", "lowercase", ""}}},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Fatalf("invalid names filtered: got %d, want 1", len(req))
}
if len(req[0].AnyOf) != 1 || req[0].AnyOf[0] != "VALID_KEY" {
t.Errorf("valid names kept: got %v", req[0].AnyOf)
}
}
func TestCollectOrgEnv_GroupWithOneInvalid_KeepsRest(t *testing.T) {
// Mixed: one valid + one invalid → valid member is kept, invalid dropped
// regex requires ^[A-Z][A-Z0-9_]{0,127}$, so lowercase names are invalid
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{{AnyOf: []string{"GOOD_KEY", "lowercase_invalid"}}},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Fatalf("got %d, want 1", len(req))
}
if len(req[0].AnyOf) != 1 || req[0].AnyOf[0] != "GOOD_KEY" {
t.Errorf("kept valid member: got %v, want [GOOD_KEY]", req[0].AnyOf)
}
}
func TestCollectOrgEnv_AllInvalidGroup_Dropped(t *testing.T) {
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{{AnyOf: []string{"lowercase", ""}}},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 0 {
t.Errorf("all-invalid group: got %d, want 0", len(req))
}
}
func TestCollectOrgEnv_RequiredSingleDominatesAnyOfGroup(t *testing.T) {
// Required: API_KEY (strict)
// Required: any_of [API_KEY, ALT_KEY]
// → the any_of group is redundant (API_KEY satisfies it already)
// → any_of group should be dropped from required
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{
{Name: "API_KEY"},
{AnyOf: []string{"API_KEY", "ALT_KEY"}},
},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Fatalf("strict dominates group: got %d entries, want 1", len(req))
}
if req[0].Name != "API_KEY" {
t.Errorf("strict: got %+v, want name=API_KEY", req[0])
}
}
func TestCollectOrgEnv_RequiredSingleDominatesRecommendedAnyOf(t *testing.T) {
// Required: FOO (strict)
// Recommended: any_of [FOO, BAR]
// → FOO is already required; the recommended any_of is redundant
// → recommended any_of should be dropped
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{{Name: "FOO"}},
RecommendedEnv: []EnvRequirement{{AnyOf: []string{"FOO", "BAR"}}},
}
req, rec := collectOrgEnv(tmpl)
if len(req) != 1 || req[0].Name != "FOO" {
t.Errorf("required: got %+v", req)
}
if len(rec) != 0 {
t.Errorf("recommended any_of dominated by strict: got %d, want 0", len(rec))
}
}
func TestCollectOrgEnv_SameTierStrictDominatesGroup(t *testing.T) {
// Both in required: X (strict), any_of [X, Y] (group)
// Strict X makes the any_of redundant within the same tier
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{
{Name: "X"},
{AnyOf: []string{"X", "Y"}},
},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Fatalf("got %d, want 1", len(req))
}
if req[0].Name != "X" {
t.Errorf("strict dominates same-tier group: got %+v", req[0])
}
}
func TestCollectOrgEnv_WorkspaceLevel(t *testing.T) {
// Workspaces can also declare required/recommended env
tmpl := &OrgTemplate{
Workspaces: []OrgWorkspace{
{
Name: "Dev",
RequiredEnv: []EnvRequirement{{Name: "DEV_KEY"}},
RecommendedEnv: []EnvRequirement{{Name: "DEV_TOOL"}},
},
},
}
req, rec := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Fatalf("workspace required: got %d, want 1", len(req))
}
if req[0].Name != "DEV_KEY" {
t.Errorf("workspace required: got %v", req[0])
}
if len(rec) != 1 {
t.Fatalf("workspace recommended: got %d, want 1", len(rec))
}
if rec[0].Name != "DEV_TOOL" {
t.Errorf("workspace recommended: got %v", rec[0])
}
}
func TestCollectOrgEnv_DeepNesting(t *testing.T) {
// Nested children also contribute env requirements
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{{Name: "ORG_LEVEL"}},
Workspaces: []OrgWorkspace{
{
Name: "Root",
RequiredEnv: []EnvRequirement{{Name: "ROOT_LEVEL"}},
Children: []OrgWorkspace{
{
Name: "Child",
RequiredEnv: []EnvRequirement{{Name: "CHILD_LEVEL"}},
Children: []OrgWorkspace{
{Name: "GrandChild", RecommendedEnv: []EnvRequirement{{Name: "GRANDCHILD_TOOL"}}},
},
},
},
},
},
}
req, rec := collectOrgEnv(tmpl)
if len(req) != 3 {
t.Errorf("3 required levels: got %d: %+v", len(req), req)
}
if len(rec) != 1 {
t.Errorf("1 recommended: got %d: %+v", len(rec), rec)
}
}
func TestCollectOrgEnv_DedupAcrossTiers(t *testing.T) {
// Same key declared at org level AND workspace level → deduped to 1
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{{Name: "SHARED"}},
Workspaces: []OrgWorkspace{
{Name: "ws", RequiredEnv: []EnvRequirement{{Name: "SHARED"}}},
},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Errorf("dedup across tiers: got %d, want 1", len(req))
}
}
func TestCollectOrgEnv_DedupWithinGroup(t *testing.T) {
// Same key declared multiple times within required → deduped
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{
{Name: "DUPE"},
{Name: "DUPE"},
},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 1 {
t.Errorf("dedup within tier: got %d, want 1", len(req))
}
}
func TestCollectOrgEnv_MixedCasePreservesSort(t *testing.T) {
// Sort order: singles first (alpha), then groups (by member-key)
tmpl := &OrgTemplate{
RequiredEnv: []EnvRequirement{
{Name: "ZETA"},
{Name: "ALPHA"},
{AnyOf: []string{"B", "A"}}, // key: A\x00B
{AnyOf: []string{"Y", "X"}}, // key: X\x00Y
},
}
req, _ := collectOrgEnv(tmpl)
if len(req) != 4 {
t.Fatalf("got %d, want 4", len(req))
}
// Singles first
if req[0].Name != "ALPHA" {
t.Errorf("single ALPHA first: got %+v", req[0])
}
if req[1].Name != "ZETA" {
t.Errorf("single ZETA second: got %+v", req[1])
}
// Groups after singles; A,B (key A\x00B) < X,Y (key X\x00Y)
if len(req[2].AnyOf) != 2 {
t.Errorf("third should be group: got %+v", req[2])
}
if req[2].AnyOf[0] != "B" { // AnyOf keeps declaration order, so the [B, A] group still starts with "B"
t.Errorf("A,B group should come first: got %+v", req[2])
}
}
@@ -1,242 +0,0 @@
package handlers
import (
"crypto/sha256"
"database/sql"
"net/http"
"net/http/httptest"
"testing"
"github.com/DATA-DOG/go-sqlmock"
"github.com/Molecule-AI/molecule-monorepo/platform/internal/ws"
"github.com/gin-gonic/gin"
)
// newSocketHandlerWithDB creates a SocketHandler with buffered Hub channels.
// The DB is set up via setupTestDB (called before this function in each test).
func newSocketHandlerWithDB(t *testing.T, hub *ws.Hub) *SocketHandler {
t.Helper()
if hub == nil {
hub = &ws.Hub{
Register: make(chan *ws.Client, 1),
Unregister: make(chan *ws.Client, 1),
}
}
return NewSocketHandler(hub)
}
// socketRequest builds a test request for the WebSocket connect endpoint.
func socketRequest(method, path, workspaceID, authHeader string) *http.Request {
req := httptest.NewRequest(method, path, nil)
if workspaceID != "" {
req.Header.Set("X-Workspace-ID", workspaceID)
}
if authHeader != "" {
req.Header.Set("Authorization", authHeader)
}
return req
}
// ─────────────────────────────────────────────────────────────────────────────
// Auth gate: DB error on HasAnyLiveToken → 500
// ─────────────────────────────────────────────────────────────────────────────
func TestSocketHandler_AuthGate_DBError_Returns500(t *testing.T) {
mock := setupTestDB(t)
handler := newSocketHandlerWithDB(t, nil)
// HasAnyLiveToken issues a query; make it return an error.
mock.ExpectQuery("SELECT COUNT").
WithArgs("ws-auth-db-err").
WillReturnError(sql.ErrConnDone)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = socketRequest("GET", "/ws", "ws-auth-db-err", "")
handler.HandleConnect(c)
if w.Code != http.StatusInternalServerError {
t.Errorf("DB error: expected 500, got %d", w.Code)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet mock expectations: %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Auth gate: workspace HAS live token, missing Bearer → 401
// ─────────────────────────────────────────────────────────────────────────────
func TestSocketHandler_AuthGate_HasLiveToken_MissingBearer_Returns401(t *testing.T) {
mock := setupTestDB(t)
handler := newSocketHandlerWithDB(t, nil)
// HasAnyLiveToken succeeds → workspace has a live token.
mock.ExpectQuery("SELECT COUNT").
WithArgs("ws-has-token-no-bearer").
WillReturnRows(sqlmock.NewRows([]string{"n"}).AddRow(1))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = socketRequest("GET", "/ws", "ws-has-token-no-bearer", "")
handler.HandleConnect(c)
if w.Code != http.StatusUnauthorized {
t.Errorf("hasLive but no bearer: expected 401, got %d", w.Code)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet mock expectations: %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Auth gate: workspace HAS live token, invalid Bearer → 401
// ─────────────────────────────────────────────────────────────────────────────
func TestSocketHandler_AuthGate_HasLiveToken_InvalidBearer_Returns401(t *testing.T) {
mock := setupTestDB(t)
handler := newSocketHandlerWithDB(t, nil)
wsID := "ws-invalid-token"
badToken := "not-a-valid-token"
// HasAnyLiveToken: workspace has a live token.
mock.ExpectQuery("SELECT COUNT").
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"n"}).AddRow(1))
// ValidateToken: lookupTokenByHash returns ErrNoRows for an unknown token.
// Any token hash is fine since the token doesn't exist — use AnyArg.
mock.ExpectQuery(`SELECT t\.id, t\.workspace_id.*FROM workspace_auth_tokens t.*JOIN`).
WithArgs(sqlmock.AnyArg()).
WillReturnError(sql.ErrNoRows)
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = socketRequest("GET", "/ws", wsID, "Bearer "+badToken)
handler.HandleConnect(c)
if w.Code != http.StatusUnauthorized {
t.Errorf("hasLive but invalid bearer: expected 401, got %d", w.Code)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet mock expectations: %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Auth gate: workspace HAS live token, VALID Bearer → upgrade attempted.
// The WebSocket upgrade itself will fail in httptest (gorilla/websocket
// cannot write a real HTTP/1.1 handshake to httptest.ResponseRecorder), but
// the auth gate is passed so we verify no 401/500 was returned before the
// upgrade failure. This is the canvas-client success path.
// ─────────────────────────────────────────────────────────────────────────────
func TestSocketHandler_AuthGate_HasLiveToken_ValidBearer_AuthPassed(t *testing.T) {
mock := setupTestDB(t)
handler := newSocketHandlerWithDB(t, nil)
wsID := "ws-valid-token"
goodToken := "valid-ws-token-123"
// HasAnyLiveToken: workspace has a live token.
mock.ExpectQuery("SELECT COUNT").
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"n"}).AddRow(1))
// ValidateToken: token found and workspace is not removed.
// sha256TokenHash returns []byte; the default argument matcher compares it against the hash passed to the query.
mock.ExpectQuery(`SELECT t\.id, t\.workspace_id.*FROM workspace_auth_tokens t.*JOIN`).
WithArgs(sha256TokenHash(goodToken)).
WillReturnRows(sqlmock.NewRows([]string{"token_id", "workspace_id"}).
AddRow("tok-abc", wsID))
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = socketRequest("GET", "/ws", wsID, "Bearer "+goodToken)
handler.HandleConnect(c)
// The WebSocket upgrade fails in httptest (httptest.ResponseRecorder is not
// a real TCP connection), but the auth gate itself succeeded — we should
// NOT see a 401 or 500 response code. The actual code depends on the
// upgrade error handling; the critical assertion is that auth passed.
if w.Code == http.StatusUnauthorized || w.Code == http.StatusInternalServerError {
t.Errorf("valid token: auth should have passed; got %d", w.Code)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet mock expectations: %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Canvas client (no X-Workspace-ID): auth gate bypassed, upgrade attempted.
// Same httptest limitation as above — we verify no 401/500 before the upgrade.
// ─────────────────────────────────────────────────────────────────────────────
func TestSocketHandler_CanvasClient_NoAuthGate(t *testing.T) {
mock := setupTestDB(t)
handler := newSocketHandlerWithDB(t, nil)
// No X-Workspace-ID header → no auth check → no DB queries expected.
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = socketRequest("GET", "/ws", "", "") // no workspace ID
handler.HandleConnect(c)
// No auth gate hit → no 401/500. The WebSocket upgrade itself will fail
// in httptest, but that's expected (see TestSocketHandler_AuthGate_HasLiveToken_ValidBearer_AuthPassed).
if w.Code == http.StatusUnauthorized || w.Code == http.StatusInternalServerError {
t.Errorf("canvas client: expected no auth error; got %d", w.Code)
}
if err := mock.ExpectationsWereMet(); err != nil {
t.Errorf("unmet mock expectations: %v", err)
}
}
// ─────────────────────────────────────────────────────────────────────────────
// Auth gate: workspace HAS live token, empty Bearer → 401.
// Because the workspace has a live token, the handler MUST validate the
// presented token rather than grandfather the request through. This is the
// Phase 30.1/30.2 contract: a workspace with tokens on file is NOT
// grandfathered.
func TestSocketHandler_AuthGate_HasLiveToken_EmptyBearer_Returns401(t *testing.T) {
mock := setupTestDB(t)
handler := newSocketHandlerWithDB(t, nil)
wsID := "ws-has-live-token-empty-bearer"
// HasAnyLiveToken: workspace has a live token.
mock.ExpectQuery("SELECT COUNT").
WithArgs(wsID).
WillReturnRows(sqlmock.NewRows([]string{"n"}).AddRow(1))
// Authorization header is "Bearer " (empty token after "Bearer ").
// wsauth.BearerTokenFromHeader strips "Bearer " and gets "".
// ValidateToken is called with "" → returns ErrInvalidToken before DB hit.
w := httptest.NewRecorder()
c, _ := gin.CreateTestContext(w)
c.Request = socketRequest("GET", "/ws", wsID, "Bearer ")
handler.HandleConnect(c)
if w.Code != http.StatusUnauthorized {
t.Errorf("empty bearer after Bearer prefix: expected 401, got %d", w.Code)
}
}
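The auth-gate tests above pin a small decision table. A minimal sketch of that table follows, assuming a hypothetical pure function `gateStatus` (not part of the handler) that returns 0 when the gate passes and the upgrade would be attempted. The "no live token means grandfathered" branch is an assumption inferred from the Phase 30.1/30.2 comment, not something these tests show.

```go
package main

import (
	"fmt"
	"net/http"
)

// gateStatus is a hypothetical, simplified model of the auth gate.
// It returns 0 when the gate passes (upgrade would be attempted) and an
// HTTP status code when the request is rejected.
func gateStatus(workspaceID string, hasLiveToken, tokenValid bool) int {
	if workspaceID == "" {
		// Canvas client: no X-Workspace-ID header, so no auth check.
		return 0
	}
	if !hasLiveToken {
		// Assumption: a workspace with no tokens on file is grandfathered.
		return 0
	}
	if !tokenValid {
		// Covers both an unknown token and an empty "Bearer " value.
		return http.StatusUnauthorized
	}
	return 0
}

func main() {
	fmt.Println(gateStatus("", false, false))                // 0
	fmt.Println(gateStatus("ws-invalid-token", true, false)) // 401
	fmt.Println(gateStatus("ws-valid-token", true, true))    // 0
}
```

Keeping the decision in one pure function like this would let the table be tested without a DB mock, though the real handler interleaves it with the queries shown in the tests.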
// ─────────────────────────────────────────────────────────────────────────────
// helpers
// ─────────────────────────────────────────────────────────────────────────────
// sha256TokenHash returns the SHA-256 hash of a plaintext token, matching what
// wsauth.ValidateToken does internally before querying the DB.
func sha256TokenHash(plaintext string) []byte {
h := sha256.Sum256([]byte(plaintext))
return h[:]
}
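For reference, a small standalone sketch of the hashing the helper relies on. `tokenHash` is a hypothetical stand-in for `sha256TokenHash`; the point is only that the digest is a deterministic 32-byte value, which is why the mock can expect the exact hash for a known token while the unknown-token test falls back to `AnyArg`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// tokenHash mirrors the helper above: hash the plaintext and return the
// digest as a slice (sha256.Sum256 returns a fixed-size [32]byte array).
func tokenHash(plaintext string) []byte {
	sum := sha256.Sum256([]byte(plaintext))
	return sum[:]
}

func main() {
	a := tokenHash("valid-ws-token-123")
	b := tokenHash("valid-ws-token-123")
	c := tokenHash("not-a-valid-token")

	fmt.Println(len(a))                                         // 32
	fmt.Println(hex.EncodeToString(a) == hex.EncodeToString(b)) // true
	fmt.Println(hex.EncodeToString(a) == hex.EncodeToString(c)) // false
}
```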
@@ -43,27 +43,6 @@ func (s *syncBuf) String() string {
return s.b.String()
}
// unwrapGoError extracts subprocess output from a Go-wrapped error string
// that embeds combined output, e.g. the one sendSSHPublicKey builds with
// fmt.Errorf("send-ssh-public-key: %w (%s)", err, combinedOut). It returns
// the text inside the trailing "(...)": the actionable subprocess signal such
// as "AccessDeniedException: ... is not authorized to perform:
// ec2-instance-connect:OpenTunnel". Returns "" when the string contains no
// "(" (no output was embedded).
func unwrapGoError(errMsg string) string {
// Extract content between the last '(' and trailing ')'. The
// sendSSHPublicKey wrapper uses fmt.Errorf("...: %w (%s)", err, combinedOut)
// so the subprocess stderr is always the last parenthesised segment,
// e.g. "send-ssh-public-key: exit status 1 (AccessDeniedException: ...)"
// — note the closing ')' is at the very end with no trailing space.
open := strings.LastIndex(errMsg, "(")
if open < 0 {
return ""
}
inner := errMsg[open+1:]
return strings.TrimSuffix(inner, ")")
}
// HandleDiagnose handles GET /workspaces/:id/terminal/diagnose. It runs the
// same per-step pipeline as HandleConnect (ssh-keygen → EIC send-key → tunnel
// → ssh) but non-interactively, captures the first failing step and its
@@ -235,18 +214,12 @@ func (h *TerminalHandler) diagnoseRemote(ctx context.Context, workspaceID, insta
}
// Step 2: send-ssh-public-key (AWS Instance Connect)
// mc#687: populate Detail so the E2E smoke sees the AWS permission error
// verbatim. The subprocess stderr (e.g. "AccessDeniedException: ... is not
// authorized to perform: ec2-instance-connect:OpenTunnel") is captured by
// sendSSHPublicKey's CombinedOutput() and embedded in the Go error string.
t0 = time.Now()
if err := sendSSHPublicKey(ctx, region, instanceID, osUser, strings.TrimSpace(string(pubKey))); err != nil {
errMsg := err.Error()
return stop("send-ssh-public-key", diagnoseStep{
Name: "send-ssh-public-key",
DurationMs: time.Since(t0).Milliseconds(),
Error: errMsg,
Detail: unwrapGoError(errMsg),
Error: err.Error(),
})
}
res.Steps = append(res.Steps, diagnoseStep{Name: "send-ssh-public-key", OK: true, DurationMs: time.Since(t0).Milliseconds()})
@@ -245,50 +245,3 @@ func TestDiagnoseRemote_StopsAtSSHProbe(t *testing.T) {
}
}
// TestUnwrapGoError pins the unwrapGoError helper that extracts subprocess
// stderr from the Go-wrapped error string produced by sendSSHPublicKey.
// Regression gate for mc#687: the E2E smoke now reads detail (not error),
// so detail MUST contain the actionable AWS permission signal.
func TestUnwrapGoError(t *testing.T) {
cases := []struct {
name string
input string
want string
}{
{
name: "AWS permission denied",
input: "send-ssh-public-key: exec: exit status 1 (AccessDeniedException: User: arn:aws:iam::123456789012:role/TestRole is not authorized to perform: ec2-instance-connect:OpenTunnel)",
want: "AccessDeniedException: User: arn:aws:iam::123456789012:role/TestRole is not authorized to perform: ec2-instance-connect:OpenTunnel",
},
{
name: "generic exec error no output",
input: "send-ssh-public-key: exec: exit status 1",
want: "",
},
{
name: "empty string",
input: "",
want: "",
},
{
name: "short string without parentheses",
input: "err",
want: "",
},
{
name: "no parentheses",
input: "send-ssh-public-key: something went wrong",
want: "",
},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
got := unwrapGoError(tc.input)
if got != tc.want {
t.Errorf("unwrapGoError(%q): got %q, want %q", tc.input, got, tc.want)
}
})
}
}
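The table above does not cover subprocess output that itself contains parentheses. The sketch below re-implements the same last-"(" extraction under a hypothetical name, `lastParenSegment`, to show what it would return in that case; the nested input string is made up for illustration and is not taken from real AWS output.

```go
package main

import (
	"fmt"
	"strings"
)

// lastParenSegment applies the same rule as unwrapGoError: take everything
// after the last "(" and drop a single trailing ")".
func lastParenSegment(errMsg string) string {
	open := strings.LastIndex(errMsg, "(")
	if open < 0 {
		return ""
	}
	return strings.TrimSuffix(errMsg[open+1:], ")")
}

func main() {
	// Simple case: one parenthesised segment at the end.
	fmt.Printf("%q\n", lastParenSegment("send-ssh-public-key: exit status 1 (AccessDeniedException: denied)"))
	// No embedded output at all.
	fmt.Printf("%q\n", lastParenSegment("send-ssh-public-key: exit status 1"))
	// Hypothetical nested case: the last "(" sits inside the output, so the
	// result is a fragment rather than the full embedded message.
	fmt.Printf("%q\n", lastParenSegment("step: exit status 1 (outer (inner) tail)"))
}
```

This prints `"AccessDeniedException: denied"`, `""`, and `"inner) tail"`. The last result suggests the approach is only safe when the embedded output has no parentheses of its own.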