Sweep companion to PR#372 (ci.yml port), PR#378 (Cat A), PR#379 (Cat B).
Ports 9 workflow files from .github/workflows/ to .gitea/workflows/.

Each port applies the four-surface audit pattern per
feedback_gitea_actions_migration_audit_pattern:

1. YAML: dropped workflow_dispatch.inputs (Gitea 1.22.6 parser
   rejects them per feedback_gitea_workflow_dispatch_inputs_unsupported),
   dropped merge_group (no Gitea merge queue), and pinned workflow-level
   env.GITHUB_SERVER_URL per feedback_act_runner_github_server_url.
2. Cache: actions/setup-python cache:pip retained (works with Gitea
   1.22.x cache server). No actions/cache@v4 usage in this batch.
3. Token: auto-injected GITHUB_TOKEN (Gitea-aliased) used; no
   custom dispatch tokens.
4. Docs: top-of-file "Ported from .github/workflows/X.yml on
   2026-05-11 per RFC internal#219 §1 sweep" comment on every file.

Per RFC §1: each job has `continue-on-error: true` so surfaced
defects do not block PRs. A follow-up PR (not in this sweep's scope)
flips to `continue-on-error: false` after triage.

Files ported:
- block-internal-paths.yml — forbidden-path PR gate. Standard port;
dropped merge_group + the merge_group-specific fetch step.
- cascade-list-drift-gate.yml — TEMPLATES vs manifest.json drift.
Passes WORKFLOW=.gitea/workflows/publish-runtime.yml to the script
(script's default is .github/... which Cat A removes).
- check-migration-collisions.yml — Postgres migration prefix
collision gate. The collision script already supports Gitea via
_gitea_api_url() / _gitea_token() — no script edit needed.
- lint-curl-status-capture.yml — workflow-bash anti-pattern lint.
Scanner glob and SELF self-skip path retargeted to .gitea/workflows/**.yml.
- runtime-pin-compat.yml — PyPI-latest install + import smoke.
Dropped workflow_dispatch + merge_group.
- runtime-prbuild-compat.yml — PR-built wheel import smoke.
dorny/paths-filter@v4 replaced with inline `git diff` per PR#372
pattern. detect-changes job + per-step if-gates retained.
- secret-pattern-drift.yml — canonical/consumer pattern set drift
lint. on.paths references the .gitea/ canonical path. Also edits
.github/scripts/lint_secret_pattern_drift.py CANONICAL_FILE
constant from `.github/workflows/secret-scan.yml` to
`.gitea/workflows/secret-scan.yml` (Cat A removes the .github/
one).
- test-ops-scripts.yml — scripts/ unittest runner. Dropped merge_group.
- railway-pin-audit.yml — daily Railway env var drift detection.
`actions/github-script@v9` blocks (which call github.rest.* — a
GitHub-specific JS API) replaced with curl calls against the
Gitea REST API (/api/v1/repos/.../issues|comments). Issue
open/comment-on-repeat/close-on-clean semantics preserved.
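
The open/comment-on-repeat/close-on-clean semantics described for railway-pin-audit.yml can be sketched as a small decision function. This is a hedged illustration in Python, not the workflow's actual curl script: `plan_issue_action`, `ApiCall`, `ISSUE_TITLE`, and the `repo` argument are hypothetical names, and the reading of "repeat" (drift found while an audit issue is already open) is an assumption. The paths follow the Gitea REST API issue endpoints under /api/v1/repos/.

```python
from __future__ import annotations

from dataclasses import dataclass

# Hypothetical title used to find and open the audit issue.
ISSUE_TITLE = "Railway pin audit: drift detected"


@dataclass(frozen=True)
class ApiCall:
    """One planned Gitea REST request (nothing is sent here)."""
    method: str
    path: str
    payload: dict


def plan_issue_action(
    repo: str,
    drift_report: str | None,
    open_issue_index: int | None,
) -> ApiCall | None:
    """Decide which single API call an audit run should make.

    repo             -- "owner/name" (assumed format)
    drift_report     -- text describing the drift, or None when clean
    open_issue_index -- index of an already-open audit issue, if any
    """
    base = f"/api/v1/repos/{repo}/issues"
    if drift_report:
        if open_issue_index is None:
            # First detection: open a new issue.
            return ApiCall("POST", base, {"title": ISSUE_TITLE, "body": drift_report})
        # Repeat detection: comment on the existing issue.
        return ApiCall(
            "POST", f"{base}/{open_issue_index}/comments", {"body": drift_report}
        )
    if open_issue_index is not None:
        # Clean run with an open issue: close it.
        return ApiCall("PATCH", f"{base}/{open_issue_index}", {"state": "closed"})
    # Clean run, nothing open: no call needed.
    return None
```

Keeping the decision separate from the HTTP call makes the three cases easy to check without a Gitea instance; the real workflow performs the equivalent requests with curl.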

This Cat C-1 PR groups the "safer" gates/lints/audits. Categories
C-2 (E2E) and C-3 (deploy/publish/janitors) ship in separate PRs.

The original .github/ files are left in place per RFC §1 (deletion
is a Phase 4 follow-up). They are silently dead (Gitea Actions in
molecule-core only registers workflows under .gitea/workflows/),
but keeping them documented in-repo eases the diff-review.

DO NOT MERGE without orchestrator-dispatched Five-Axis review +
@hongmingwang chat-go.

Cross-links:
- RFC: molecule-ai/internal#219
- Companion: PR#372 (ci.yml port), PR#378 (Cat A), PR#379 (Cat B)
- Runbook: runbooks/gitea-actions-migration-checklist.md (Cat B PR)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
167 lines
6.5 KiB
Python
#!/usr/bin/env python3
"""Lint SECRET_PATTERNS drift across known consumers of molecule-core's canonical.

The canonical SECRET_PATTERNS array in
.github/workflows/secret-scan.yml is mirrored by every other side
that scans for credentials: the workspace-runtime's bundled
pre-commit hook, the molecule-controlplane inlined copy, etc. The
mirror is enforced socially today — when someone adds a new pattern
to canonical (e.g. the sk-cp- MiniMax token after F1088), the other
sides are supposed to be updated in lockstep.

This script automates the check. Diffs the canonical's pattern set
against each known public consumer and exits non-zero on any
mismatch. Wired into a daily cron + on-push gate via
.github/workflows/secret-pattern-drift.yml.

Private-repo consumers (currently molecule-controlplane's inlined
copy) are out of scope here because the molecule-core workflow's
GITHUB_TOKEN can't read other private repos in the org. They're
expected to self-monitor via their own copy of this script — not a
hard barrier, just a future expansion.
"""

from __future__ import annotations

import re
import sys
import urllib.request
from pathlib import Path

CANONICAL_FILE = Path(".gitea/workflows/secret-scan.yml")

# Public consumer mirrors. Each entry is (label, raw_url) — raw_url
# points at the file's RAW content on the consumer's default branch
# (or staging where applicable). Add an entry here when a new public
# repo starts shipping its own SECRET_PATTERNS array.
CONSUMERS: list[tuple[str, str]] = [
    (
        "molecule-ai-workspace-runtime/molecule_runtime/scripts/pre-commit-checks.sh",
        "https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-runtime/raw/branch/main/molecule_runtime/scripts/pre-commit-checks.sh",
    ),
]

# In-repo consumers — paths read locally from the workflow checkout.
# Read-from-disk avoids the staging→main lag that the URL fetcher
# would hit (a freshly-edited canonical wouldn't yet be on the
# consumer's default branch). Same drift semantics, no network.
LOCAL_CONSUMERS: list[tuple[str, Path]] = [
    (
        ".githooks/pre-commit (molecule-core local hook)",
        Path(".githooks/pre-commit"),
    ),
]

# Matches the SECRET_PATTERNS=( ... ) array in either yaml-indented
# (the canonical workflow's `run:` block) or shell-flat (runtime
# hook) format. Patterns inside are single-quoted Bash strings; we
# pull each via _PATTERN_RE.
#
# Closing `)` is anchored to the start of a line (possibly indented)
# because pattern comments like `# GitHub PAT (classic)` contain
# their own `)` mid-line — a non-anchored regex would match through
# the comment's paren and capture only the first pattern.
_ARRAY_RE = re.compile(r"SECRET_PATTERNS=\((.*?)^\s*\)", re.DOTALL | re.MULTILINE)
_PATTERN_RE = re.compile(r"'([^']+)'")


def extract_patterns(content: str, source_label: str) -> list[str]:
    """Pull the SECRET_PATTERNS list out of either format. Raises if missing."""
    m = _ARRAY_RE.search(content)
    if not m:
        raise SystemExit(f"::error::{source_label}: SECRET_PATTERNS=(...) array not found")
    return _PATTERN_RE.findall(m.group(1))


def fetch(url: str) -> str:
    req = urllib.request.Request(
        url, headers={"User-Agent": "secret-pattern-drift-lint/1"}
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read().decode("utf-8")


def diff_patterns(canonical: list[str], consumer: list[str]) -> tuple[list[str], list[str]]:
    """Return (missing_from_consumer, extra_in_consumer) — both sorted."""
    canonical_set = set(canonical)
    consumer_set = set(consumer)
    return (
        sorted(canonical_set - consumer_set),
        sorted(consumer_set - canonical_set),
    )


def main() -> int:
    if not CANONICAL_FILE.exists():
        print(f"::error::canonical not found at {CANONICAL_FILE}")
        return 1

    canonical = extract_patterns(CANONICAL_FILE.read_text(), str(CANONICAL_FILE))
    print(f"canonical ({CANONICAL_FILE}): {len(canonical)} patterns")

    drift = False

    # In-repo consumers first — these are read from the workflow's own
    # checkout, so they never lag behind the canonical and a missing
    # file IS a real error (not a fetch warning).
    for label, path in LOCAL_CONSUMERS:
        if not path.exists():
            print(f"::error::{label}: file not found at {path}")
            drift = True
            continue
        consumer = extract_patterns(path.read_text(), label)
        missing, extra = diff_patterns(canonical, consumer)
        if not missing and not extra:
            print(f" ✓ {label}: aligned ({len(consumer)} patterns)")
            continue
        drift = True
        print(f"::error::DRIFT in {label}:")
        for p in missing:
            print(f" - missing from consumer: {p!r}")
        for p in extra:
            print(f" - extra in consumer (not in canonical): {p!r}")

    for label, url in CONSUMERS:
        try:
            content = fetch(url)
        except Exception as e:
            # Fetch failures are warnings, not errors. A consumer
            # whose default branch was just renamed (or whose file
            # moved) shouldn't fail the lint until someone updates
            # the URL above. Real drift is the failure mode this
            # gate exists to catch — fetch reliability isn't.
            print(f"::warning::{label}: fetch failed ({e}) — skipping")
            continue

        consumer = extract_patterns(content, label)
        missing, extra = diff_patterns(canonical, consumer)
        if not missing and not extra:
            print(f" ✓ {label}: aligned ({len(consumer)} patterns)")
            continue

        drift = True
        print(f"::error::DRIFT in {label}:")
        for p in missing:
            print(f" - missing from consumer: {p!r}")
        for p in extra:
            print(f" - extra in consumer (not in canonical): {p!r}")

    if drift:
        print()
        print("::error::SECRET_PATTERNS drift detected. Bring consumer(s) into")
        print("alignment with the canonical SECRET_PATTERNS array in")
        print(f"{CANONICAL_FILE} by adding the missing patterns and removing")
        print("any extras. The two sides must stay byte-aligned on the pattern")
        print("list — the runtime hook is the developer's local pre-commit,")
        print("the canonical is the org-wide CI gate, divergence means a token")
        print("can pass one but get rejected by the other.")
        return 1

    print()
    print("✓ All known consumers aligned with canonical SECRET_PATTERNS.")
    return 0


if __name__ == "__main__":
    sys.exit(main())
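
The comment above `_ARRAY_RE` explains why the closing `)` is anchored to the start of a line. A minimal sketch of that behavior follows, assuming a made-up `SAMPLE` array: the two regex entries and the helper name `patterns` are illustrative only and are not taken from the real secret-scan.yml. It compares the script's anchored expression with a hypothetical unanchored variant.

```python
import re

# Illustrative input only; the real SECRET_PATTERNS entries are not shown here.
SAMPLE = """\
SECRET_PATTERNS=(
  'ghp_[A-Za-z0-9]{36}'     # GitHub PAT (classic)
  'sk-cp-[A-Za-z0-9]{20,}'  # illustrative second entry
)
"""

# Same expression as _ARRAY_RE in the script: the closing paren must be
# the first non-whitespace character on its line.
ANCHORED = re.compile(r"SECRET_PATTERNS=\((.*?)^\s*\)", re.DOTALL | re.MULTILINE)
# Hypothetical naive variant: stops at the first ")" anywhere.
UNANCHORED = re.compile(r"SECRET_PATTERNS=\((.*?)\)", re.DOTALL)
QUOTED = re.compile(r"'([^']+)'")


def patterns(array_re: re.Pattern, text: str) -> list:
    """Return the single-quoted entries inside the matched array body."""
    m = array_re.search(text)
    return QUOTED.findall(m.group(1)) if m else []


anchored = patterns(ANCHORED, SAMPLE)
unanchored = patterns(UNANCHORED, SAMPLE)

print(anchored)    # both entries
print(unanchored)  # only the first: the match ends at the ")" in "(classic)"
```

With the unanchored variant, a drift check would silently compare one-element lists, which is the failure mode the anchored form avoids.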