molecule-ci/.github/workflows/validate-workspace-template.yml
security-auditor 7e2bde9b77 fix(ci): force anon checkout of public molecule-ci to bypass Gitea cross-repo 404
After lowercasing the slug (molecule-ci#1) and flipping molecule-ci public,
plugin/template/org-template CI still failed at the SECOND actions/checkout
step (the one that fetches molecule-ci itself for canonical validator scripts).

Failure mode in act_runner log:
  Run actions/checkout@v4
    repository: molecule-ai/molecule-ci
    path: .molecule-ci-canonical
  Syncing repository: molecule-ai/molecule-ci
  [git config http.https://git.moleculesai.app/.extraheader AUTHORIZATION: basic ***]
  ::error::The target couldn't be found.
   Failure - Main actions/checkout@v4

Root cause: actions/checkout@v4 sends `Authorization: basic <github.token>` —
the per-job Gitea-issued token, scoped to the calling plugin/template repo
only. On Gitea, an authenticated request that lacks repo-permission 404s
instead of falling back to anonymous-public-read (a Gitea-vs-GitHub
behaviour difference). Anonymous git clone of molecule-ci succeeds; the auth
header is what trips the 404.

Fix: pass `token: ''` to force anonymous fetch on the cross-repo checkouts.
molecule-ci is public; no auth is needed for read.
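
A minimal sketch of a cross-repo checkout step with the fix applied, using the repository and path from the log above. The indentation assumes the step sits directly in a job's `steps:` list, and the comment restates the behaviour described in this commit message rather than anything verified separately:

```yaml
- uses: actions/checkout@v4
  with:
    repository: molecule-ai/molecule-ci
    path: .molecule-ci-canonical
    # Empty token: per the fix above, no per-job token is sent, so the
    # fetch of the public repo is anonymous.
    token: ''
```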

4 sites updated across 3 workflow files:
  * validate-plugin.yml (1 site)
  * validate-workspace-template.yml (2 sites — both jobs in the file)
  * validate-org-template.yml (1 site)

Verification: re-triggering plugin-molecule-careful-bash#2 should go GREEN
end-to-end once this lands. The 33 downstream lowercase-slug PRs will NOT be
mass-merged until that verification passes.

Refs: internal#46

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:23:37 -07:00

name: Validate Workspace Template
on:
  workflow_call:
# Defense-in-depth on the GITHUB_TOKEN scope. This workflow runs
# untrusted-by-design code from the calling template repo — pip
# installs the template's requirements.txt (post-install hooks),
# imports adapter.py, and `docker build`s the Dockerfile (RUN
# steps). Each of those primitives can execute arbitrary code with
# the token in env. Pinning `contents: read` means the worst a
# malicious template PR can do with the token is read public repo
# state — no write to issues, no push to branches, no comment-spam,
# no workflow re-trigger.
#
# Fork-PR lockdown (#135): the workflow splits into two jobs:
#
#   validate-static:  file-content checks only (secret scan, YAML
#                     parse, AST inspection of adapter.py without
#                     import). Always runs, including external fork
#                     PRs. Safe because no third-party code executes.
#
#   validate-runtime: pip install requirements.txt + import
#                     adapter.py + docker build. SKIPPED on fork
#                     PRs because each step is arbitrary code
#                     execution from the template repo's perspective.
#                     Internal PRs and post-merge runs still get
#                     the full coverage.
#
# What this prevents: a malicious external PR can no longer
# crypto-mine on the runner, DNS-exfiltrate runner metadata, or
# attempt to read GitHub-Actions internal env via a setup.py
# postinstall hook. They still get static feedback (secret scan
# is the most important security check anyway).
#
# What this does NOT prevent: malicious template metadata that
# passes static checks. The runtime job catches those once the PR
# merges (or an internal contributor reposts the change), at which
# point branch protection on staging/main blocks the merge if
# runtime validation fails.
permissions:
  contents: read
jobs:
  validate-static:
    name: Template validation (static)
    runs-on: ubuntu-latest
    timeout-minutes: 5
    steps:
      # Calling template repo (Dockerfile + config.yaml + adapter.py).
      - uses: actions/checkout@v4
      # Canonical validator script lives in molecule-ci, fetched fresh on
      # every run. The previous setup expected `.molecule-ci/scripts/` to
      # be vendored INTO each template repo, which drifted across the 8
      # template repos as the validator evolved. Single source of truth
      # eliminates that drift class entirely: every template runs the
      # same canonical contract check on every CI run.
      - uses: actions/checkout@v4
        with:
          repository: molecule-ai/molecule-ci
          path: .molecule-ci-canonical
          # Force anonymous; see validate-plugin.yml note. molecule-ci is public.
          token: ''
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      # Secret scan: the most important check. Always runs.
      - name: Check for secrets
        run: |
          python3 - << 'PYEOF'
          import os, re, sys
          from pathlib import Path
          PATTERNS = [
              re.compile(r'''["']sk-ant-[a-zA-Z0-9]{50,}["']'''),
              re.compile(r'''["']ghp_[a-zA-Z0-9]{36,}["']'''),
              re.compile(r'''["']AKIA[A-Z0-9]{16}["']'''),
              re.compile(r'''["'][a-zA-Z0-9/+=]{40}["']'''),
              re.compile(r'''["']sk_test_[a-zA-Z0-9]{24,}["']'''),
              re.compile(r'''["']Bearer\s+[a-zA-Z0-9_.-]{20,}["']'''),
              re.compile(r'''ghp_[a-zA-Z0-9]{36,}'''),
              re.compile(r'''sk-ant-[a-zA-Z0-9]{50,}'''),
          ]
          SKIP_DIRS = {'.molecule-ci', '.git', 'node_modules', '__pycache__'}
          EXTENSIONS = {'.yaml', '.yml', '.md', '.py', '.sh'}
          def is_false_positive(line):
              ctx = line.lower()
              return '...' in ctx or '<example' in ctx or '</example' in ctx
          root = Path(os.environ.get('GITHUB_WORKSPACE', '.'))
          warnings = []
          for dirpath, dirnames, filenames in os.walk(root):
              dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
              for filename in filenames:
                  if Path(filename).suffix not in EXTENSIONS:
                      continue
                  filepath = Path(dirpath) / filename
                  try:
                      with open(filepath, 'r', encoding='utf-8', errors='ignore') as f:
                          for lineno, line in enumerate(f.readlines(), 1):
                              for pattern in PATTERNS:
                                  for match in pattern.finditer(line):
                                      if not is_false_positive(line):
                                          warnings.append(f" {filepath}:{lineno}: {match.group(0)[:40]}...")
                  except Exception:
                      pass
          if warnings:
              print("::error::Potential secret found in committed files:")
              for w in warnings:
                  print(w)
              sys.exit(1)
          else:
              print("::notice::No secrets detected")
          PYEOF
      # Static-only validator: file existence checks, YAML parse,
      # AST inspection of adapter.py (no import). Doesn't execute
      # any third-party code; safe on fork PRs.
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci-canonical/scripts/validate-workspace-template.py --static-only
  validate-runtime:
    name: Template validation (runtime)
    runs-on: ubuntu-latest
    timeout-minutes: 15
    needs: validate-static
    # Skip when the PR comes from a fork: those are external,
    # untrusted, and would let attackers run pip install / docker
    # build / adapter.py import on our runner. Internal PRs (head
    # repo == base repo, fork == false) and push events to internal
    # branches both keep full coverage.
    #
    # github.event.pull_request.head.repo.fork is null for non-PR
    # events (push, schedule, etc.), so the condition defaults to running.
    if: github.event.pull_request.head.repo.fork != true
    steps:
      - uses: actions/checkout@v4
      - uses: actions/checkout@v4
        with:
          repository: molecule-ai/molecule-ci
          path: .molecule-ci-canonical
          # Force anonymous; see validate-plugin.yml note. molecule-ci is public.
          token: ''
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          # Cache pip against the calling repo's own requirements.txt
          # (the file we install one step below). Pointing the cache key
          # at the validator's own deps was decorative: pyyaml never
          # changes, so the key never invalidated even when the template
          # added a heavy dep like crewai.
          cache: "pip"
          cache-dependency-path: requirements.txt
      - run: pip install pyyaml -q
      # Install the template's runtime dependencies so the validator's
      # `check_adapter_runtime_load()` can import adapter.py the same way
      # the workspace container does at boot. Without this, a
      # syntactically-valid adapter that ImportErrors on a missing
      # transitive dep would build clean and crash on first user prompt.
      # The fallback (no requirements.txt) installs the runtime alone so
      # BaseAdapter is at least importable for the class-discovery check.
      - if: hashFiles('requirements.txt') != ''
        run: pip install -q -r requirements.txt
      - if: hashFiles('requirements.txt') == ''
        run: pip install -q molecule-ai-workspace-runtime
      # Full validator, including the adapter.py import (exec_module).
      - run: python3 .molecule-ci-canonical/scripts/validate-workspace-template.py
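      - name: Docker build smoke test
        if: hashFiles('Dockerfile') != ''
        # Explicit pipefail: without it, `tail`'s exit status can mask a failed
        # `docker build` when the runner's default shell does not enable pipefail.
        run: set -o pipefail; docker build -t template-test . --no-cache 2>&1 | tail -5 && echo "✓ Docker build succeeded"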
  # Aggregator that emits a single `Template validation` check name:
  # the caller's job (`validate:` in each template's ci.yml) plus this
  # job's name produces `validate / Template validation`, which is what
  # template-repo branch protection has historically required.
  #
  # Why it's needed: the workflow was refactored from one job into
  # validate-static + validate-runtime (with matrix-suffixed display
  # names) for fork-PR security. The matrix names never match the
  # original required-check name, so PR auto-merge silently hung in
  # BLOCKED forever on every template repo (caught while shipping
  # fixes for the boot-smoke gate, openclaw#11 + hermes#29).
  #
  # `if: always()` so it reports out even when validate-static fails;
  # without that, GitHub marks the aggregator as SKIPPED and branch
  # protection still blocks because the required check never reports
  # a final state.
  #
  # Fork-PR semantics: validate-runtime is intentionally skipped on
  # fork PRs (security gate). Treat `skipped` as a pass for the
  # aggregator on forks so static-only coverage doesn't make every
  # external PR un-mergeable.
  template-validation:
    name: Template validation
    runs-on: ubuntu-latest
    needs: [validate-static, validate-runtime]
    if: always()
    timeout-minutes: 1
    steps:
      - name: Aggregate
        run: |
          static="${{ needs.validate-static.result }}"
          runtime="${{ needs.validate-runtime.result }}"
          echo "validate-static: $static"
          echo "validate-runtime: $runtime"
          if [ "$static" != "success" ]; then
            echo "::error::validate-static did not succeed: $static"
            exit 1
          fi
          if [ "$runtime" != "success" ] && [ "$runtime" != "skipped" ]; then
            echo "::error::validate-runtime did not succeed: $runtime"
            exit 1
          fi
          echo "::notice::Template validation aggregate passed (static=$static, runtime=$runtime)"
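
For context, a calling template repo's ci.yml might look roughly like the sketch below. This is an illustration under assumptions, not a copy of any template repo: the triggers, the branch name, and the `@main` ref are hypothetical. Only the `validate` job id and the reusable-workflow path come from the comments and file path above.

```yaml
# Hypothetical ci.yml in a template repo (triggers and ref are assumed).
name: CI
on:
  pull_request:
  push:
    branches: [main]
jobs:
  # The job id `validate` plus the aggregator job's `name:` yields the
  # check name `validate / Template validation`.
  validate:
    uses: molecule-ai/molecule-ci/.github/workflows/validate-workspace-template.yml@main
```

With a caller shaped like this, branch protection would only need to require the single aggregated check rather than the per-job static and runtime checks.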