Anti-leak proposal item A. Companion to D (decision tree in role
prompts, separate PR on org-templates).
Why a local pre-commit hook
===========================
Agents try to `git add /research/foo.md` despite SHARED_RULES, the
.gitignore patterns, and the CI gate. Each leak attempt costs ~5 cycles
(PR opens, CI fails, agent retries with a workaround) and pollutes git
history with reverts.

A pre-commit hook moves the failure from "PR opens, then fails" to
"commit refused immediately, with the recovery command printed in the
same error message the agent reads." Agents act on what is in the
current response context, so putting the redirect command literally in
the failure output is the highest-density feedback we can provide.

What changes
============
- molecule_runtime/scripts/pre-commit-block-internal-paths.sh:
  bash hook. Checks `git remote get-url origin` and only enforces in
  Molecule-AI/molecule-monorepo and Molecule-AI/molecule-core. In every
  other repo (internal, plugins, templates, third-party) it is a no-op.
  When forbidden paths are staged, it refuses the commit and prints the
  redirect recipe, the alternative public-facing paths, and the
  workflow-edit path for legitimate exceptions.
- molecule_runtime/precommit_hook.py: install_pre_commit_hook():
  1. Extracts the bundled hook to ~/.molecule-runtime/git-hooks/pre-commit
  2. chmod +x
  3. Sets core.hooksPath globally, UNLESS it is already set by an
     operator (then it logs a warning and skips; it does not clobber)
- molecule_runtime/main.py: calls install_pre_commit_hook() at
  step 0.2, right after install_credential_helper()
- pyproject.toml bumped to 0.1.11
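
The three installer steps listed above could look roughly like the sketch below. This is not the code in molecule_runtime/precommit_hook.py: the function name `install_pre_commit_hook_sketch`, its parameters, and the injectable `get_config` / `set_config` callables are hypothetical, added so the "don't clobber an operator's core.hooksPath" branch can be exercised without touching the real global git config.

```python
import logging
import subprocess
from pathlib import Path
from typing import Callable, Optional

log = logging.getLogger("precommit_hook_sketch")


def _git_global_get(key: str) -> Optional[str]:
    """Return the global git config value for `key`, or None if unset."""
    proc = subprocess.run(
        ["git", "config", "--global", "--get", key],
        capture_output=True, text=True, check=False,
    )
    value = proc.stdout.strip()
    return value if proc.returncode == 0 and value else None


def _git_global_set(key: str, value: str) -> None:
    """Set `key` in the global git config."""
    subprocess.run(["git", "config", "--global", key, value], check=True)


def install_pre_commit_hook_sketch(
    hook_text: str,
    hooks_dir: Path,
    get_config: Callable[[str], Optional[str]] = _git_global_get,
    set_config: Callable[[str, str], None] = _git_global_set,
) -> bool:
    """Write the hook, make it executable, and point core.hooksPath at it.

    Returns True if core.hooksPath points at `hooks_dir` afterwards, and
    False if a different, pre-existing value was left untouched.
    """
    # Step 1: extract the bundled hook text to <hooks_dir>/pre-commit.
    hooks_dir.mkdir(parents=True, exist_ok=True)
    hook_path = hooks_dir / "pre-commit"
    hook_path.write_text(hook_text)

    # Step 2: chmod +x.
    hook_path.chmod(0o755)

    # Step 3: set core.hooksPath unless someone else already set it.
    existing = get_config("core.hooksPath")
    if existing and Path(existing).expanduser() != hooks_dir:
        log.warning("core.hooksPath already set to %s; not overriding", existing)
        return False
    set_config("core.hooksPath", str(hooks_dir))
    return True
```

Returning a boolean rather than raising keeps startup non-fatal when an operator has their own hooks directory, which matches the "logs a warning and skips" behavior described in step 3.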

A and D together close the loop: D ensures the agent knows the
right path before writing; A enforces it at the local git boundary if
the agent forgets. The CI gate remains the third backstop for anything
that gets pushed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

molecule_runtime/scripts/pre-commit-block-internal-paths.sh (Bash, 111 lines, 3.9 KiB)

#!/bin/bash
# pre-commit hook: refuse commits that add internal-flavored paths to the
# public monorepo. Only enforces in the Molecule-AI public repos; no-op in
# every other repo (including the canonical internal one) so agents can
# still write `research/foo.md` inside `Molecule-AI/internal`.
#
# Why this hook exists
# ====================
#
# Despite SHARED_RULES.md, .gitignore, and a CI gate, agents still try to
# `git add /research/...` from their cwd in `molecule-monorepo`. Each leak
# attempt costs ~5 cycles (PR opens, CI fails, agent retries with
# workaround) and pollutes git history with reverts. This hook converts
# the failure mode from "PR fails" → "commit refused at the agent's local
# git": instant feedback with the redirect command in the same error
# message.
#
# Installed via `core.hooksPath` set by molecule_runtime.precommit_hook
# at workspace startup.

set -e

# Skip silently when GIT_AUTHOR_NAME and GIT_COMMITTER_NAME are both unset:
# likely a non-agent context (operator manually running git inside the
# container for debug). Agents always have the provisioner-set
# GIT_AUTHOR_NAME.
if [ -z "${GIT_AUTHOR_NAME:-}${GIT_COMMITTER_NAME:-}" ]; then
  exit 0
fi

# Determine if we're in a public Molecule-AI repo. `git remote get-url`
# fails in repos without an `origin` remote; fall back to an empty string
# so those repos exit cleanly below.
REMOTE=$(git remote get-url origin 2>/dev/null || echo "")

case "$REMOTE" in
  *Molecule-AI/molecule-monorepo*|*Molecule-AI/molecule-core*)
    # Continue: this is a public repo we enforce on.
    ;;
  *)
    # Non-target repo (internal, plugins, templates, third-party): let it through.
    exit 0
    ;;
esac

# Files added or modified in this commit. --diff-filter=AM excludes
# deletions so cleanup commits don't trip the gate.
STAGED=$(git diff --cached --name-only --diff-filter=AM)
[ -z "$STAGED" ] && exit 0

FORBIDDEN_PATTERNS=(
  "^research/"
  "^marketing/"
  "^docs/marketing/"
  "^comment-[0-9]+\.json$"
  "^test-pmm.*\.(txt|md)$"
  "^tick-reflections.*\.(txt|md)$"
  ".*-temp\.(md|txt)$"
)

OFFENDING=""
# Read one path per line so file names containing spaces are not split.
while IFS= read -r path; do
  for pattern in "${FORBIDDEN_PATTERNS[@]}"; do
    if printf '%s\n' "$path" | grep -qE -- "$pattern"; then
      OFFENDING="${OFFENDING} - ${path} (matched: ${pattern})\n"
      break
    fi
  done
done <<< "$STAGED"

[ -z "$OFFENDING" ] && exit 0

# Refuse the commit with the redirect instructions in the same message.
{
  echo
  echo "Refusing commit: internal-flavored paths cannot live in the public monorepo."
  echo
  echo "Offending files:"
  # %b expands the "\n" separators without using path names as a format string.
  printf '%b' "$OFFENDING"
  echo
  echo "These belong in Molecule-AI/internal. Redirect:"
  echo
  echo " mkdir -p ~/repos"
  echo " test -d ~/repos/internal || gh repo clone Molecule-AI/internal ~/repos/internal"
  echo " cd ~/repos/internal"
  echo " git pull origin main"
  echo " git checkout -b <my-role>/<topic>-<date>"
  echo " mkdir -p <area> # research, marketing, runbooks, etc."
  echo " # move your file from the monorepo into <area>/<slug>.md"
  echo " git add <area>/<slug>.md"
  echo " git commit -m '<area>: add <slug>'"
  echo " git push -u origin HEAD"
  echo " gh pr create --base main --fill"
  echo
  echo "If your file is genuinely public-facing (final blog post, public"
  echo "tutorial, customer-shippable doc), use one of these monorepo paths"
  echo "instead (these are not blocked):"
  echo " - docs/blog/<slug>.md"
  echo " - docs/tutorials/<slug>.md"
  echo " - docs/devrel/<slug>.md"
  echo " - docs/api/<slug>.md"
  echo
  echo "If you legitimately need a new top-level path that matches a"
  echo "forbidden pattern, edit:"
  echo " .github/workflows/block-internal-paths.yml"
  echo "with reviewer signoff and a public-facing justification. Do NOT"
  echo "work around the gate by renaming."
  echo
  echo "Hook source: molecule_runtime/scripts/pre-commit-block-internal-paths.sh"
} >&2

exit 1
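
For unit-testing the hook's decisions without spawning git, the remote check and the pattern loop can be mirrored in a few lines of Python. This is only an illustrative sketch: the helper names `is_enforced_remote` and `offending_paths` are hypothetical and not part of molecule_runtime. The pattern list is copied from the script above, and it assumes Python's `re.search` treats these particular expressions the same way `grep -E` does.

```python
import re
from typing import List, Tuple

# Copied from FORBIDDEN_PATTERNS in the hook; keep the two lists in sync.
FORBIDDEN_PATTERNS = [
    r"^research/",
    r"^marketing/",
    r"^docs/marketing/",
    r"^comment-[0-9]+\.json$",
    r"^test-pmm.*\.(txt|md)$",
    r"^tick-reflections.*\.(txt|md)$",
    r".*-temp\.(md|txt)$",
]

# Substrings the hook's `case "$REMOTE"` globs look for.
ENFORCED_REPOS = ("Molecule-AI/molecule-monorepo", "Molecule-AI/molecule-core")


def is_enforced_remote(remote_url: str) -> bool:
    """True when the origin URL names one of the public repos the hook guards."""
    return any(repo in remote_url for repo in ENFORCED_REPOS)


def offending_paths(staged: List[str]) -> List[Tuple[str, str]]:
    """Return (path, first matching pattern) for every forbidden staged path."""
    hits = []
    for path in staged:
        for pattern in FORBIDDEN_PATTERNS:
            if re.search(pattern, path):
                hits.append((path, pattern))
                break  # report only the first match, like the hook does
    return hits


if __name__ == "__main__":
    for path, pattern in offending_paths(["research/foo.md", "docs/blog/post.md"]):
        print(f"{path} (matched: {pattern})")
```

Because the patterns are anchored at the repository root, a path such as `docs/research/foo.md` is not flagged; only top-level `research/` is.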