Research on garrytan/gstack surfaced 5 patterns worth importing into our cron / agent setup. These are skills, not platform code: they guide how the cron and our own subagents work, not what the platform does at runtime.

## New skills

1. **cross-vendor-review**: adversarial second-model review for noteworthy PRs (auth, billing, data deletion, migrations). Catches the 15-30% of bugs single-model review misses. Inspired by gstack's /codex.
2. **careful-mode**: REFUSE/WARN/ALLOW lists for destructive commands. Refuses force-push to main, blocks merging draft PRs, prevents rm -rf outside scratch dirs. Inspired by gstack's /careful + /freeze.
3. **cron-learnings**: per-project JSONL of operational learnings appended at the end of every tick, replayed at the start of the next. Stops the cron from re-litigating decided issues. Inspired by gstack's /learn.
4. **cron-retro**: weekly retrospective auto-posted as a GitHub issue. Sunday 23:07 local. Tracks PR count, time-to-merge, gate failure trends, code-review severity over time. Inspired by gstack's /retro.
5. **llm-judge**: cheap LLM-as-judge eval to catch "agent shipped the wrong thing", the failure mode unit tests miss. Plug into issue-pickup pipeline so worker-agent draft PRs get scored before being marked ready. Inspired by gstack's tier-3 test infra.
## Cron updates (session-only, c5074cd5 + 060d136c)

- Hourly triage cron now opens with careful-mode activation + cron-learnings replay (Step 0)
- code-review skill on every PR being considered for merge (Step 2 supplement A, already present, now formalized)
- cross-vendor-review on noteworthy PRs (Step 2 supplement B, new)
- llm-judge on issue-pickup draft PRs before marking ready (Step 4)
- Status report now includes cross-vendor pass/fail and llm-judge scores (Step 5)
- End-of-tick cron-learnings append (Step 5)
- New weekly cron at Sun 23:07 invokes the cron-retro skill

## What we did NOT take from gstack

- Their browser fork: not our product
- The 23 named roles: we have agent role templates already
- Bun toolchain: adds yet another runtime to our stack
- /design-shotgun and design-tool variants: we're not a design tool
- /document-release: our update-docs skill already covers this

See PR description for full research notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
| name | description |
|---|---|
| careful-mode | Refuse or warn before destructive irreversible commands (rm -rf, force push, DROP TABLE, gh pr close, gh issue close, mass DELETE). Inspired by gstack's /careful and /freeze. Activate at the start of any cron tick or when about to write to shared resources. |
careful-mode
Cron has merge authority + commit authority. That is enough rope to do permanent damage. This skill is the seatbelt.
Activate when
- The hourly cron tick starts
- About to call `gh pr merge` / `gh pr close` / `gh issue close`
- About to push to a branch other than your own draft
- About to run `git push --force` for any reason
- About to run `rm -rf` on anything inside the repo
- About to issue `DROP TABLE` / `TRUNCATE` / `DELETE FROM ... WHERE` without a known small WHERE
Categories
REFUSE — hard stop
- `git push --force` to `main`, `master`, or any protected branch
- `gh pr merge` on a PR that:
  - has CI failing
  - has `state: draft`
  - has unresolved review comments from a non-bot author
  - was created in the same conversation context (need 1 tick of distance)
- `git reset --hard` against a branch that has commits I haven't seen pushed to a remote
- `rm -rf` against any path matching `**/migrations/**`, `.git/`, `~/.molecule/`, or repo root
- `DROP TABLE`, `TRUNCATE TABLE` against any table in the molecule schema
- `DELETE FROM workspaces` without a `WHERE id = $known_uuid` clause
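The four `gh pr merge` conditions above can be sketched as a single gate check. This is a minimal illustration, not part of the skill itself: the `PullRequest` dataclass and its field names are hypothetical stand-ins for whatever PR metadata the agent has already fetched.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class PullRequest:
    """Hypothetical snapshot of the PR fields the REFUSE rules look at."""
    ci_passing: bool
    is_draft: bool
    unresolved_human_comments: int
    created_this_tick: bool


def merge_refusal_reasons(pr: PullRequest) -> list[str]:
    """Return every REFUSE reason that applies; an empty list means no hard stop."""
    reasons = []
    if not pr.ci_passing:
        reasons.append("CI is failing")
    if pr.is_draft:
        reasons.append("PR is still a draft")
    if pr.unresolved_human_comments > 0:
        reasons.append("unresolved review comments from a non-bot author")
    if pr.created_this_tick:
        reasons.append("PR was created in this conversation context; wait one tick")
    return reasons
```

Collecting all reasons, rather than returning on the first hit, lets the agent state every blocking condition when it declines the merge.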
WARN — proceed only with explicit confirmation in the prompt
- `gh pr close` on a PR not authored by me
- `gh issue close` on any issue
- `git push --force-with-lease` (safer than `--force`, still requires care)
- `rm -rf node_modules / dist /` (safe, but worth a one-line "yes I meant this")
- `chmod -R` on anything outside the current PR's diff
- Mass curl-DELETE loops over `/workspaces` (the cleanup-rogue-workspaces.sh pattern is OK but document the prefix)
ALLOW
- Anything against `/tmp/`, the agent's own scratch dir, or test artifacts
- Reads of any kind
- Standard merges via `gh pr merge --merge --delete-branch` once the gates pass
- Single-row updates / deletes with explicit WHERE on a known-uuid
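To make the three categories concrete, here is a rough sketch of how a command string might be sorted into REFUSE, WARN, or ALLOW. It encodes only a subset of the lists above, and several choices are assumptions rather than rules from this skill: a plain `--force` push to an unprotected branch and an uncategorized `rm -rf` target are both treated as WARN, "protected branch" is reduced to `main`/`master`, the working directory is assumed to be the repo root, and anything unmatched falls through to ALLOW.

```python
import shlex
from pathlib import PurePosixPath

PROTECTED_BRANCHES = {"main", "master"}  # assumption: other protected branches not modeled
REFUSE_PATH_PARTS = {"migrations", ".git", ".molecule"}


def classify_rm(targets):
    """Classify the path arguments of an `rm -rf` call."""
    for target in targets:
        if target in (".", "./"):  # assumption: cwd is the repo root
            return "REFUSE"
        if REFUSE_PATH_PARTS & set(PurePosixPath(target).parts):
            return "REFUSE"
    if targets and all(t.startswith("/tmp/") for t in targets):
        return "ALLOW"
    return "WARN"  # assumption: uncategorized targets get a confirmation prompt


def classify(command):
    """Return "REFUSE", "WARN", or "ALLOW" for a shell or SQL command string."""
    tokens = shlex.split(command)
    if tokens[:2] == ["git", "push"]:
        args = tokens[2:]
        if "--force" in args or "-f" in args:
            return "REFUSE" if PROTECTED_BRANCHES & set(args) else "WARN"
        if "--force-with-lease" in args:
            return "WARN"
        return "ALLOW"
    if tokens[:1] == ["rm"] and ("-rf" in tokens or "-fr" in tokens):
        return classify_rm([t for t in tokens[1:] if not t.startswith("-")])
    if tokens[:3] in (["gh", "pr", "close"], ["gh", "issue", "close"]):
        return "WARN"  # PR authorship is not checked in this sketch
    if [t.upper() for t in tokens[:2]] in (["DROP", "TABLE"], ["TRUNCATE", "TABLE"]):
        return "REFUSE"  # the schema is not inspected in this sketch
    return "ALLOW"
```

Checking REFUSE conditions before WARN conditions within each branch means the strictest applicable verdict wins. Since the skill is enforced by the agent rather than the harness, a helper like this would serve only as a reference for the agent's own reasoning.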
Freeze mode
When debugging a tricky issue, lock edits to one directory. Example invocation:
careful-mode freeze platform/internal/handlers/
# now any Edit/Write outside that path refuses
careful-mode unfreeze
This is conceptually like gstack's /freeze — prevents accidental scope creep when an agent is spelunking.
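The freeze check boils down to a path-containment test. The sketch below is one possible way to express it; the `FreezeGuard` class and its method names are hypothetical, and paths are normalized first so that `..` segments cannot escape the frozen directory.

```python
import posixpath
from pathlib import PurePosixPath
from typing import Optional


class FreezeGuard:
    """Hypothetical helper: tracks one frozen directory and vets edit paths."""

    def __init__(self) -> None:
        self.frozen_dir: Optional[PurePosixPath] = None

    def freeze(self, directory: str) -> None:
        self.frozen_dir = PurePosixPath(posixpath.normpath(directory))

    def unfreeze(self) -> None:
        self.frozen_dir = None

    def allows_edit(self, path: str) -> bool:
        """True if no freeze is active or the path lies inside the frozen dir."""
        if self.frozen_dir is None:
            return True
        target = PurePosixPath(posixpath.normpath(path))
        return target == self.frozen_dir or self.frozen_dir in target.parents
```

Comparing against `target.parents` instead of using a string prefix avoids treating a sibling such as `handlers_old/` as if it were inside `handlers/`.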
How to honor this skill
The skill is enforced by the AGENT, not by the harness. When making a tool call that lands in the REFUSE / WARN list, the agent must:
- Stop
- State the exact command + which list it falls under
- Explain why this case is or isn't safe
- For WARN, ask for explicit user confirmation
- For REFUSE, decline and propose a safer alternative
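The steps above could be captured in a small formatter so that every stop message has the same shape. This is only an illustrative sketch: the `stop_report` function, its parameters, and the message wording are assumptions, not a format the skill prescribes.

```python
from typing import Optional


def stop_report(command: str, verdict: str, rationale: str,
                safer_alternative: Optional[str] = None) -> str:
    """Build the message an agent emits before a REFUSE or WARN command."""
    if verdict not in ("REFUSE", "WARN"):
        raise ValueError("stop reports only apply to REFUSE or WARN commands")
    if verdict == "REFUSE" and not safer_alternative:
        raise ValueError("a REFUSE report must propose a safer alternative")
    lines = [
        "STOP",
        f"Command: {command}",
        f"List: {verdict}",
        f"Why: {rationale}",
    ]
    if verdict == "WARN":
        lines.append("Confirm explicitly to proceed.")
    else:
        lines.append(f"Declined. Safer alternative: {safer_alternative}")
    return "\n".join(lines)
```

For example, `stop_report("gh issue close 42", "WARN", "closing an issue needs confirmation")` yields a five-line message ending in the confirmation request.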
Why this exists
The cron has merge authority. gstack documented several near-misses where Claude wiped working directories or force-pushed to main. We avoid those by making the rules explicit and machine-readable, applied at the start of every tick.