Research on garrytan/gstack surfaced 5 patterns worth importing into our cron / agent setup. These are skills, not platform code: they guide how the cron and our own subagents work, not what the platform does at runtime.

## New skills

1. **cross-vendor-review**: adversarial second-model review for noteworthy PRs (auth, billing, data deletion, migrations). Catches the 15-30% of bugs single-model review misses. Inspired by gstack's /codex.
2. **careful-mode**: REFUSE/WARN/ALLOW lists for destructive commands. Refuses force-push to main, blocks merging draft PRs, prevents rm -rf outside scratch dirs. Inspired by gstack's /careful + /freeze.
3. **cron-learnings**: per-project JSONL of operational learnings appended at the end of every tick, replayed at the start of the next. Stops the cron from re-litigating decided issues. Inspired by gstack's /learn.
4. **cron-retro**: weekly retrospective auto-posted as a GitHub issue. Sunday 23:07 local. Tracks PR count, time-to-merge, gate failure trends, code-review severity over time. Inspired by gstack's /retro.
5. **llm-judge**: cheap LLM-as-judge eval to catch "agent shipped the wrong thing", the failure mode unit tests miss. Plug into issue-pickup pipeline so worker-agent draft PRs get scored before being marked ready. Inspired by gstack's tier-3 test infra.

## Cron updates (session-only, c5074cd5 + 060d136c)

- Hourly triage cron now opens with careful-mode activation + cron-learnings replay (Step 0)
- code-review skill on every PR being considered for merge (Step 2 supplement A: already present, formalized)
- cross-vendor-review on noteworthy PRs (Step 2 supplement B: new)
- llm-judge on issue-pickup draft PRs before marking ready (Step 4)
- Status report now includes cross-vendor pass/fail and llm-judge scores (Step 5)
- End-of-tick cron-learnings append (Step 5)
- New weekly cron at Sun 23:07 invokes the cron-retro skill

## What we did NOT take from gstack

- Their browser fork: not our product
- The 23 named roles: we have agent role templates already
- Bun toolchain: adds yet another runtime to our stack
- /design-shotgun and design-tool variants: we're not a design tool
- /document-release: our update-docs skill already covers this

See PR description for full research notes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---
name: cron-learnings
description: At the end of every cron tick, append 1-3 lines of operational learnings (what worked, what surprised, what should change next tick) to a per-project JSONL. Replay at start of next tick. Inspired by gstack's /learn skill.
---

# cron-learnings

Each tick, the cron does a lot of work. Half the lessons are forgotten by the next tick. This skill is the compounding layer.

## Storage

Per-project file at:
```
~/.claude/projects/<sanitized-project-path>/memory/cron-learnings.jsonl
```

For molecule-monorepo, that's:
```
~/.claude/projects/-Users-hongming-Documents-GitHub-molecule-monorepo/memory/cron-learnings.jsonl
```
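
The molecule-monorepo path above is consistent with replacing every `/` in the absolute project path with `-`. The following is a minimal sketch under that assumption. The helper name `learnings_path` is hypothetical, and the real sanitization rule may treat other characters differently.

```python
from pathlib import Path


def learnings_path(project_dir: str, home: str = "~") -> Path:
    """Return the per-project cron-learnings file for an absolute project path.

    Assumption: "sanitized" means every "/" becomes "-". This matches the
    molecule-monorepo example but is not confirmed as the complete rule.
    """
    sanitized = project_dir.replace("/", "-")
    return (
        Path(home).expanduser()
        / ".claude" / "projects" / sanitized / "memory" / "cron-learnings.jsonl"
    )
```

Calling `learnings_path("/Users/hongming/Documents/GitHub/molecule-monorepo")` reproduces the molecule-monorepo path shown above, with `~` expanded to the current user's home directory.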

One JSON object per line:
```json
{"ts": "2026-04-14T05:17:00Z", "tick_id": "5939aa3f-001", "category": "gate-fail", "summary": "Gate 4 (security) flagged token!=secret in PR #28; requireInternalAPISecret needs subtle.ConstantTimeCompare", "next_action": "When reviewing auth-gate code, grep for `subtle.ConstantTimeCompare`. Flag plain == on tokens."}
```

Categories:
- `gate-fail`: a verification gate caught something
- `mechanical-fix`: fixed a gate failure on-branch
- `false-positive`: a code-review finding turned out to be wrong; record so we don't keep flagging it
- `tool-error`: an MCP tool / CLI flaked; note the workaround
- `repo-state`: something about the repo's state that next tick should know
- `pattern`: a cross-PR pattern worth remembering (e.g., "every cron loop adds itself as `noreply@anthropic.com`; reviewers OK with it")
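
As an illustration of the record shape and category list above, here is a small sketch that builds one JSONL line. The function name `make_record` and the choice to reject unknown categories are assumptions, not part of the skill itself.

```python
import json
from datetime import datetime, timezone
from typing import Optional

# Category names copied from the list above.
CATEGORIES = {
    "gate-fail", "mechanical-fix", "false-positive",
    "tool-error", "repo-state", "pattern",
}


def make_record(tick_id: str, category: str, summary: str, next_action: str,
                now: Optional[datetime] = None) -> str:
    """Serialize one learning as a single JSONL line (no trailing newline)."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category!r}")
    moment = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    record = {
        "ts": moment.strftime("%Y-%m-%dT%H:%M:%SZ"),
        "tick_id": tick_id,
        "category": category,
        "summary": summary,
        "next_action": next_action,
    }
    # json.dumps escapes newlines and non-ASCII characters by default,
    # so the result is always one ASCII line.
    return json.dumps(record)
```

Rejecting unknown categories keeps the file greppable by category, at the cost of requiring a code change whenever a new category is introduced.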

## When to write

Write at the end of every cron tick (Step 5 of the cron prompt). Keep it terse: 1-3 lines max.
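
A minimal sketch of the end-of-tick append, assuming the lines are already serialized JSON strings. The name `append_learnings` is hypothetical, and treating an empty list as a no-op (for ticks with nothing worth logging) is an assumption.

```python
from pathlib import Path

MAX_LINES_PER_TICK = 3  # the "1-3 lines max" rule


def append_learnings(path: Path, lines: list[str]) -> None:
    """Append this tick's already-serialized JSONL lines to the learnings file."""
    if len(lines) > MAX_LINES_PER_TICK:
        raise ValueError(f"at most {MAX_LINES_PER_TICK} lines per tick, got {len(lines)}")
    if not lines:
        return  # assumption: a tick with nothing to record writes nothing
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as f:
        for line in lines:
            # Normalize to exactly one trailing newline per record.
            f.write(line.rstrip("\n") + "\n")
```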

## When to read

Start of every cron tick. Read the last 20 lines (most recent first) before Step 1. Use them to:
- Skip false-positive paths the previous tick flagged
- Apply learned patterns (e.g., "PR #28 found INTERNAL_API_SECRET missing from .env.example; when reviewing future security PRs, always check .env.example sync as a first move")
- Avoid re-litigating decided design choices
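
The replay step can be sketched as follows: read the tail of the file and return the newest records first. The name `replay_learnings` is hypothetical, and skipping unparseable lines instead of failing the tick is an assumption.

```python
import json
from collections import deque
from pathlib import Path


def replay_learnings(path: Path, limit: int = 20) -> list[dict]:
    """Return up to `limit` of the newest records, most recent first."""
    if not path.exists():
        return []  # first tick for this project: nothing to replay
    with path.open(encoding="utf-8") as f:
        tail = deque(f, maxlen=limit)  # keeps only the last `limit` lines
    records = []
    for line in reversed(tail):
        line = line.strip()
        if not line:
            continue
        try:
            records.append(json.loads(line))
        except json.JSONDecodeError:
            continue  # assumption: skip a corrupt line rather than abort
    return records
```

Using a bounded `deque` means the whole file is streamed once but never held in memory, which matters little at 500 lines but keeps the function safe if pruning ever falls behind.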

## Pruning

Cap the file at 500 lines. Once it exceeds the cap, the next write also drops the oldest 100 lines. The point is recent operational memory, not an audit log.
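
A sketch of the pruning rule, meant to run as part of a write. The name `prune_if_needed` is hypothetical, and writing to a temporary file before renaming is a design choice of this sketch, not something the skill specifies.

```python
from pathlib import Path

MAX_LINES = 500   # cap from the rule above
DROP_COUNT = 100  # how many of the oldest lines to drop when over the cap


def prune_if_needed(path: Path) -> int:
    """Drop the oldest DROP_COUNT lines if the file exceeds MAX_LINES.

    Returns the number of lines dropped (0 or DROP_COUNT).
    """
    if not path.exists():
        return 0
    lines = path.read_text(encoding="utf-8").splitlines(keepends=True)
    if len(lines) <= MAX_LINES:
        return 0
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text("".join(lines[DROP_COUNT:]), encoding="utf-8")
    tmp.replace(path)  # swap the pruned file in with a single rename
    return DROP_COUNT
```

Dropping 100 lines at once, instead of one per write, means the rewrite happens roughly once every 100 appended lines.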

## Format discipline

- One line per event
- ASCII-only for grep-friendliness
- No PII, no tokens, no URLs with auth
- `summary` is what HAPPENED; `next_action` is what FUTURE-YOU should DO
- If you can't think of a concrete next_action, it's not worth logging
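
Some of these rules can be checked mechanically before a line is written. The sketch below is illustrative only: `check_line` is a hypothetical helper, and the two secret patterns are assumptions that catch only obvious cases (credentials embedded in a URL, GitHub-style token prefixes). PII detection is out of its scope.

```python
import json
import re

# Assumed patterns for obvious secrets; a real check would need a broader list.
SECRET_PATTERNS = [
    re.compile(r"https?://[^/\s:@]+:[^/\s@]+@"),  # URL with user:password@
    re.compile(r"\bgh[pousr]_[A-Za-z0-9]{20,}"),   # GitHub-style token prefix
]


def check_line(line: str) -> list[str]:
    """Return the format-discipline violations found in one JSONL line."""
    problems = []
    if "\n" in line.rstrip("\n"):
        problems.append("more than one line")
    if not line.isascii():
        problems.append("non-ASCII characters")
    if any(p.search(line) for p in SECRET_PATTERNS):
        problems.append("possible token or authenticated URL")
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return problems + ["not valid JSON"]
    if not isinstance(record, dict):
        return problems + ["not a JSON object"]
    if not str(record.get("next_action", "")).strip():
        problems.append("empty next_action")
    return problems
```

An empty result means the line passed these checks, not that it is free of sensitive data.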

## Why this exists

gstack's `/learn` showed that AI sessions repeatedly make the same mistakes because the lessons live only in the conversation that produced them. Writing them to disk lets every tick start with the accumulated wisdom of every prior tick, at zero cost. The awareness MCP we have is fine for cross-session human/agent memory; this file is specifically for the cron's own automation.