Addresses PR #82 code review: 🟡×3 + 🔵×5.
- Fly registry login username: 'x' → 'molecule-ai' + explanatory comment.
- Build & push split into two steps (GHCR / Fly registry) so a single-
registry outage can't fail the other. Second step uses 'if: always()'
to ensure Fly mirror runs even if GHCR push flakes.
- docs/runbooks/saas-secrets.md: full secret map + rotation procedures
for every SaaS credential, with danger-case callouts. Documents the
coupled FLY_API_TOKEN (lives in GHA secret AND fly secrets — must be
rotated in both).
- CLAUDE.md: new 'SaaS ops' section linking to the runbook.
Keeps ghcr.io/molecule-ai/platform private (per CEO direction — open-
source when full SaaS ships) while still letting the private control
plane's Fly provisioner boot tenant machines: Fly auto-authenticates
same-org machines against registry.fly.io, no per-tenant pull
credentials to wire.
Workflow now logs into both GHCR (using built-in GITHUB_TOKEN) and
Fly registry (using FLY_API_TOKEN secret) and pushes the same image to
four tags total:
- ghcr.io/molecule-ai/platform:latest
- ghcr.io/molecule-ai/platform:sha-<short>
- registry.fly.io/molecule-tenant:latest
- registry.fly.io/molecule-tenant:sha-<short>
Secret added via `gh secret set FLY_API_TOKEN` on the public repo.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase B.2 companion to the private molecule-controlplane provisioner PR.
On every push to main that touches platform/**, builds platform/Dockerfile
and pushes to GHCR with two tags:
- :latest (floating, always main's tip)
- :sha-<short-commit> (immutable, pin-friendly)
Cache via GitHub Actions cache (cache-from: type=gha). Workflow_dispatch
trigger so we can re-publish after a docs-only merge if needed.
The private molecule-controlplane sets TENANT_IMAGE=ghcr.io/molecule-ai/platform:<tag>
and the provisioner creates each tenant Fly Machine from this image. Staying
on the same base image across tenants keeps upgrades atomic.
CLAUDE.md updated to document the new workflow in the CI pipeline section.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Phase 32 foundation. The SaaS control plane (private molecule-controlplane
repo) provisions one platform instance per customer org on Fly Machines
and sets MOLECULE_ORG_ID=<uuid> on the machine. Its subdomain router
forwards requests with X-Molecule-Org-Id=<uuid>.
TenantGuard:
- When MOLECULE_ORG_ID is set → every non-allowlisted request must carry a
matching X-Molecule-Org-Id header. Mismatched/missing header → 404 (not
403 — don't leak tenant existence by letting probers distinguish "wrong
org" from "route doesn't exist").
- When unset → passthrough. Self-hosted / dev / CI behavior unchanged.
- Allowlist is exact-match, not prefix — /health and /metrics only.
No orgs table, no signup, no billing, no Fly provisioning in this repo —
all that lives in the private control plane. The public repo's SaaS
surface is exactly this one middleware.
6 tests covering: unset-is-passthrough, matching header, mismatched
header 404 (with empty body), missing header 404, allowlist bypass, and
allowlist-is-exact-match.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses code-review warnings on PR #76:
- Migration 022 now backfills pre-existing workspace_schedules rows to
source='template' before flipping NOT NULL + DEFAULT 'runtime'. Legacy
rows (all seeded via org/import historically) stay refreshable on
re-import. Down migration drops the CHECK constraint too.
- Extracted the import UPSERT into const orgImportScheduleSQL so the shape
test asserts against the const directly instead of file-scraping org.go.
Removed the os.ReadFile helper.
- scheduleResponse.Source gets json:\",omitempty\" so old clients that
predate the migration don't see an empty string they can't explain.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses code-review warnings on PR #75:
- renderCategoryRoutingYAML now builds yaml.Node + yaml.Marshal, escaping
YAML-reserved chars in role names correctly (was JSON-as-YAML, fragile on
unicode line separators).
- New appendYAMLBlock helper guarantees a newline boundary when concatenating
YAML fragments into config.yaml (category_routing + initial_prompt both
used to risk merging into the previous line).
- Fixed struct comment (replace-per-key, not UNION).
- Added TestCategoryRouting_EscapesYAMLSpecials and TestAppendYAMLBlock_NewlineGuard.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves#24 per CEO direction.
DB is source of truth for workspace_schedules. POST /org/import becomes
idempotent — only touches rows it owns (source='template'); runtime-added
schedules (Canvas / API) are preserved across re-imports.
- Migration 022: adds source TEXT NOT NULL DEFAULT 'runtime' CHECK in
('template','runtime'); unique index on (workspace_id, name) so the
org/import upsert can use ON CONFLICT.
- org.go: schedule INSERT becomes
INSERT ... 'template' ON CONFLICT (workspace_id, name) DO UPDATE
SET ... WHERE workspace_schedules.source='template'.
Never DELETEs.
- schedules.go: runtime POST writes 'runtime' explicitly; List handler
surfaces the source field on the response so Canvas can render badges.
- 3 new unit tests assert source='runtime' default for runtime CRUD,
the SQL shape contract for org/import (additive + idempotent +
runtime-preserving + never-DELETE), and List response surface.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add a category_routing block to org.yaml schema (defaults + per-workspace,
UNION semantics with per-key replace). The merged routing table is rendered
into each workspace's config.yaml at import time.
PM's system prompt loses the hardcoded security/ui/infra → role mapping
from PR #50; instead it reads category_routing from /configs/config.yaml
and delegates to whatever roles the org template lists for the incoming
audit-summary's category. Future org templates ship their own routing
without prompt churn.
Tests: 4 new TestCategoryRouting_* cases covering YAML parse, UNION+drop
semantics, deterministic config.yaml render, and empty-map handling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
#71 just merged — per-workspace `plugins:` now UNIONs with `defaults.plugins`
instead of replacing it. Simplifies every override in molecule-dev/ from
"defaults+1 = list 10 items" to "defaults+1 = list 1 item":
PM: 11 items → 2 (workflow-triage + workflow-retro)
Research Lead: 10 items → 1 (browser-automation)
Market Analyst: 10 items → 1
Technical Researcher: 10 items → 1
Competitive Intel: 10 items → 1
Security Auditor: 12 items → 3 (code-review + cross-vendor-review + llm-judge)
UIUX Designer: 10 items → 1 (browser-automation)
Every workspace still receives the full 9-plugin default set (ecc,
molecule-dev, superpowers, careful-bash, prompt-watchdog, audit-trail,
session-context, cron-learnings, update-docs) — verified by reading
mergePlugins() in platform/internal/handlers/org.go:645.
Also drops the stale "REPLACE not UNION" warning comments and points
defaults' header comment at the new union behaviour.
Net diff: ~30 lines removed, ~10 added. Template is now meaningfully
easier to extend — each new defaults.plugin propagates everywhere
without sweeping per-role lists.
Closes follow-up scope from PR #70.
Merged after 7-gate verification.
Gates: 1 (CI 6/6 + 1 skip) pass, 2 (build/vet) pass, 3 (5 new TestPlugins_* + backward-compat) pass, 4 (security) pass, 5 (design) pass with 1 yellow, 6 (line review) pass, 7 N/A.
Backward-compat verified: molecule-dev/org.yaml re-lists [ecc, molecule-dev, superpowers, browser-automation] in each role; under new UNION+dedupe the merged set is identical to the prior REPLACE result. PR #70's 1 yellow (REPLACE verbosity / re-listing chore) is now closed by this change — orgs can drop the re-listing once confident.
Cross-vendor-review: second-model tooling unavailable in this worktree; Claude-only review applied per standing rule fallback.
Yellow (non-blocking, follow-up): opt-out semantics (`!plugin` / `-plugin`) are documented only in the code comment. Safety plugins like `molecule-careful-bash` can be disabled by an org.yaml using `!molecule-careful-bash` — this is operator-controlled config per I-2 and therefore acceptable, but docs/plugins/ should get an "overriding defaults" page in a follow-up.
noteworthy: plugin-semantics-change
- docs/edit-history/2026-04-14.md — append tick-5 section covering PR #69
(PLAN.md backlog stale-ref cleanup) and PR #70 (wire 12 modular plugins
from PR #63 into the default molecule-dev org template; defaults 3 → 9
plus PM + Security Auditor role extras).
- PLAN.md — add tick-5 entries under "Recently launched" noting PR #70
activated the tick-4 plugins and PR #69 cleaned up stale backlog refs.
Both merges are docs/template-only. No code surface moved, no new env
vars, no test-count drift. CLAUDE.md, .env.example, README.md, and
README.zh-CN.md unchanged.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Per-workspace `plugins:` now UNIONS with `defaults.plugins` instead of
replacing. A leading `!` or `-` on a per-workspace entry opts a default
out. Backward-compatible: re-listing defaults still dedupes to the same
list.
Refactored the inline REPLACE logic into a pure helper `mergePlugins`
in org.go so it's unit-testable. Five TestPlugins_* cases added.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR #63 just merged 12 new modular plugins (split from a single guardrails
bundle) and the audit pipeline (Security/UIUX/QA crons) is now producing
PRs continuously. Time to wire the new plugins into the molecule-dev
template so every workspace + every cron tick benefits.
## Defaults — universal additions (was 3, now 9)
- molecule-careful-bash — refuse rm -rf, push --force main, DROP TABLE
- molecule-prompt-watchdog — warn on destructive user prompts
- molecule-audit-trail — append every Edit/Write to .claude/audit.jsonl
- molecule-session-context — auto-load cron learnings + PR/issue counts on SessionStart
- molecule-skill-cron-learnings — per-tick learning JSONL format (pairs with session-context)
- molecule-skill-update-docs — keep architecture/README/edit-history aligned
Kept: ecc, molecule-dev, superpowers.
## Per-role overrides
- PM: defaults + molecule-workflow-triage + molecule-workflow-retro
(the /triage and /retro slash commands match PM's coordination role)
- Security Auditor: defaults + molecule-skill-code-review +
molecule-skill-cross-vendor-review + molecule-skill-llm-judge
(security PRs benefit from multi-criteria review, adversarial cross-vendor
second opinion, and an LLM-judge gate that catches "agent shipped the
wrong thing")
- Research Lead + 3 researchers + UIUX Designer: defaults + browser-automation
(existing override; just synced to the new default set)
Other 5 dev roles (Dev Lead, BE, FE, DevOps, QA) inherit defaults — the
new universal set is rich enough for them; code-review skill is a runtime
opt-in if Dev Lead decides per-PR.
## REPLACE-semantics verbosity
`platform/internal/handlers/org.go:~345` treats per-workspace plugins as
REPLACE not UNION. Every override has to re-list the 9 defaults to add 1
extra. Tracked as #68 with a union-proposal; once that lands the per-role
lists shrink to just the additions.
## Test plan
- [x] YAML valid (`python -c "import yaml; yaml.safe_load(...)"`)
- [x] defaults.plugins count = 9
- [ ] After merge + re-import: every workspace's /configs/plugins/ contains
the full set; PM has /triage and /retro commands; Security Auditor
can invoke cross-vendor-review on its findings.
Backlog items 11-14 used sequential enumeration (#64/#65/#66/#67) as
intra-doc bookkeeping. Those numbers now collide with actual merged
PRs and open issues with completely different scopes:
- PR #64 = auto-refresh global_secrets (not "delegations list")
- PR #65 = restart context Layer 1 (not "per-agent repo access")
- Issue #66 = restart_prompt Layer 2 (not "SDK swallows stderr")
- PR #67 = docs sync tick-4 (not "MCP localhost default")
Strip the misleading refs and add a footnote explaining the cleanup.
If/when any of these items get prioritized, file real GitHub issues.
Tracked in cron-learnings tick-3 entry.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- edit-history/2026-04-14.md: append tick-4 section covering the 12
modular guardrail plugins (#63), global-secrets auto-restart fan-out
(#64, fixes issue #15), and synthetic restart-context A2A message
(#65, fixes issue #19 Layer 1; Layer 2 deferred to issue #66).
- CLAUDE.md: bump Go test count 699 -> 726 (measured); note global
secrets auto-restart on SetGlobal/DeleteGlobal in the route table;
add Workspace Lifecycle paragraph for the restart-context message
and its system:restart-context caller prefix.
- PLAN.md: bump Go test count in the coverage table; record issues
#15 and #19 Layer 1 as launched; add new Backlog entry for the
Layer 2 follow-up (issue #66).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
After a workspace restart (HTTP /restart or programmatic RestartByID) and
re-registration, the platform sends a synthetic A2A message/send to the
workspace containing:
- restart timestamp
- previous session end timestamp + human duration
- env-var keys now available (keys only — never values)
The message is rendered in the format proposed in #19 and marked with
metadata.kind=restart_context so agents can detect and handle it
specifically if they choose.
Skip path: if the workspace doesn't re-register within 30s, log and drop.
The Restart HTTP response is unaffected by delivery success.
Layer 2 (user-defined restart_prompt via config.yaml / org.yaml) is
deferred — tracked as a separate follow-up issue.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Global secrets (e.g. CLAUDE_CODE_OAUTH_TOKEN) are injected as container env
vars at Start() time. Until now, rotating one only propagated to a workspace
on the next full restart-from-zero, which manual ops had to drive via a
`POST /workspaces/:id/restart` loop. Tier-3 Claude Code agents hit the
stale-token path first and surfaced as 401s inside the SDK.
Restart-time re-read of global_secrets + workspace_secrets was already
correct in `provisionWorkspaceOpts` — the missing piece was the trigger.
SetGlobal / DeleteGlobal now enqueue RestartByID for every non-paused,
non-removed, non-external workspace that does NOT shadow the key with a
workspace-level override. Matches the existing behaviour of workspace-scoped
`Set` / `Delete`.
Adds two sqlmock-backed tests exercising both branches.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
PR #63 code-review caught that the SDK copy of AgentskillsAdaptor uses
json.loads/json.dumps in _merge_settings_fragment + _rewrite_hook_paths
+ _deep_merge_hooks but never imports json. The runtime copy
(workspace-template/plugins_registry/builtins.py) already has the
import; this brings the SDK side in line.
Bug surfaces only when a plugin shipping settings-fragment.json (any
of the 5 hook plugins or 2 workflow plugins in this PR) is installed
through the SDK path — would NameError on the first json.loads call.
The drift test catches behavioral drift via fixture install scenarios
but not import-level drift in helper code paths.
Verified: json is now importable (`hasattr(molecule_plugin.builtins,
'json')` → True), drift test still passes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
New section before the Temporal footnote capturing the gap analysis
between today's self-hosted posture and a multi-tenant cloud SaaS:
- Tier 1 blockers: multi-tenancy (org_id everywhere), WorkOS AuthKit
for human auth, Fly Machines for container isolation, Stripe
billing, per-org quotas, managed Postgres/Redis (Neon/Upstash),
KMS-backed secrets, migrations out of app boot
- Tier 1 follow-ups: Sentry + Grafana, per-org rate limiting,
Cloudflare, onboarding flow, transactional email, admin panel,
ToS/DPA
- Tier 2 tech-stack upgrades (non-blocking): pgx/v5 + sqlc, River
for platform async (NOT Temporal — that stays in workspace-template
as an agent tool), TanStack Query, Turbopack, uv for Python,
Python MCP client, shadcn/ui CLI
- Tier 3 explicitly NOT doing: Kubernetes, ORMs, framework swaps,
build-auth-yourself, canvas library swaps — with reasons
- Tier 4 compliance (post-revenue): SOC 2, status page, staging,
canary deploys, load testing
- Success criteria: sign-up-to-first-message < 5 min, tenant
isolation red-teamed, Fly Machines cost documented, Stripe
end-to-end, first paying design partner
Derived from a tech-stack audit run against the 2026 best-in-class
landscape (pgx won Postgres, River eats Temporal's small-company
slot, WorkOS beats Clerk for per-org SSO, Fly Machines is the only
isolation option without an SRE).
Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replaces the proposed monolithic molecule-guardrails plugin with 12
single-purpose plugins users can install à la carte. Powered by a
small extension to the AgentskillsAdaptor base class so any plugin can
ship hooks/, commands/, and a settings-fragment.json without writing a
custom adapter.
## Base adapter changes
workspace-template/plugins_registry/builtins.py + sdk/python/molecule_plugin/builtins.py
(both copies — drift-tested):
- New _install_claude_layer() helper called at the end of install()
- Conditionally copies hooks/ → /configs/.claude/hooks/ (preserving exec bit)
- Conditionally copies commands/*.md → /configs/.claude/commands/
- Conditionally merges settings-fragment.json into /configs/.claude/settings.json
with ${CLAUDE_DIR} placeholder rewritten to the workspace's absolute install
path. Existing user hooks are preserved (deep-merge by event name).
- All steps no-op when the plugin doesn't ship the corresponding files,
so existing skill+rule plugins (molecule-dev, superpowers, ecc,
browser-automation) are unchanged.
Drift test (tests/test_plugins_builtins_drift.py) still passes.
## 12 new plugins
Hook plugins (ambient enforcement):
- molecule-careful-bash — refuses destructive bash; ships careful-mode skill
- molecule-freeze-scope — locks edits via .claude/freeze
- molecule-audit-trail — appends every Edit/Write to audit.jsonl
- molecule-session-context — auto-loads cron-learnings at session start
- molecule-prompt-watchdog — injects warnings on destructive prompt keywords
Skill plugins (on-demand):
- molecule-skill-code-review — 16-criteria multi-axis review
- molecule-skill-cross-vendor-review — adversarial second-model review
- molecule-skill-llm-judge — deliverable-vs-request scoring
- molecule-skill-update-docs — post-merge doc sync
- molecule-skill-cron-learnings — operational-memory JSONL format
Workflow plugins (slash commands):
- molecule-workflow-triage — /triage full PR-triage cycle
- molecule-workflow-retro — /retro + cron-retro skill, weekly retrospective
Each ships only what it needs — most have just plugin.yaml + skills/ or
hooks/ + adapter (one-line stub: `from plugins_registry.builtins import
AgentskillsAdaptor as Adaptor`). Total ~120 files but each plugin is
small and self-contained.
## Verification
- python3 -m molecule_plugin validate plugins/molecule-* → all 13 valid
(12 new + pre-existing molecule-dev)
- End-to-end install smoke test on representative samples: hook plugin
(molecule-careful-bash), skill-only plugin (molecule-skill-code-review),
workflow plugin (molecule-workflow-triage). All produce expected
/configs/ tree, settings.json paths rewritten, exec bits preserved,
zero warnings.
- workspace-template pytest tests/test_plugins_builtins_drift.py → passes
(SDK + runtime stay in sync).
## CLAUDE.md repo-doc updated
Lists all 12 new plugins under the existing Plugins section, organized
by category (hook / skill / workflow). Each entry one line, recommend-
together hints where dependencies make sense.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Skills are opt-in (I have to remember to invoke them). Hooks are
ambient — they fire on every matching event automatically. This PR
moves the careful-mode and learnings discipline from "doc I should
read" to "harness-enforced behavior I cannot bypass".
## 6 new hooks (.claude/hooks/)
- pre-bash-careful — REFUSES git push --force to main, rm -rf at root,
DROP TABLE against prod schema. WARNs on force-with-lease, gh pr/
issue close. Tested: blocks the destructive case, allows safe ones.
- pre-edit-freeze — implements /freeze. When .claude/freeze contains
a path glob, edits outside it are denied. Tested: edits to PLAN.md
blocked when scope locked to platform/internal/handlers/.
- session-start-context — auto-loads last 20 cron-learnings, freeze
status, open-PR/issue counts as additionalContext at session start.
Tested: emits valid SessionStart JSON.
- post-edit-audit — appends every Edit/Write to .claude/audit.jsonl
(gitignored). One-line records {ts, tool, file, ok}. Tested writes.
- user-prompt-tag — injects context warnings when prompt mentions
force-push, drop-table, "delete all", "push to main", etc. Tested:
emits warning for "force push the fix to main".
- subagent-stop-judge — off by default; touch .claude/judge-subagents
to enable. When on, prompts orchestrator to verify subagent's last
message addresses the original task. Cost-free MVP (no LLM call yet).
All hooks are Python (jq isn't on the hook PATH on macOS — Python is).
Shared helpers in _lib.py: read_input, deny_pretooluse, add_context,
warn_to_stderr.
## settings.json — wires all 6 hooks
Adds SessionStart, UserPromptSubmit, SubagentStop event handlers.
Existing PreToolUse:Bash + PostToolUse:Edit chains gain the new hooks
alongside the existing ones (check-inbox.sh, echo reminder).
Adds @modelcontextprotocol/server-sequential-thinking MCP server for
structured chain-of-thought scratchpad — useful when triaging multiple
PRs in parallel without losing context.
## .claude/commands/triage.md — slash command shortcut
Manual /triage runs the same flow as the c5074cd5 hourly cron, on
demand. Saves ~4KB of prompt every invocation by pulling the cron
prompt out of working memory.
## CLAUDE.md additions
New "Agent operating rules (auto-loaded — read first)" section right
after Ecosystem Context. Documents:
- Cron / triage discipline (read learnings, treat docs PRs touching
CLAUDE.md/PLAN.md as noteworthy, write per-tick reflections)
- Table of all 6 hooks active in this repo
- List of skills and how to invoke them
- Standing rules (inviolable) consolidated for the agent
This block auto-loads into every conversation context — free behavior
change without me remembering to opt in.
## .gitignore
audit.jsonl, freeze, judge-subagents, per-tick-reflections.md are all
local operational state, never committed.
## Verification
- echo '{"tool_input":{"command":"git push --force origin main"}}' |
bash pre-bash-careful.sh → emits deny JSON ✓
- Same for git status (safe command) → empty output, exit 0 ✓
- pre-edit-freeze with .claude/freeze=platform/handlers/ blocks
edits to PLAN.md, allows edits inside the locked path ✓
- post-edit-audit appends valid JSONL ✓
- session-start-context emits additionalContext with PR/issue counts ✓
- user-prompt-tag emits warning for "force push to main" prompt ✓
- python3 -c "json.load(open('.claude/settings.json'))" → valid ✓
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Research on garrytan/gstack surfaced 5 patterns worth importing into
our cron / agent setup. These are skills, not platform code — they
guide how the cron and our own subagents work, not what the platform
does at runtime.
## New skills
1. **cross-vendor-review** — adversarial second-model review for
noteworthy PRs (auth, billing, data deletion, migrations). Catches
the 15-30% of bugs single-model review misses. Inspired by
gstack's /codex.
2. **careful-mode** — REFUSE/WARN/ALLOW lists for destructive
commands. Refuses force-push to main, blocks merging draft PRs,
prevents rm -rf outside scratch dirs. Inspired by gstack's
/careful + /freeze.
3. **cron-learnings** — per-project JSONL of operational learnings
appended at the end of every tick, replayed at the start of the
next. Stops the cron from re-litigating decided issues.
Inspired by gstack's /learn.
4. **cron-retro** — weekly retrospective auto-posted as a GitHub
issue. Sunday 23:07 local. Tracks PR count, time-to-merge, gate
failure trends, code-review severity over time. Inspired by
gstack's /retro.
5. **llm-judge** — cheap LLM-as-judge eval to catch "agent shipped
the wrong thing" — the failure mode unit tests miss. Plug into
issue-pickup pipeline so worker-agent draft PRs get scored before
being marked ready. Inspired by gstack's tier-3 test infra.
## Cron updates (session-only, c5074cd5 + 060d136c)
- Hourly triage cron now opens with careful-mode activation +
cron-learnings replay (Step 0)
- code-review skill on every PR being considered for merge
(Step 2 supplement A — already present, formalized)
- cross-vendor-review on noteworthy PRs (Step 2 supplement B — new)
- llm-judge on issue-pickup draft PRs before marking ready (Step 4)
- Status report now includes cross-vendor pass/fail and llm-judge
scores (Step 5)
- End-of-tick cron-learnings append (Step 5)
- New weekly cron at Sun 23:07 invokes the cron-retro skill
## What we did NOT take from gstack
- Their browser fork — not our product
- The 23 named roles — we have agent role templates already
- Bun toolchain — adds yet another runtime to our stack
- /design-shotgun and design-tool variants — we're not a design tool
- /document-release — our update-docs skill already covers this
See PR description for full research notes.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves#14. ApplyTierConfig now reads TIER{2,3,4}_MEMORY_MB and
TIER{2,3,4}_CPU_SHARES env vars, falling back to the compiled defaults
agreed in the issue:
- T2: 512 MiB / 1024 shares (1 CPU) — unchanged baseline
- T3: 2048 MiB / 2048 shares (2 CPU) — new cap (previously uncapped)
- T4: 4096 MiB / 4096 shares (4 CPU) — new cap (previously uncapped)
CPU_SHARES follows Docker's 1024 = 1 CPU convention; internally the
value is translated to NanoCPUs for a hard allocation so behaviour
remains deterministic across hosts. Malformed or non-positive env
values silently fall back to the default.
Behaviour change note: T3 and T4 previously had no explicit cap.
Operators who relied on unlimited can set very large TIERn_MEMORY_MB /
TIERn_CPU_SHARES values; a follow-up can add unset-means-unlimited
semantics if required.
Tests:
- TestGetTierMemoryMB_DefaultsMatchLegacy
- TestGetTierMemoryMB_EnvOverride (covers malformed + zero fallback)
- TestGetTierCPUShares_EnvOverride
- TestApplyTierConfig_T3_UsesEnvOverride (wiring)
- TestApplyTierConfig_T3_DefaultCap (documents the new cap)
Docs: .env.example section + CLAUDE.md platform env-vars list updated.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Resolves#12. The claude-code SDK stores conversations in
/root/.claude/sessions/ and Postgres tracks current_session_id, but the
container filesystem was recreated on every restart — next agent message
failed with "No conversation found with session ID: <uuid>".
Add a per-workspace named Docker volume (ws-<id>-claude-sessions) mounted
read-write at /root/.claude/sessions. Gated by runtime=claude-code so
other runtimes don't pay for a path they don't use. Volume is cleaned up
in RemoveVolume alongside the config volume.
Two opt-outs discard the volume before restart for a fresh session:
- env WORKSPACE_RESET_SESSION=1 on the container
- POST /workspaces/:id/restart?reset=true (or {"reset": true} body)
Plumbed via new ResetClaudeSession field on WorkspaceConfig +
provisionWorkspaceOpts helper so the flag stays request-scoped (not
persisted on CreateWorkspacePayload).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- docs/edit-history/2026-04-14.md: append tick-3 section covering the
admin test-token route (#53), the prior-tick doc-sync PR (#54), and
the hermes required_env alignment (#55). Record measured test counts
(Go +4 for the TestAdminTestToken_* quartet).
- CLAUDE.md: bump Go test count 695 → 699 with a note pointing at the
new quartet. Route-table row and env-var mentions for the admin
route already landed with #53; verified on main.
- .env.example: add MOLECULE_ENABLE_TEST_TOKENS with a comment about
the prod-hidden default. Closes the code-review doc-sync flag from
#53 (var was in CLAUDE.md but missing from .env.example).
No PLAN.md / README.md / README.zh-CN.md update needed — none of the
three merges expose a user-visible surface.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>