molecule-core

Author	SHA1	Message	Date
Hongming Wang	3ddd0cffbf	Merge pull request #76 from Molecule-AI/fix/issue-24-schedules-db-authoritative fix(org): DB-authoritative schedules; org/import is additive on template rows (#24)	2026-04-14 14:40:54 -07:00
Hongming Wang	bdb21a2d70	Merge pull request #75 from Molecule-AI/feat/issue-51-category-routing feat(platform): generic category_routing replaces hardcoded audit dispatch (#51)	2026-04-14 14:40:51 -07:00
Hongming Wang	bcabafd0cc	Merge pull request #74 from Molecule-AI/chore/template-plugin-union-cleanup chore(template): simplify per-role plugin lists using #71 union semantics	2026-04-14 14:40:48 -07:00
Hongming Wang	3356505f3a	Merge pull request #73 from Molecule-AI/docs/sync-2026-04-14-tick-6 docs: sync documentation with 2026-04-14 tick-6 merges (#71, #72)	2026-04-14 14:40:44 -07:00
Hongming Wang	b15e30ccde	fix(schedules): backfill legacy rows to 'template' + extract import SQL const Addresses code-review warnings on PR #76: - Migration 022 now backfills pre-existing workspace_schedules rows to source='template' before flipping NOT NULL + DEFAULT 'runtime'. Legacy rows (all seeded via org/import historically) stay refreshable on re-import. Down migration drops the CHECK constraint too. - Extracted the import UPSERT into const orgImportScheduleSQL so the shape test asserts against the const directly instead of file-scraping org.go. Removed the os.ReadFile helper. - scheduleResponse.Source gets json:",omitempty" so old clients that predate the migration don't see an empty string they can't explain. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 14:30:22 -07:00
Hongming Wang	c47898568c	fix(org): use yaml.Marshal for category_routing + newline-guard block appends Addresses code-review warnings on PR #75: - renderCategoryRoutingYAML now builds yaml.Node + yaml.Marshal, escaping YAML-reserved chars in role names correctly (was JSON-as-YAML, fragile on unicode line separators). - New appendYAMLBlock helper guarantees a newline boundary when concatenating YAML fragments into config.yaml (category_routing + initial_prompt both used to risk merging into the previous line). - Fixed struct comment (replace-per-key, not UNION). - Added TestCategoryRouting_EscapesYAMLSpecials and TestAppendYAMLBlock_NewlineGuard. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 14:28:22 -07:00
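The newline guard described in the commit above can be sketched in Python as follows. This is an illustration only: the real `appendYAMLBlock` helper is Go code in the platform, and the function name, signature, and sample fragments here are assumptions.

```python
def append_yaml_block(base: str, block: str) -> str:
    """Concatenate two YAML fragments with a guaranteed newline boundary.

    Hypothetical Python stand-in for the Go helper named in the commit.
    """
    if not block:
        return base
    # Guard: if the existing text does not end in a newline, add one so the
    # new block cannot merge into the previous line.
    if base and not base.endswith("\n"):
        base += "\n"
    # Leave the result newline-terminated so the next append is safe too.
    if not block.endswith("\n"):
        block += "\n"
    return base + block


config = "name: pm\ntier: 2"  # note: no trailing newline
routing = "category_routing:\n  security:\n    - Security Auditor"
merged = append_yaml_block(config, routing)
```

Without the guard, the two fragments would join as `tier: 2category_routing:`, which is the failure mode the commit message describes.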
Hongming Wang	2e9fb51ff9	fix(org): DB-authoritative schedules; org/import is additive on template rows (#24) Resolves #24 per CEO direction. DB is source of truth for workspace_schedules. POST /org/import becomes idempotent — only touches rows it owns (source='template'); runtime-added schedules (Canvas / API) are preserved across re-imports. - Migration 022: adds source TEXT NOT NULL DEFAULT 'runtime' CHECK in ('template','runtime'); unique index on (workspace_id, name) so the org/import upsert can use ON CONFLICT. - org.go: schedule INSERT becomes INSERT ... 'template' ON CONFLICT (workspace_id, name) DO UPDATE SET ... WHERE workspace_schedules.source='template'. Never DELETEs. - schedules.go: runtime POST writes 'runtime' explicitly; List handler surfaces the source field on the response so Canvas can render badges. - 3 new unit tests assert source='runtime' default for runtime CRUD, the SQL shape contract for org/import (additive + idempotent + runtime-preserving + never-DELETE), and List response surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 14:09:44 -07:00
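The upsert shape in the commit above can be tried out with an in-memory SQLite database. This is a sketch under assumptions: the platform uses Postgres, the `cron` column and the sample rows are hypothetical, and only the `source` column, the `(workspace_id, name)` uniqueness, and the `ON CONFLICT ... WHERE source='template'` guard come from the commit message.

```python
import sqlite3

# Hypothetical stand-in for the import statement; only the ON CONFLICT shape
# and the source='template' guard are taken from the commit message.
ORG_IMPORT_SCHEDULE_SQL = """
INSERT INTO workspace_schedules (workspace_id, name, cron, source)
VALUES (?, ?, ?, 'template')
ON CONFLICT (workspace_id, name) DO UPDATE
SET cron = excluded.cron
WHERE workspace_schedules.source = 'template'
"""

db = sqlite3.connect(":memory:")
db.execute("""
CREATE TABLE workspace_schedules (
    workspace_id TEXT NOT NULL,
    name TEXT NOT NULL,
    cron TEXT NOT NULL,
    source TEXT NOT NULL DEFAULT 'runtime'
        CHECK (source IN ('template', 'runtime')),
    UNIQUE (workspace_id, name)
)""")

# A schedule added at runtime (Canvas / API) takes the 'runtime' default.
db.execute(
    "INSERT INTO workspace_schedules (workspace_id, name, cron) VALUES (?, ?, ?)",
    ("ws-1", "nightly", "0 3 * * *"),
)
# First import seeds a template-owned row.
db.execute(ORG_IMPORT_SCHEDULE_SQL, ("ws-1", "audit", "0 * * * *"))

# Re-import with changed values for both names: only the template row moves.
db.execute(ORG_IMPORT_SCHEDULE_SQL, ("ws-1", "audit", "*/30 * * * *"))
db.execute(ORG_IMPORT_SCHEDULE_SQL, ("ws-1", "nightly", "0 0 * * *"))

rows = {
    name: (cron, source)
    for name, cron, source in db.execute(
        "SELECT name, cron, source FROM workspace_schedules"
    )
}
```

After the re-import, the `audit` row carries the new cron while the runtime-owned `nightly` row is untouched, and no statement ever deletes a row.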
Hongming Wang	d4140ee244	feat(platform): generic category_routing replaces hardcoded audit dispatch (#51) Add a category_routing block to org.yaml schema (defaults + per-workspace, UNION semantics with per-key replace). The merged routing table is rendered into each workspace's config.yaml at import time. PM's system prompt loses the hardcoded security/ui/infra → role mapping from PR #50; instead it reads category_routing from /configs/config.yaml and delegates to whatever roles the org template lists for the incoming audit-summary's category. Future org templates ship their own routing without prompt churn. Tests: 4 new TestCategoryRouting_* cases covering YAML parse, UNION+drop semantics, deterministic config.yaml render, and empty-map handling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 14:06:47 -07:00
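One way to read "UNION semantics with per-key replace" is sketched below. The function name and the sample role lists are hypothetical, and treating an empty per-workspace list as "drop this category" is an assumption about what the "UNION+drop" test covers; the commit message does not spell out the drop mechanism.

```python
def merge_category_routing(defaults: dict, workspace: dict) -> dict:
    """Merge routing tables: union of categories, workspace wins per key."""
    merged = {category: list(roles) for category, roles in defaults.items()}
    for category, roles in workspace.items():
        if roles:
            # Per-key replace: the workspace list replaces the default list
            # for this category rather than being appended to it.
            merged[category] = list(roles)
        else:
            # ASSUMPTION: an empty list drops the category entirely.
            merged.pop(category, None)
    # Sort keys so the rendered config.yaml is deterministic.
    return dict(sorted(merged.items()))


defaults = {"security": ["Security Auditor"], "ui": ["UIUX Designer"]}
workspace = {"ui": ["Frontend Engineer"], "infra": ["DevOps"], "security": []}
routing = merge_category_routing(defaults, workspace)
```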
rabbitblood	3269db3781	chore(template): simplify per-role plugin lists using #71 union semantics #71 just merged — per-workspace `plugins:` now UNIONs with `defaults.plugins` instead of replacing it. Simplifies every override in molecule-dev/ from "defaults+1 = list 10 items" to "defaults+1 = list 1 item": PM: 11 items → 2 (workflow-triage + workflow-retro) Research Lead: 10 items → 1 (browser-automation) Market Analyst: 10 items → 1 Technical Researcher: 10 items → 1 Competitive Intel: 10 items → 1 Security Auditor: 12 items → 3 (code-review + cross-vendor-review + llm-judge) UIUX Designer: 10 items → 1 (browser-automation) Every workspace still receives the full 9-plugin default set (ecc, molecule-dev, superpowers, careful-bash, prompt-watchdog, audit-trail, session-context, cron-learnings, update-docs) — verified by reading mergePlugins() in platform/internal/handlers/org.go:645. Also drops the stale "REPLACE not UNION" warning comments and points defaults' header comment at the new union behaviour. Net diff: ~30 lines removed, ~10 added. Template is now meaningfully easier to extend — each new defaults.plugin propagates everywhere without sweeping per-role lists. Closes follow-up scope from PR #70.	2026-04-14 14:05:43 -07:00
Hongming Wang	3a105fa1cb	docs: sync documentation with 2026-04-14 tick-6 merges (#71, #72) - docs/edit-history/2026-04-14.md: append tick-6 covering PR #71 (plugins UNION) and PR #72 (tick-5 docs-sync) - CLAUDE.md: Go test count 726 -> 731 (+5 TestPlugins_*); add Plugins section note on UNION + !/- opt-out semantics - PLAN.md: add "Recently launched (2026-04-14 tick-6)" entry noting issue #68 is resolved by PR #71 Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 13:45:02 -07:00
Hongming Wang	dd61714c55	Merge pull request #71 from Molecule-AI/fix/issue-68-plugins-union Merged after 7-gate verification. Gates: 1 (CI 6/6 + 1 skip) pass, 2 (build/vet) pass, 3 (5 new TestPlugins_* + backward-compat) pass, 4 (security) pass, 5 (design) pass with 1 yellow, 6 (line review) pass, 7 N/A. Backward-compat verified: molecule-dev/org.yaml re-lists [ecc, molecule-dev, superpowers, browser-automation] in each role; under new UNION+dedupe the merged set is identical to the prior REPLACE result. PR #70's 1 yellow (REPLACE verbosity / re-listing chore) is now closed by this change — orgs can drop the re-listing once confident. Cross-vendor-review: second-model tooling unavailable in this worktree; Claude-only review applied per standing rule fallback. Yellow (non-blocking, follow-up): opt-out semantics (`!plugin` / `-plugin`) are documented only in the code comment. Safety plugins like `molecule-careful-bash` can be disabled by an org.yaml using `!molecule-careful-bash` — this is operator-controlled config per I-2 and therefore acceptable, but docs/plugins/ should get an "overriding defaults" page in a follow-up. noteworthy: plugin-semantics-change	2026-04-14 13:42:30 -07:00
Hongming Wang	ecf93acc17	Merge pull request #72 from Molecule-AI/docs/sync-2026-04-14-tick-5 docs: sync documentation with 2026-04-14 tick-5 merges (#69, #70)	2026-04-14 13:41:45 -07:00
Hongming Wang	b56fc66367	docs: sync documentation with 2026-04-14 tick-5 merges (#69, #70) - docs/edit-history/2026-04-14.md — append tick-5 section covering PR #69 (PLAN.md backlog stale-ref cleanup) and PR #70 (wire 12 modular plugins from PR #63 into the default molecule-dev org template; defaults 3 → 9 plus PM + Security Auditor role extras). - PLAN.md — add tick-5 entries under "Recently launched" noting PR #70 activated the tick-4 plugins and PR #69 cleaned up stale backlog refs. Both merges are docs/template-only. No code surface moved, no new env vars, no test-count drift. CLAUDE.md, .env.example, README.md, and README.zh-CN.md unchanged. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 13:21:30 -07:00
Hongming Wang	eea64f06ec	fix(org): per-workspace plugins UNION with defaults; '!' prefix opts out (#68) Per-workspace `plugins:` now UNIONS with `defaults.plugins` instead of replacing. A leading `!` or `-` on a per-workspace entry opts a default out. Backward-compatible: re-listing defaults still dedupes to the same list. Refactored the inline REPLACE logic into a pure helper `mergePlugins` in org.go so it's unit-testable. Five TestPlugins_* cases added. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 13:21:23 -07:00
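The union, dedupe, and opt-out rules in the commit above can be sketched as a pure function. This is a Python illustration, not the Go `mergePlugins` helper itself: the signature, the ordering (defaults first, then additions), and the choice to let an opt-out also suppress a same-named addition are assumptions.

```python
def merge_plugins(defaults: list, overrides: list) -> list:
    """Union per-workspace plugins with defaults; '!name' or '-name' opts out."""
    opted_out = {entry[1:] for entry in overrides if entry[:1] in ("!", "-")}
    additions = [entry for entry in overrides if entry[:1] not in ("!", "-")]
    merged = []
    for name in list(defaults) + additions:
        # Skip opted-out names and dedupe while preserving first-seen order.
        if name not in opted_out and name not in merged:
            merged.append(name)
    return merged


defaults = ["ecc", "molecule-dev", "superpowers"]
added = merge_plugins(defaults, ["browser-automation"])
relisted = merge_plugins(
    defaults, ["ecc", "molecule-dev", "superpowers", "browser-automation"]
)
opted = merge_plugins(defaults, ["!superpowers", "browser-automation"])
```

The `relisted` case shows the backward-compatibility claim: a template that still re-lists every default ends up with the same merged set as one that lists only the addition.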
Hongming Wang	ea2a872018	Merge pull request #70 from Molecule-AI/chore/template-plugin-enrichment chore(template): wire 9 new guardrail/skill plugins into defaults; PM + Security Auditor get role extras	2026-04-14 13:18:46 -07:00
Hongming Wang	4ef0250e75	Merge pull request #69 from Molecule-AI/docs/cleanup-stale-backlog-refs docs(plan): drop stale sequential refs from Backlog items 11-14	2026-04-14 13:18:30 -07:00
rabbitblood	f3b0b9e572	chore(template): wire 9 new guardrail/skill plugins into defaults; PM + Security Auditor get role extras PR #63 just merged 12 new modular plugins (split from a single guardrails bundle) and the audit pipeline (Security/UIUX/QA crons) is now producing PRs continuously. Time to wire the new plugins into the molecule-dev template so every workspace + every cron tick benefits. ## Defaults — universal additions (was 3, now 9) - molecule-careful-bash — refuse rm -rf, push --force main, DROP TABLE - molecule-prompt-watchdog — warn on destructive user prompts - molecule-audit-trail — append every Edit/Write to .claude/audit.jsonl - molecule-session-context — auto-load cron learnings + PR/issue counts on SessionStart - molecule-skill-cron-learnings — per-tick learning JSONL format (pairs with session-context) - molecule-skill-update-docs — keep architecture/README/edit-history aligned Kept: ecc, molecule-dev, superpowers. ## Per-role overrides - PM: defaults + molecule-workflow-triage + molecule-workflow-retro (the /triage and /retro slash commands match PM's coordination role) - Security Auditor: defaults + molecule-skill-code-review + molecule-skill-cross-vendor-review + molecule-skill-llm-judge (security PRs benefit from multi-criteria review, adversarial cross-vendor second opinion, and an LLM-judge gate that catches "agent shipped the wrong thing") - Research Lead + 3 researchers + UIUX Designer: defaults + browser-automation (existing override; just synced to the new default set) Other 5 dev roles (Dev Lead, BE, FE, DevOps, QA) inherit defaults — the new universal set is rich enough for them; code-review skill is a runtime opt-in if Dev Lead decides per-PR. ## REPLACE-semantics verbosity `platform/internal/handlers/org.go:~345` treats per-workspace plugins as REPLACE not UNION. Every override has to re-list the 9 defaults to add 1 extra. Tracked as #68 with a union-proposal; once that lands the per-role lists shrink to just the additions. 
## Test plan - [x] YAML valid (`python -c "import yaml; yaml.safe_load(...)"`) - [x] defaults.plugins count = 9 - [ ] After merge + re-import: every workspace's /configs/plugins/ contains the full set; PM has /triage and /retro commands; Security Auditor can invoke cross-vendor-review on its findings.	2026-04-14 13:07:05 -07:00
Hongming Wang	e305851821	docs(plan): drop stale sequential refs #64-#67 from Backlog items 11-14 Backlog items 11-14 used sequential enumeration (#64/#65/#66/#67) as intra-doc bookkeeping. Those numbers now collide with actual merged PRs and open issues with completely different scopes: - PR #64 = auto-refresh global_secrets (not "delegations list") - PR #65 = restart context Layer 1 (not "per-agent repo access") - Issue #66 = restart_prompt Layer 2 (not "SDK swallows stderr") - PR #67 = docs sync tick-4 (not "MCP localhost default") Strip the misleading refs and add a footnote explaining the cleanup. If/when any of these items get prioritized, file real GitHub issues. Tracked in cron-learnings tick-3 entry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 13:05:08 -07:00
Hongming Wang	9ed9755ff8	Merge pull request #67 from Molecule-AI/docs/sync-2026-04-14-tick-4 docs: sync documentation with 2026-04-14 evening-tick merges (#63, #64, #65)	2026-04-14 13:03:18 -07:00
Hongming Wang	59a96e3888	docs: sync documentation with 2026-04-14 evening-tick merges (#63, #64, #65) - edit-history/2026-04-14.md: append tick-4 section covering the 12 modular guardrail plugins (#63), global-secrets auto-restart fan-out (#64, fixes issue #15), and synthetic restart-context A2A message (#65, fixes issue #19 Layer 1; Layer 2 deferred to issue #66). - CLAUDE.md: bump Go test count 699 -> 726 (measured); note global secrets auto-restart on SetGlobal/DeleteGlobal in the route table; add Workspace Lifecycle paragraph for the restart-context message and its system:restart-context caller prefix. - PLAN.md: bump Go test count in the coverage table; record issues #15 and #19 Layer 1 as launched; add new Backlog entry for the Layer 2 follow-up (issue #66). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:54:04 -07:00
Hongming Wang	3c7c65ffd3	Merge pull request #64 from Molecule-AI/fix/issue-15-refresh-oauth-on-restart fix(secrets): auto-refresh global_secrets on workspace restart (#15)	2026-04-14 12:49:19 -07:00
Hongming Wang	87bb4f96d5	Merge pull request #65 from Molecule-AI/fix/issue-19-restart-context-layer1 feat(platform): inject restart context system message (#19 Layer 1)	2026-04-14 12:48:19 -07:00
Hongming Wang	d5806440d5	feat(plugins): split guardrails into 12 modular plugins (#63) Noteworthy: large-addition (+1601 lines, 12 new plugins) + modifies core AgentskillsAdaptor (SDK + runtime copies, drift-guarded). All 7 gates pass, 0 critical findings. Cross-vendor review skipped (tool unavailable).	2026-04-14 12:47:24 -07:00
Hongming Wang	a36047f3d8	feat(platform): inject restart context system message (#19 Layer 1) After a workspace restart (HTTP /restart or programmatic RestartByID) and re-registration, the platform sends a synthetic A2A message/send to the workspace containing: - restart timestamp - previous session end timestamp + human duration - env-var keys now available (keys only — never values) The message is rendered in the format proposed in #19 and marked with metadata.kind=restart_context so agents can detect and handle it specifically if they choose. Skip path: if the workspace doesn't re-register within 30s, log and drop. The Restart HTTP response is unaffected by delivery success. Layer 2 (user-defined restart_prompt via config.yaml / org.yaml) is deferred — tracked as a separate follow-up issue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:41:01 -07:00
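A rough sketch of how such a restart-context payload could be assembled is shown below. The text layout, the function name, and the dictionary shape are assumptions (the actual format is the one proposed in #19, which this log does not reproduce); what the sketch takes from the commit is the content list, the `metadata.kind=restart_context` marker, and the keys-only rule for env vars.

```python
from datetime import datetime, timezone


def build_restart_context(restarted_at, previous_end, env):
    """Build a hypothetical restart-context payload; env values never appear."""
    gap_seconds = int((restarted_at - previous_end).total_seconds())
    minutes, seconds = divmod(gap_seconds, 60)
    lines = [
        f"Restarted at: {restarted_at.isoformat()}",
        f"Previous session ended: {previous_end.isoformat()} "
        f"({minutes}m {seconds}s ago)",
        # Keys only: iterating the mapping yields names, never secret values.
        "Env vars available: " + ", ".join(sorted(env)),
    ]
    return {"text": "\n".join(lines), "metadata": {"kind": "restart_context"}}


message = build_restart_context(
    datetime(2026, 4, 14, 12, 41, 1, tzinfo=timezone.utc),
    datetime(2026, 4, 14, 12, 38, 31, tzinfo=timezone.utc),
    {"CLAUDE_CODE_OAUTH_TOKEN": "secret-value", "HERMES_API_KEY": "another-secret"},
)
```

An agent that wants to treat the message specially would check `message["metadata"]["kind"]` before reading the text.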
Hongming Wang	1b432f6ffd	fix(secrets): auto-restart workspaces on global secret change (#15) Global secrets (e.g. CLAUDE_CODE_OAUTH_TOKEN) are injected as container env vars at Start() time. Until now, rotating one only propagated to a workspace on the next full restart-from-zero, which manual ops had to drive via a `POST /workspaces/:id/restart` loop. Tier-3 Claude Code agents hit the stale-token path first and surfaced as 401s inside the SDK. Restart-time re-read of global_secrets + workspace_secrets was already correct in `provisionWorkspaceOpts` — the missing piece was the trigger. SetGlobal / DeleteGlobal now enqueue RestartByID for every non-paused, non-removed, non-external workspace that does NOT shadow the key with a workspace-level override. Matches the existing behaviour of workspace-scoped `Set` / `Delete`. Adds two sqlmock-backed tests exercising both branches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:39:00 -07:00
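The fan-out rule in the commit above (restart every non-paused, non-removed, non-external workspace that does not shadow the key) can be sketched as a filter. The record fields `status`, `external`, and `secret_overrides` are hypothetical names chosen for this illustration; the real selection happens in Go against the database.

```python
def workspaces_to_restart(key: str, workspaces: list) -> list:
    """Return ids of workspaces that should restart when a global secret changes."""
    targets = []
    for ws in workspaces:
        if ws["status"] in ("paused", "removed"):
            continue  # not running, nothing to refresh
        if ws["external"]:
            continue  # external workspaces are not provisioned by the platform
        if key in ws["secret_overrides"]:
            continue  # workspace-level override shadows the global value
        targets.append(ws["id"])
    return targets


workspaces = [
    {"id": "ws-a", "status": "online", "external": False, "secret_overrides": set()},
    {"id": "ws-b", "status": "paused", "external": False, "secret_overrides": set()},
    {"id": "ws-c", "status": "online", "external": True, "secret_overrides": set()},
    {"id": "ws-d", "status": "online", "external": False,
     "secret_overrides": {"CLAUDE_CODE_OAUTH_TOKEN"}},
]
targets = workspaces_to_restart("CLAUDE_CODE_OAUTH_TOKEN", workspaces)
```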
Hongming Wang	741c782f6d	fix(gate-4): add missing import json in sdk/python/molecule_plugin/builtins.py PR #63 code-review caught that the SDK copy of AgentskillsAdaptor uses json.loads/json.dumps in _merge_settings_fragment + _rewrite_hook_paths + _deep_merge_hooks but never imports json. The runtime copy (workspace-template/plugins_registry/builtins.py) already has the import; this brings the SDK side in line. Bug surfaces only when a plugin shipping settings-fragment.json (any of the 5 hook plugins or 2 workflow plugins in this PR) is installed through the SDK path — would NameError on the first json.loads call. The drift test catches behavioral drift via fixture install scenarios but not import-level drift in helper code paths. Verified: json is now importable (`hasattr(molecule_plugin.builtins, 'json')` → True), drift test still passes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:29:32 -07:00
Hongming Wang	7d4b9885cb	Merge pull request #57 from Molecule-AI/fix/issue-12-preserve-claude-sessions fix(provisioner): preserve Claude session directory across restart (#12)	2026-04-14 12:26:12 -07:00
Hongming Wang	b0d779e4b4	Merge pull request #61 from Molecule-AI/feat/claude-hooks-upgrade feat(.claude): ambient hooks + sequential-thinking MCP + /triage command	2026-04-14 12:25:54 -07:00
Hongming Wang	03bcb33792	Merge pull request #60 from Molecule-AI/feat/gstack-inspired-cron-upgrades feat(.claude): 5 gstack-inspired skills + cron upgrades	2026-04-14 12:25:19 -07:00
Hongming Wang	49cd28ac17	Merge pull request #58 from Molecule-AI/feat/issue-14-configurable-tier-limits noteworthy: behavior-change — T3/T4 caps introduced where previously unlimited; defaults match issue #14 spec; operators can override via env	2026-04-14 12:25:00 -07:00
Hongming Wang	0081c29ead	docs(plan): add Phase 32 — Cloud SaaS launch roadmap (#59) New section before the Temporal footnote capturing the gap analysis between today's self-hosted posture and a multi-tenant cloud SaaS: - Tier 1 blockers: multi-tenancy (org_id everywhere), WorkOS AuthKit for human auth, Fly Machines for container isolation, Stripe billing, per-org quotas, managed Postgres/Redis (Neon/Upstash), KMS-backed secrets, migrations out of app boot - Tier 1 follow-ups: Sentry + Grafana, per-org rate limiting, Cloudflare, onboarding flow, transactional email, admin panel, ToS/DPA - Tier 2 tech-stack upgrades (non-blocking): pgx/v5 + sqlc, River for platform async (NOT Temporal — that stays in workspace-template as an agent tool), TanStack Query, Turbopack, uv for Python, Python MCP client, shadcn/ui CLI - Tier 3 explicitly NOT doing: Kubernetes, ORMs, framework swaps, build-auth-yourself, canvas library swaps — with reasons - Tier 4 compliance (post-revenue): SOC 2, status page, staging, canary deploys, load testing - Success criteria: sign-up-to-first-message < 5 min, tenant isolation red-teamed, Fly Machines cost documented, Stripe end-to-end, first paying design partner Derived from a tech-stack audit run against the 2026 best-in-class landscape (pgx won Postgres, River eats Temporal's small-company slot, WorkOS beats Clerk for per-org SSO, Fly Machines is the only isolation option without an SRE). Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:24:59 -07:00
Hongming Wang	8da43984f7	Merge pull request #56 from Molecule-AI/docs/sync-2026-04-14-tick-3 docs: sync documentation with 2026-04-14 tick-3 merges (#53, #54, #55)	2026-04-14 12:24:16 -07:00
Hongming Wang	119b02c544	feat(plugins): split guardrails into 12 modular plugins Replaces the proposed monolithic molecule-guardrails plugin with 12 single-purpose plugins users can install à la carte. Powered by a small extension to the AgentskillsAdaptor base class so any plugin can ship hooks/, commands/, and a settings-fragment.json without writing a custom adapter. ## Base adapter changes workspace-template/plugins_registry/builtins.py + sdk/python/molecule_plugin/builtins.py (both copies — drift-tested): - New _install_claude_layer() helper called at the end of install() - Conditionally copies hooks/ → /configs/.claude/hooks/ (preserving exec bit) - Conditionally copies commands/.md → /configs/.claude/commands/ - Conditionally merges settings-fragment.json into /configs/.claude/settings.json with ${CLAUDE_DIR} placeholder rewritten to the workspace's absolute install path. Existing user hooks are preserved (deep-merge by event name). - All steps no-op when the plugin doesn't ship the corresponding files, so existing skill+rule plugins (molecule-dev, superpowers, ecc, browser-automation) are unchanged. Drift test (tests/test_plugins_builtins_drift.py) still passes. 
## 12 new plugins Hook plugins (ambient enforcement): - molecule-careful-bash — refuses destructive bash; ships careful-mode skill - molecule-freeze-scope — locks edits via .claude/freeze - molecule-audit-trail — appends every Edit/Write to audit.jsonl - molecule-session-context — auto-loads cron-learnings at session start - molecule-prompt-watchdog — injects warnings on destructive prompt keywords Skill plugins (on-demand): - molecule-skill-code-review — 16-criteria multi-axis review - molecule-skill-cross-vendor-review — adversarial second-model review - molecule-skill-llm-judge — deliverable-vs-request scoring - molecule-skill-update-docs — post-merge doc sync - molecule-skill-cron-learnings — operational-memory JSONL format Workflow plugins (slash commands): - molecule-workflow-triage — /triage full PR-triage cycle - molecule-workflow-retro — /retro + cron-retro skill, weekly retrospective Each ships only what it needs — most have just plugin.yaml + skills/ or hooks/ + adapter (one-line stub: `from plugins_registry.builtins import AgentskillsAdaptor as Adaptor`). Total ~120 files but each plugin is small and self-contained. ## Verification - python3 -m molecule_plugin validate plugins/molecule- → all 13 valid (12 new + pre-existing molecule-dev) - End-to-end install smoke test on representative samples: hook plugin (molecule-careful-bash), skill-only plugin (molecule-skill-code-review), workflow plugin (molecule-workflow-triage). All produce expected /configs/ tree, settings.json paths rewritten, exec bits preserved, zero warnings. - workspace-template pytest tests/test_plugins_builtins_drift.py → passes (SDK + runtime stay in sync). ## CLAUDE.md repo-doc updated Lists all 12 new plugins under the existing Plugins section, organized by category (hook / skill / workflow). Each entry one line, recommend- together hints where dependencies make sense. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:20:04 -07:00
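The settings-fragment step of the base adapter changes above might look roughly like the following. This is a sketch under assumptions, not the `AgentskillsAdaptor` code: the helper names, the sample hook entries, and the "append if not already present" merge rule are illustrative, and the `hooks` layout (event name mapping to a list of matcher entries) is assumed to follow the usual Claude Code `settings.json` shape.

```python
import copy


def rewrite_hook_paths(node, claude_dir: str):
    """Recursively replace the ${CLAUDE_DIR} placeholder in every string."""
    if isinstance(node, str):
        return node.replace("${CLAUDE_DIR}", claude_dir)
    if isinstance(node, list):
        return [rewrite_hook_paths(item, claude_dir) for item in node]
    if isinstance(node, dict):
        return {key: rewrite_hook_paths(value, claude_dir) for key, value in node.items()}
    return node


def merge_settings_fragment(settings: dict, fragment: dict, claude_dir: str) -> dict:
    """Deep-merge a plugin's hooks by event name, keeping existing user hooks."""
    merged = copy.deepcopy(settings)  # never mutate the caller's settings
    hooks = merged.setdefault("hooks", {})
    for event, entries in rewrite_hook_paths(fragment, claude_dir).get("hooks", {}).items():
        existing = hooks.setdefault(event, [])
        for entry in entries:
            if entry not in existing:  # keeps re-installs idempotent
                existing.append(entry)
    return merged


user_settings = {"hooks": {"PreToolUse": [
    {"matcher": "Bash", "hooks": [{"type": "command", "command": "./check-inbox.sh"}]},
]}}
fragment = {"hooks": {"PreToolUse": [
    {"matcher": "Bash", "hooks": [
        {"type": "command", "command": "${CLAUDE_DIR}/hooks/pre-bash-careful.sh"},
    ]},
]}}
installed = merge_settings_fragment(user_settings, fragment, "/configs/.claude")
reinstalled = merge_settings_fragment(installed, fragment, "/configs/.claude")
```

Merging by event name rather than replacing the whole `hooks` object is what lets several hook plugins coexist with hooks the user already had.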
Hongming Wang	eea36b9f92	feat(.claude): ambient hooks + sequential-thinking MCP + /triage command Skills are opt-in (I have to remember to invoke them). Hooks are ambient — they fire on every matching event automatically. This PR moves the careful-mode and learnings discipline from "doc I should read" to "harness-enforced behavior I cannot bypass". ## 6 new hooks (.claude/hooks/) - pre-bash-careful — REFUSES git push --force to main, rm -rf at root, DROP TABLE against prod schema. WARNs on force-with-lease, gh pr/ issue close. Tested: blocks the destructive case, allows safe ones. - pre-edit-freeze — implements /freeze. When .claude/freeze contains a path glob, edits outside it are denied. Tested: edits to PLAN.md blocked when scope locked to platform/internal/handlers/. - session-start-context — auto-loads last 20 cron-learnings, freeze status, open-PR/issue counts as additionalContext at session start. Tested: emits valid SessionStart JSON. - post-edit-audit — appends every Edit/Write to .claude/audit.jsonl (gitignored). One-line records {ts, tool, file, ok}. Tested writes. - user-prompt-tag — injects context warnings when prompt mentions force-push, drop-table, "delete all", "push to main", etc. Tested: emits warning for "force push the fix to main". - subagent-stop-judge — off by default; touch .claude/judge-subagents to enable. When on, prompts orchestrator to verify subagent's last message addresses the original task. Cost-free MVP (no LLM call yet). All hooks are Python (jq isn't on the hook PATH on macOS — Python is). Shared helpers in _lib.py: read_input, deny_pretooluse, add_context, warn_to_stderr. ## settings.json — wires all 6 hooks Adds SessionStart, UserPromptSubmit, SubagentStop event handlers. Existing PreToolUse:Bash + PostToolUse:Edit chains gain the new hooks alongside the existing ones (check-inbox.sh, echo reminder). 
Adds @modelcontextprotocol/server-sequential-thinking MCP server for structured chain-of-thought scratchpad — useful when triaging multiple PRs in parallel without losing context. ## .claude/commands/triage.md — slash command shortcut Manual /triage runs the same flow as the c5074cd5 hourly cron, on demand. Saves ~4KB of prompt every invocation by pulling the cron prompt out of working memory. ## CLAUDE.md additions New "Agent operating rules (auto-loaded — read first)" section right after Ecosystem Context. Documents: - Cron / triage discipline (read learnings, treat docs PRs touching CLAUDE.md/PLAN.md as noteworthy, write per-tick reflections) - Table of all 6 hooks active in this repo - List of skills and how to invoke them - Standing rules (inviolable) consolidated for the agent This block auto-loads into every conversation context — free behavior change without me remembering to opt in. ## .gitignore audit.jsonl, freeze, judge-subagents, per-tick-reflections.md are all local operational state, never committed. ## Verification - echo '{"tool_input":{"command":"git push --force origin main"}}' | bash pre-bash-careful.sh → emits deny JSON ✓ - Same for git status (safe command) → empty output, exit 0 ✓ - pre-edit-freeze with .claude/freeze=platform/handlers/ blocks edits to PLAN.md, allows edits inside the locked path ✓ - post-edit-audit appends valid JSONL ✓ - session-start-context emits additionalContext with PR/issue counts ✓ - user-prompt-tag emits warning for "force push to main" prompt ✓ - python3 -c "json.load(open('.claude/settings.json'))" → valid ✓ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:00:35 -07:00
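The REFUSE / WARN / ALLOW behaviour that the pre-bash-careful hook is described as having can be approximated with a small classifier. The patterns below are assumptions written for this sketch, not the hook's real rule set; in particular, the actual hook scopes DROP TABLE to the prod schema, which this simplification ignores. Only the input shape (`tool_input.command`) comes from the verification commands in the commit.

```python
import json
import re

# Hypothetical rule tables; each entry is (pattern, human-readable reason).
REFUSE = [
    (re.compile(r"\bgit\s+push\b.*\s--force(?!-with-lease)\b.*\bmain\b"),
     "force-push to main"),
    (re.compile(r"\brm\s+-rf\s+/(\s|$)"), "rm -rf at root"),
    (re.compile(r"\bdrop\s+table\b", re.IGNORECASE), "DROP TABLE"),
]
WARN = [
    (re.compile(r"--force-with-lease\b"), "force-with-lease push"),
    (re.compile(r"\bgh\s+(pr|issue)\s+close\b"), "closing a PR or issue"),
]


def classify(hook_input: str):
    """Classify the bash command carried in a PreToolUse-style JSON payload."""
    command = json.loads(hook_input)["tool_input"]["command"]
    for pattern, reason in REFUSE:
        if pattern.search(command):
            return ("refuse", reason)
    for pattern, reason in WARN:
        if pattern.search(command):
            return ("warn", reason)
    return ("allow", "")


destructive = classify('{"tool_input":{"command":"git push --force origin main"}}')
safe = classify('{"tool_input":{"command":"git status"}}')
lease = classify('{"tool_input":{"command":"git push --force-with-lease origin feat"}}')
```

Checking the refuse list before the warn list means a command that matches both is always blocked rather than merely flagged.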
Hongming Wang	239883920e	feat(.claude): 5 gstack-inspired skills + cron upgrades Research on garrytan/gstack surfaced 5 patterns worth importing into our cron / agent setup. These are skills, not platform code — they guide how the cron and our own subagents work, not what the platform does at runtime. ## New skills 1. cross-vendor-review — adversarial second-model review for noteworthy PRs (auth, billing, data deletion, migrations). Catches the 15-30% of bugs single-model review misses. Inspired by gstack's /codex. 2. careful-mode — REFUSE/WARN/ALLOW lists for destructive commands. Refuses force-push to main, blocks merging draft PRs, prevents rm -rf outside scratch dirs. Inspired by gstack's /careful + /freeze. 3. cron-learnings — per-project JSONL of operational learnings appended at the end of every tick, replayed at the start of the next. Stops the cron from re-litigating decided issues. Inspired by gstack's /learn. 4. cron-retro — weekly retrospective auto-posted as a GitHub issue. Sunday 23:07 local. Tracks PR count, time-to-merge, gate failure trends, code-review severity over time. Inspired by gstack's /retro. 5. llm-judge — cheap LLM-as-judge eval to catch "agent shipped the wrong thing" — the failure mode unit tests miss. Plug into issue-pickup pipeline so worker-agent draft PRs get scored before being marked ready. Inspired by gstack's tier-3 test infra. 
## Cron updates (session-only, c5074cd5 + 060d136c) - Hourly triage cron now opens with careful-mode activation + cron-learnings replay (Step 0) - code-review skill on every PR being considered for merge (Step 2 supplement A — already present, formalized) - cross-vendor-review on noteworthy PRs (Step 2 supplement B — new) - llm-judge on issue-pickup draft PRs before marking ready (Step 4) - Status report now includes cross-vendor pass/fail and llm-judge scores (Step 5) - End-of-tick cron-learnings append (Step 5) - New weekly cron at Sun 23:07 invokes the cron-retro skill ## What we did NOT take from gstack - Their browser fork — not our product - The 23 named roles — we have agent role templates already - Bun toolchain — adds yet another runtime to our stack - /design-shotgun and design-tool variants — we're not a design tool - /document-release — our update-docs skill already covers this See PR description for full research notes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 11:36:55 -07:00
Hongming Wang	34fb3fd471	feat(provisioner): configurable per-tier memory/CPU limits (#14) Resolves #14. ApplyTierConfig now reads TIER{2,3,4}_MEMORY_MB and TIER{2,3,4}_CPU_SHARES env vars, falling back to the compiled defaults agreed in the issue: - T2: 512 MiB / 1024 shares (1 CPU) — unchanged baseline - T3: 2048 MiB / 2048 shares (2 CPU) — new cap (previously uncapped) - T4: 4096 MiB / 4096 shares (4 CPU) — new cap (previously uncapped) CPU_SHARES follows Docker's 1024 = 1 CPU convention; internally the value is translated to NanoCPUs for a hard allocation so behaviour remains deterministic across hosts. Malformed or non-positive env values silently fall back to the default. Behaviour change note: T3 and T4 previously had no explicit cap. Operators who relied on unlimited can set very large TIERn_MEMORY_MB / TIERn_CPU_SHARES values; a follow-up can add unset-means-unlimited semantics if required. Tests: - TestGetTierMemoryMB_DefaultsMatchLegacy - TestGetTierMemoryMB_EnvOverride (covers malformed + zero fallback) - TestGetTierCPUShares_EnvOverride - TestApplyTierConfig_T3_UsesEnvOverride (wiring) - TestApplyTierConfig_T3_DefaultCap (documents the new cap) Docs: .env.example section + CLAUDE.md platform env-vars list updated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:49:37 -07:00
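The env lookup with fallback and the shares-to-NanoCPUs translation described above can be sketched as follows. The function names are hypothetical Python stand-ins for the Go helpers; the defaults, the env var names, and the 1024 shares = 1 CPU convention are taken from the commit message, and 1 CPU is taken to be 1,000,000,000 NanoCPUs as in Docker.

```python
import os

DEFAULT_MEMORY_MB = {2: 512, 3: 2048, 4: 4096}
DEFAULT_CPU_SHARES = {2: 1024, 3: 2048, 4: 4096}


def _positive_int(raw, default: int) -> int:
    """Parse an env value; malformed or non-positive input falls back silently."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return default
    return value if value > 0 else default


def tier_memory_mb(tier: int, env=os.environ) -> int:
    return _positive_int(env.get(f"TIER{tier}_MEMORY_MB"), DEFAULT_MEMORY_MB[tier])


def tier_nano_cpus(tier: int, env=os.environ) -> int:
    shares = _positive_int(env.get(f"TIER{tier}_CPU_SHARES"), DEFAULT_CPU_SHARES[tier])
    # 1024 shares = 1 CPU = 1_000_000_000 NanoCPUs.
    return shares * 1_000_000_000 // 1024


default_t3 = (tier_memory_mb(3, {}), tier_nano_cpus(3, {}))
overridden = tier_memory_mb(3, {"TIER3_MEMORY_MB": "8192"})
malformed = tier_memory_mb(3, {"TIER3_MEMORY_MB": "lots"})
```

Passing the environment as a parameter keeps the helpers testable without mutating `os.environ`; an operator who wants an effectively uncapped tier would set a very large value, as the commit notes.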
Hongming Wang	4ff65b82c7	fix(provisioner): preserve Claude session directory across restart (#12) Resolves #12. The claude-code SDK stores conversations in /root/.claude/sessions/ and Postgres tracks current_session_id, but the container filesystem was recreated on every restart — next agent message failed with "No conversation found with session ID: <uuid>". Add a per-workspace named Docker volume (ws-<id>-claude-sessions) mounted read-write at /root/.claude/sessions. Gated by runtime=claude-code so other runtimes don't pay for a path they don't use. Volume is cleaned up in RemoveVolume alongside the config volume. Two opt-outs discard the volume before restart for a fresh session: - env WORKSPACE_RESET_SESSION=1 on the container - POST /workspaces/:id/restart?reset=true (or {"reset": true} body) Plumbed via new ResetClaudeSession field on WorkspaceConfig + provisionWorkspaceOpts helper so the flag stays request-scoped (not persisted on CreateWorkspacePayload). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:45:30 -07:00
Hongming Wang	a16a25b1f1	docs: sync documentation with 2026-04-14 tick-3 merges (#53, #54, #55) - docs/edit-history/2026-04-14.md: append tick-3 section covering the admin test-token route (#53), the prior-tick doc-sync PR (#54), and the hermes required_env alignment (#55). Record measured test counts (Go +4 for the TestAdminTestToken_* quartet). - CLAUDE.md: bump Go test count 695 → 699 with a note pointing at the new quartet. Route-table row and env-var mentions for the admin route already landed with #53; verified on main. - .env.example: add MOLECULE_ENABLE_TEST_TOKENS with a comment about the prod-hidden default. Closes the code-review doc-sync flag from #53 (var was in CLAUDE.md but missing from .env.example). No PLAN.md / README.md / README.zh-CN.md update needed — none of the three merges expose a user-visible surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:37:42 -07:00
Hongming Wang	1ab65ca736	Merge pull request #53 from Molecule-AI/feat/issue-6-admin-test-token feat(platform): GET /admin/workspaces/:id/test-token for E2E (#6)	2026-04-14 10:33:59 -07:00
Hongming Wang	585fcfc1ea	Merge pull request #55 from Molecule-AI/fix/hermes-config-env-mismatch fix(hermes): align config.yaml required_env with executor (HERMES_API_KEY)	2026-04-14 10:29:06 -07:00
Hongming Wang	1adfb77be2	Merge pull request #54 from Molecule-AI/docs/sync-2026-04-14-tick-2 docs: sync documentation with 2026-04-14 tick-2 merges (#50, #52)	2026-04-14 10:28:43 -07:00
Hongming Wang	2e2261ab9c	fix(hermes): align config.yaml required_env with executor (HERMES_API_KEY) The hermes config required NOUS_API_KEY but the executor (workspace-template/adapters/hermes/executor.py from PR #49) checks HERMES_API_KEY and OPENROUTER_API_KEY. A workspace created from this template would have the provisioner block on a missing NOUS_API_KEY even when HERMES_API_KEY was set, or pass provisioning but fail at executor init. .env.example already documents HERMES_API_KEY. Fix: rename the required_env entry to HERMES_API_KEY and update the comments to match the executor's actual fallback order (HERMES_API_KEY first, OPENROUTER_API_KEY second). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:19:55 -07:00
Hongming Wang	8f87acc9df	docs: sync documentation with 2026-04-14 tick-2 merges (#50, #52) Two template-only merges this tick, both editing org-templates/molecule-dev/org.yaml: - #50 PM system prompt — audit summaries are dispatch triggers - #52 UIUX Designer cron installs playwright-chromium (closes #23) No code / env / API / test-count drift. Only docs/edit-history/2026-04-14.md created. CLAUDE.md, PLAN.md, README.md, README.zh-CN.md intentionally untouched. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 09:37:24 -07:00
Hongming Wang	496dee8e13	feat(platform): GET /admin/workspaces/:id/test-token for E2E (#6) Adds a gated admin endpoint that mints a fresh workspace bearer token on demand, eliminating the register-race currently used by test_comprehensive_e2e.sh (PR #5 follow-up). - New handler admin_test_token.go: returns 404 unless MOLECULE_ENV != production or MOLECULE_ENABLE_TEST_TOKENS=1. Hides route existence in prod (404 not 403). - Mints via wsauth.IssueToken; logs at INFO without the token itself. - Verifies workspace exists before minting (missing -> 404, never 500). - Tests cover prod-hidden, enable-flag-overrides-prod, missing workspace, and happy-path + token-validates round trip. - tests/e2e/_lib.sh gains e2e_mint_test_token helper for downstream adoption. - CLAUDE.md updated with route + env vars. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 09:35:26 -07:00
Hongming Wang	b99495b2df	Merge pull request #52 from Molecule-AI/chore/template-uiux-chromium-recipe closes #23	2026-04-14 09:32:16 -07:00
Hongming Wang	018eb7f4fd	Merge pull request #50 from Molecule-AI/chore/template-pm-dispatcher chore(template): PM system prompt — treat audit summaries as dispatch triggers, not FYIs	2026-04-14 09:32:08 -07:00
rabbitblood	65b01cef83	chore(template): bake working Chromium recipe into UIUX Designer cron (closes #23) UIUX Designer figured out at runtime (Run 6, 2026-04-14) how to get Playwright working without a Dockerfile change: LD_LIBRARY_PATH="/home/agent/.cache/ms-playwright/firefox-1509/firefox" node script.cjs Using @sparticuz/chromium + puppeteer-core, and borrowing the NSS/NSPR libs bundled with Playwright's Firefox binary. This resolves every missing lib on the container without needing apt-get or image rebuild. Agent memory persists the trick across restarts, but a fresh org-template import (new user) would have to rediscover it. Baking the recipe into the cron prompt so every clone inherits day-one screenshot capability. Evidence it works (from Run 6 memory): - 14 screenshots captured and vision-analysed - Found 2 new criticals (C4 onboarding-guide a11y, C5 settings panel white refresh button confirmed in production) that only surface via live DOM - Full user-flow coverage: home → create → settings → help → templates → mobile 375 → responsive 1280 Replaces the previous "best-effort + fall back to HTML" wording with a specific, proven command path. Falls back on HTML only if the browser genuinely won't launch (e.g. host.docker.internal:3000 down). Template-level fix; the general platform-level path would be to ship these libs in the workspace-template image directly (future Dockerfile change — out of scope here).	2026-04-14 09:01:03 -07:00
Hongming Wang	cd4eb9c590	Merge pull request #49 from Molecule-AI/feat/hermes-pr2 feat(hermes): implement create_executor() with HERMES_API_KEY / OPENROUTER_API_KEY fallback + smoke tests	2026-04-14 08:16:15 -07:00
rabbitblood	4ac6cdb293	chore(template): PM system prompt — treat audit summaries as dispatch triggers, not FYIs Observed 2026-04-14 morning: audit crons (Security, UIUX, QA) were flowing messages into PM per the PR #26 contract, but PM stopped sub-delegating to Dev Lead ~10 hours ago. Meanwhile audits started opening PRs directly (bypassing Dev Lead), and Dev Lead / BE / FE / DevOps / QA sat idle for 17+ maintenance cycles despite PRs continuing to land. Root cause: PM's system prompt defined delegation behavior for "tasks from CEO" but didn't explicitly treat audit summaries as tasks. PM was reading "audit of SHA X, filed issue #N, top recommendation: fix Y" as a status report and committing it to memory without triggering the dispatch chain. Adds a dedicated "Audit Routing" section to PM's prompt that: - Treats every audit summary with open issue numbers as a dispatch trigger - Specifies routing by category (security→BE, ui→FE, infra→DevOps, qa→QA) - Requires parallel `delegate_task_async` when issues span categories - Makes clean-cycle acks the only no-op case This turns PM from a receptionist into a dispatcher — which was the original intent of the audit-routing contract in #26. Aligns with the north-star goal (keep the team running 24/7): dead idle windows when audits had live issue numbers is a defect in orchestration, not a quiet period.	2026-04-14 08:13:42 -07:00
Hongming Wang	fad03b7db3	Merge pull request #48 from Molecule-AI/fix/issue-17-rogue-restart-loop fix(provisioner): stop rogue config-missing restart loop (#17)	2026-04-14 08:12:30 -07:00

143 Commits