molecule-core

Author	SHA1	Message	Date
Hongming Wang	730bcc4e9f	docs(plan): drop stale sequential refs #64-#67 from Backlog items 11-14 Backlog items 11-14 used sequential enumeration (#64/#65/#66/#67) as intra-doc bookkeeping. Those numbers now collide with actual merged PRs and open issues with completely different scopes: - PR #64 = auto-refresh global_secrets (not "delegations list") - PR #65 = restart context Layer 1 (not "per-agent repo access") - Issue #66 = restart_prompt Layer 2 (not "SDK swallows stderr") - PR #67 = docs sync tick-4 (not "MCP localhost default") Strip the misleading refs and add a footnote explaining the cleanup. If/when any of these items get prioritized, file real GitHub issues. Tracked in cron-learnings tick-3 entry. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 13:05:08 -07:00
Hongming Wang	b9b96c9cff	Merge pull request #67 from Molecule-AI/docs/sync-2026-04-14-tick-4 docs: sync documentation with 2026-04-14 evening-tick merges (#63, #64, #65)	2026-04-14 13:03:18 -07:00
Hongming Wang	2fa6f7c6cd	docs: sync documentation with 2026-04-14 evening-tick merges (#63, #64, #65) - edit-history/2026-04-14.md: append tick-4 section covering the 12 modular guardrail plugins (#63), global-secrets auto-restart fan-out (#64, fixes issue #15), and synthetic restart-context A2A message (#65, fixes issue #19 Layer 1; Layer 2 deferred to issue #66). - CLAUDE.md: bump Go test count 699 -> 726 (measured); note global secrets auto-restart on SetGlobal/DeleteGlobal in the route table; add Workspace Lifecycle paragraph for the restart-context message and its system:restart-context caller prefix. - PLAN.md: bump Go test count in the coverage table; record issues #15 and #19 Layer 1 as launched; add new Backlog entry for the Layer 2 follow-up (issue #66). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:54:04 -07:00
Hongming Wang	383582fbbf	Merge pull request #64 from Molecule-AI/fix/issue-15-refresh-oauth-on-restart fix(secrets): auto-refresh global_secrets on workspace restart (#15)	2026-04-14 12:49:19 -07:00
Hongming Wang	3ea8cda5b0	Merge pull request #65 from Molecule-AI/fix/issue-19-restart-context-layer1 feat(platform): inject restart context system message (#19 Layer 1)	2026-04-14 12:48:19 -07:00
Hongming Wang	8b896b1a56	feat(plugins): split guardrails into 12 modular plugins (#63) Noteworthy: large-addition (+1601 lines, 12 new plugins) + modifies core AgentskillsAdaptor (SDK + runtime copies, drift-guarded). All 7 gates pass, 0 critical findings. Cross-vendor review skipped (tool unavailable).	2026-04-14 12:47:24 -07:00
Hongming Wang	c4240e32c1	feat(platform): inject restart context system message (#19 Layer 1) After a workspace restart (HTTP /restart or programmatic RestartByID) and re-registration, the platform sends a synthetic A2A message/send to the workspace containing: - restart timestamp - previous session end timestamp + human duration - env-var keys now available (keys only — never values) The message is rendered in the format proposed in #19 and marked with metadata.kind=restart_context so agents can detect and handle it specifically if they choose. Skip path: if the workspace doesn't re-register within 30s, log and drop. The Restart HTTP response is unaffected by delivery success. Layer 2 (user-defined restart_prompt via config.yaml / org.yaml) is deferred — tracked as a separate follow-up issue. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:41:01 -07:00
Hongming Wang	e658f86c08	fix(secrets): auto-restart workspaces on global secret change (#15) Global secrets (e.g. CLAUDE_CODE_OAUTH_TOKEN) are injected as container env vars at Start() time. Until now, rotating one only propagated to a workspace on the next full restart-from-zero, which manual ops had to drive via a `POST /workspaces/:id/restart` loop. Tier-3 Claude Code agents hit the stale-token path first and surfaced as 401s inside the SDK. Restart-time re-read of global_secrets + workspace_secrets was already correct in `provisionWorkspaceOpts` — the missing piece was the trigger. SetGlobal / DeleteGlobal now enqueue RestartByID for every non-paused, non-removed, non-external workspace that does NOT shadow the key with a workspace-level override. Matches the existing behaviour of workspace-scoped `Set` / `Delete`. Adds two sqlmock-backed tests exercising both branches. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:39:00 -07:00
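The fan-out rule in e658f86c08 can be pictured as a simple filter. This is a minimal Python sketch, not the platform's Go code: the `Workspace` dataclass, its field names, and `workspaces_to_restart` are hypothetical stand-ins, and only the selection criteria (non-paused, non-removed, non-external, no workspace-level override of the key) come from the commit message.

```python
from dataclasses import dataclass, field


@dataclass
class Workspace:
    # Hypothetical shape; the real schema is not shown in the log.
    id: str
    status: str = "running"          # e.g. "running", "paused", "removed"
    external: bool = False
    secret_overrides: set = field(default_factory=set)


def workspaces_to_restart(workspaces, changed_key):
    """Return ids of workspaces that should be restarted after a global
    secret named `changed_key` is set or deleted."""
    return [
        ws.id
        for ws in workspaces
        if ws.status not in ("paused", "removed")   # skip paused / removed
        and not ws.external                          # skip external workspaces
        and changed_key not in ws.secret_overrides   # skip shadowed keys
    ]
```

A workspace that shadows the key keeps its own value, so restarting it would change nothing; that is presumably why it is excluded.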
Hongming Wang	d0eaa814de	fix(gate-4): add missing import json in sdk/python/molecule_plugin/builtins.py PR #63 code-review caught that the SDK copy of AgentskillsAdaptor uses json.loads/json.dumps in _merge_settings_fragment + _rewrite_hook_paths + _deep_merge_hooks but never imports json. The runtime copy (workspace-template/plugins_registry/builtins.py) already has the import; this brings the SDK side in line. Bug surfaces only when a plugin shipping settings-fragment.json (any of the 5 hook plugins or 2 workflow plugins in this PR) is installed through the SDK path — would NameError on the first json.loads call. The drift test catches behavioral drift via fixture install scenarios but not import-level drift in helper code paths. Verified: json is now importable (`hasattr(molecule_plugin.builtins, 'json')` → True), drift test still passes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:29:32 -07:00
Hongming Wang	9c7f57688c	Merge pull request #57 from Molecule-AI/fix/issue-12-preserve-claude-sessions fix(provisioner): preserve Claude session directory across restart (#12)	2026-04-14 12:26:12 -07:00
Hongming Wang	d0c5626df1	Merge pull request #61 from Molecule-AI/feat/claude-hooks-upgrade feat(.claude): ambient hooks + sequential-thinking MCP + /triage command	2026-04-14 12:25:54 -07:00
Hongming Wang	bab8110d34	Merge pull request #60 from Molecule-AI/feat/gstack-inspired-cron-upgrades feat(.claude): 5 gstack-inspired skills + cron upgrades	2026-04-14 12:25:19 -07:00
Hongming Wang	18a5d1a538	Merge pull request #58 from Molecule-AI/feat/issue-14-configurable-tier-limits noteworthy: behavior-change — T3/T4 caps introduced where previously unlimited; defaults match issue #14 spec; operators can override via env	2026-04-14 12:25:00 -07:00
Hongming Wang	2e873cc2e8	docs(plan): add Phase 32 — Cloud SaaS launch roadmap (#59) New section before the Temporal footnote capturing the gap analysis between today's self-hosted posture and a multi-tenant cloud SaaS: - Tier 1 blockers: multi-tenancy (org_id everywhere), WorkOS AuthKit for human auth, Fly Machines for container isolation, Stripe billing, per-org quotas, managed Postgres/Redis (Neon/Upstash), KMS-backed secrets, migrations out of app boot - Tier 1 follow-ups: Sentry + Grafana, per-org rate limiting, Cloudflare, onboarding flow, transactional email, admin panel, ToS/DPA - Tier 2 tech-stack upgrades (non-blocking): pgx/v5 + sqlc, River for platform async (NOT Temporal — that stays in workspace-template as an agent tool), TanStack Query, Turbopack, uv for Python, Python MCP client, shadcn/ui CLI - Tier 3 explicitly NOT doing: Kubernetes, ORMs, framework swaps, build-auth-yourself, canvas library swaps — with reasons - Tier 4 compliance (post-revenue): SOC 2, status page, staging, canary deploys, load testing - Success criteria: sign-up-to-first-message < 5 min, tenant isolation red-teamed, Fly Machines cost documented, Stripe end-to-end, first paying design partner Derived from a tech-stack audit run against the 2026 best-in-class landscape (pgx won Postgres, River eats Temporal's small-company slot, WorkOS beats Clerk for per-org SSO, Fly Machines is the only isolation option without an SRE). Co-authored-by: Hongming Wang <hongmingwang.rabbit@users.noreply.github.com> Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:24:59 -07:00
Hongming Wang	b123294cf2	Merge pull request #56 from Molecule-AI/docs/sync-2026-04-14-tick-3 docs: sync documentation with 2026-04-14 tick-3 merges (#53, #54, #55)	2026-04-14 12:24:16 -07:00
Hongming Wang	90a513d1d0	feat(plugins): split guardrails into 12 modular plugins Replaces the proposed monolithic molecule-guardrails plugin with 12 single-purpose plugins users can install à la carte. Powered by a small extension to the AgentskillsAdaptor base class so any plugin can ship hooks/, commands/, and a settings-fragment.json without writing a custom adapter. ## Base adapter changes workspace-template/plugins_registry/builtins.py + sdk/python/molecule_plugin/builtins.py (both copies — drift-tested): - New _install_claude_layer() helper called at the end of install() - Conditionally copies hooks/ → /configs/.claude/hooks/ (preserving exec bit) - Conditionally copies commands/.md → /configs/.claude/commands/ - Conditionally merges settings-fragment.json into /configs/.claude/settings.json with ${CLAUDE_DIR} placeholder rewritten to the workspace's absolute install path. Existing user hooks are preserved (deep-merge by event name). - All steps no-op when the plugin doesn't ship the corresponding files, so existing skill+rule plugins (molecule-dev, superpowers, ecc, browser-automation) are unchanged. Drift test (tests/test_plugins_builtins_drift.py) still passes. 
## 12 new plugins Hook plugins (ambient enforcement): - molecule-careful-bash — refuses destructive bash; ships careful-mode skill - molecule-freeze-scope — locks edits via .claude/freeze - molecule-audit-trail — appends every Edit/Write to audit.jsonl - molecule-session-context — auto-loads cron-learnings at session start - molecule-prompt-watchdog — injects warnings on destructive prompt keywords Skill plugins (on-demand): - molecule-skill-code-review — 16-criteria multi-axis review - molecule-skill-cross-vendor-review — adversarial second-model review - molecule-skill-llm-judge — deliverable-vs-request scoring - molecule-skill-update-docs — post-merge doc sync - molecule-skill-cron-learnings — operational-memory JSONL format Workflow plugins (slash commands): - molecule-workflow-triage — /triage full PR-triage cycle - molecule-workflow-retro — /retro + cron-retro skill, weekly retrospective Each ships only what it needs — most have just plugin.yaml + skills/ or hooks/ + adapter (one-line stub: `from plugins_registry.builtins import AgentskillsAdaptor as Adaptor`). Total ~120 files but each plugin is small and self-contained. ## Verification - python3 -m molecule_plugin validate plugins/molecule- → all 13 valid (12 new + pre-existing molecule-dev) - End-to-end install smoke test on representative samples: hook plugin (molecule-careful-bash), skill-only plugin (molecule-skill-code-review), workflow plugin (molecule-workflow-triage). All produce expected /configs/ tree, settings.json paths rewritten, exec bits preserved, zero warnings. - workspace-template pytest tests/test_plugins_builtins_drift.py → passes (SDK + runtime stay in sync). ## CLAUDE.md repo-doc updated Lists all 12 new plugins under the existing Plugins section, organized by category (hook / skill / workflow). Each entry one line, recommend- together hints where dependencies make sense. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:20:04 -07:00
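The settings-fragment merge described in 90a513d1d0 (placeholder rewrite plus deep-merge by event name, preserving existing user hooks) might look roughly like the sketch below. The function names and the exact shape of `settings.json` are assumptions for illustration; the real helpers are `_merge_settings_fragment`, `_rewrite_hook_paths`, and `_deep_merge_hooks`, whose bodies are not shown in this log.

```python
import copy

PLACEHOLDER = "${CLAUDE_DIR}"


def rewrite_placeholder(node, claude_dir):
    """Recursively replace ${CLAUDE_DIR} in every string of a JSON-like tree."""
    if isinstance(node, str):
        return node.replace(PLACEHOLDER, claude_dir)
    if isinstance(node, list):
        return [rewrite_placeholder(item, claude_dir) for item in node]
    if isinstance(node, dict):
        return {key: rewrite_placeholder(val, claude_dir) for key, val in node.items()}
    return node


def merge_hooks(settings, fragment):
    """Merge fragment["hooks"] into a copy of settings, grouped by event name.

    Existing entries are kept; fragment entries are appended unless an
    identical entry is already present, so re-installing stays idempotent.
    """
    merged = copy.deepcopy(settings)
    hooks = merged.setdefault("hooks", {})
    for event, entries in fragment.get("hooks", {}).items():
        existing = hooks.setdefault(event, [])
        for entry in entries:
            if entry not in existing:
                existing.append(copy.deepcopy(entry))
    return merged
```

Skipping identical entries is a design choice of this sketch rather than something the commit states; it keeps a repeated install from stacking duplicate hooks.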
Hongming Wang	3f8eb7406f	feat(.claude): ambient hooks + sequential-thinking MCP + /triage command Skills are opt-in (I have to remember to invoke them). Hooks are ambient — they fire on every matching event automatically. This PR moves the careful-mode and learnings discipline from "doc I should read" to "harness-enforced behavior I cannot bypass". ## 6 new hooks (.claude/hooks/) - pre-bash-careful — REFUSES git push --force to main, rm -rf at root, DROP TABLE against prod schema. WARNs on force-with-lease, gh pr/ issue close. Tested: blocks the destructive case, allows safe ones. - pre-edit-freeze — implements /freeze. When .claude/freeze contains a path glob, edits outside it are denied. Tested: edits to PLAN.md blocked when scope locked to platform/internal/handlers/. - session-start-context — auto-loads last 20 cron-learnings, freeze status, open-PR/issue counts as additionalContext at session start. Tested: emits valid SessionStart JSON. - post-edit-audit — appends every Edit/Write to .claude/audit.jsonl (gitignored). One-line records {ts, tool, file, ok}. Tested writes. - user-prompt-tag — injects context warnings when prompt mentions force-push, drop-table, "delete all", "push to main", etc. Tested: emits warning for "force push the fix to main". - subagent-stop-judge — off by default; touch .claude/judge-subagents to enable. When on, prompts orchestrator to verify subagent's last message addresses the original task. Cost-free MVP (no LLM call yet). All hooks are Python (jq isn't on the hook PATH on macOS — Python is). Shared helpers in _lib.py: read_input, deny_pretooluse, add_context, warn_to_stderr. ## settings.json — wires all 6 hooks Adds SessionStart, UserPromptSubmit, SubagentStop event handlers. Existing PreToolUse:Bash + PostToolUse:Edit chains gain the new hooks alongside the existing ones (check-inbox.sh, echo reminder). 
Adds @modelcontextprotocol/server-sequential-thinking MCP server for structured chain-of-thought scratchpad — useful when triaging multiple PRs in parallel without losing context. ## .claude/commands/triage.md — slash command shortcut Manual /triage runs the same flow as the c5074cd5 hourly cron, on demand. Saves ~4KB of prompt every invocation by pulling the cron prompt out of working memory. ## CLAUDE.md additions New "Agent operating rules (auto-loaded — read first)" section right after Ecosystem Context. Documents: - Cron / triage discipline (read learnings, treat docs PRs touching CLAUDE.md/PLAN.md as noteworthy, write per-tick reflections) - Table of all 6 hooks active in this repo - List of skills and how to invoke them - Standing rules (inviolable) consolidated for the agent This block auto-loads into every conversation context — free behavior change without me remembering to opt in. ## .gitignore audit.jsonl, freeze, judge-subagents, per-tick-reflections.md are all local operational state, never committed. ## Verification - echo '{"tool_input":{"command":"git push --force origin main"}}' \| bash pre-bash-careful.sh → emits deny JSON ✓ - Same for git status (safe command) → empty output, exit 0 ✓ - pre-edit-freeze with .claude/freeze=platform/handlers/ blocks edits to PLAN.md, allows edits inside the locked path ✓ - post-edit-audit appends valid JSONL ✓ - session-start-context emits additionalContext with PR/issue counts ✓ - user-prompt-tag emits warning for "force push to main" prompt ✓ - python3 -c "json.load(open('.claude/settings.json'))" → valid ✓ Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 12:00:35 -07:00
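The REFUSE/WARN split that pre-bash-careful applies (3f8eb7406f) can be approximated with a small classifier. This is a hedged sketch only: the regular expressions and the `classify` function are illustrative assumptions, the real hook's patterns are not shown in the log, and the `DROP TABLE` check here ignores the "against prod schema" qualifier.

```python
import re

# Patterns that should be refused outright (simplified assumptions).
REFUSE = [
    # git push --force ... main (but not --force-with-lease)
    re.compile(r"\bgit\s+push\b(?=.*--force(?!-with-lease))(?=.*\bmain\b)"),
    # rm with both -r and -f flags, targeting the filesystem root
    re.compile(r"\brm\s+-(?=\w*r)(?=\w*f)\w+\s+/(?:\s|$)"),
    # DROP TABLE, any case
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
]

# Patterns that are allowed but deserve a warning.
WARN = [
    re.compile(r"--force-with-lease\b"),
    re.compile(r"\bgh\s+(?:pr|issue)\s+close\b"),
]


def classify(command):
    """Return "refuse", "warn", or "allow" for a shell command string."""
    if any(p.search(command) for p in REFUSE):
        return "refuse"
    if any(p.search(command) for p in WARN):
        return "warn"
    return "allow"
```

In the actual hook a "refuse" result corresponds to emitting deny JSON, and a safe command produces empty output with exit code 0, as the Verification notes describe.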
Hongming Wang	9d914193d2	feat(.claude): 5 gstack-inspired skills + cron upgrades Research on garrytan/gstack surfaced 5 patterns worth importing into our cron / agent setup. These are skills, not platform code — they guide how the cron and our own subagents work, not what the platform does at runtime. ## New skills 1. cross-vendor-review — adversarial second-model review for noteworthy PRs (auth, billing, data deletion, migrations). Catches the 15-30% of bugs single-model review misses. Inspired by gstack's /codex. 2. careful-mode — REFUSE/WARN/ALLOW lists for destructive commands. Refuses force-push to main, blocks merging draft PRs, prevents rm -rf outside scratch dirs. Inspired by gstack's /careful + /freeze. 3. cron-learnings — per-project JSONL of operational learnings appended at the end of every tick, replayed at the start of the next. Stops the cron from re-litigating decided issues. Inspired by gstack's /learn. 4. cron-retro — weekly retrospective auto-posted as a GitHub issue. Sunday 23:07 local. Tracks PR count, time-to-merge, gate failure trends, code-review severity over time. Inspired by gstack's /retro. 5. llm-judge — cheap LLM-as-judge eval to catch "agent shipped the wrong thing" — the failure mode unit tests miss. Plug into issue-pickup pipeline so worker-agent draft PRs get scored before being marked ready. Inspired by gstack's tier-3 test infra. 
## Cron updates (session-only, c5074cd5 + 060d136c) - Hourly triage cron now opens with careful-mode activation + cron-learnings replay (Step 0) - code-review skill on every PR being considered for merge (Step 2 supplement A — already present, formalized) - cross-vendor-review on noteworthy PRs (Step 2 supplement B — new) - llm-judge on issue-pickup draft PRs before marking ready (Step 4) - Status report now includes cross-vendor pass/fail and llm-judge scores (Step 5) - End-of-tick cron-learnings append (Step 5) - New weekly cron at Sun 23:07 invokes the cron-retro skill ## What we did NOT take from gstack - Their browser fork — not our product - The 23 named roles — we have agent role templates already - Bun toolchain — adds yet another runtime to our stack - /design-shotgun and design-tool variants — we're not a design tool - /document-release — our update-docs skill already covers this See PR description for full research notes. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 11:36:55 -07:00
Hongming Wang	479f1776a8	feat(provisioner): configurable per-tier memory/CPU limits (#14) Resolves #14. ApplyTierConfig now reads TIER{2,3,4}_MEMORY_MB and TIER{2,3,4}_CPU_SHARES env vars, falling back to the compiled defaults agreed in the issue: - T2: 512 MiB / 1024 shares (1 CPU) — unchanged baseline - T3: 2048 MiB / 2048 shares (2 CPU) — new cap (previously uncapped) - T4: 4096 MiB / 4096 shares (4 CPU) — new cap (previously uncapped) CPU_SHARES follows Docker's 1024 = 1 CPU convention; internally the value is translated to NanoCPUs for a hard allocation so behaviour remains deterministic across hosts. Malformed or non-positive env values silently fall back to the default. Behaviour change note: T3 and T4 previously had no explicit cap. Operators who relied on unlimited can set very large TIERn_MEMORY_MB / TIERn_CPU_SHARES values; a follow-up can add unset-means-unlimited semantics if required. Tests: - TestGetTierMemoryMB_DefaultsMatchLegacy - TestGetTierMemoryMB_EnvOverride (covers malformed + zero fallback) - TestGetTierCPUShares_EnvOverride - TestApplyTierConfig_T3_UsesEnvOverride (wiring) - TestApplyTierConfig_T3_DefaultCap (documents the new cap) Docs: .env.example section + CLAUDE.md platform env-vars list updated. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:49:37 -07:00
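The env-override-with-fallback behaviour from 479f1776a8 can be illustrated in a few lines of Python. The real implementation is Go (`ApplyTierConfig`); the helper names below are hypothetical, and the shares-to-NanoCPUs formula is an assumption derived from the stated "1024 shares = 1 CPU" convention together with Docker's definition of one CPU as 10^9 NanoCPUs.

```python
import os

# Compiled defaults quoted in the commit message.
DEFAULT_MEMORY_MB = {2: 512, 3: 2048, 4: 4096}
DEFAULT_CPU_SHARES = {2: 1024, 3: 2048, 4: 4096}


def _positive_int_env(name, default, env=None):
    """Read a positive integer from env; malformed or <= 0 falls back silently."""
    env = os.environ if env is None else env
    try:
        value = int(env.get(name, ""))
    except ValueError:
        return default
    return value if value > 0 else default


def tier_memory_mb(tier, env=None):
    return _positive_int_env(f"TIER{tier}_MEMORY_MB", DEFAULT_MEMORY_MB[tier], env)


def tier_cpu_shares(tier, env=None):
    return _positive_int_env(f"TIER{tier}_CPU_SHARES", DEFAULT_CPU_SHARES[tier], env)


def shares_to_nano_cpus(shares):
    """Assumed translation: 1024 shares = 1 CPU = 1_000_000_000 NanoCPUs."""
    return shares * 1_000_000_000 // 1024
```

Passing `env` explicitly keeps the helpers testable without mutating the process environment; calling them without it reads `os.environ`.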
Hongming Wang	7ad3173c10	fix(provisioner): preserve Claude session directory across restart (#12) Resolves #12. The claude-code SDK stores conversations in /root/.claude/sessions/ and Postgres tracks current_session_id, but the container filesystem was recreated on every restart — next agent message failed with "No conversation found with session ID: <uuid>". Add a per-workspace named Docker volume (ws-<id>-claude-sessions) mounted read-write at /root/.claude/sessions. Gated by runtime=claude-code so other runtimes don't pay for a path they don't use. Volume is cleaned up in RemoveVolume alongside the config volume. Two opt-outs discard the volume before restart for a fresh session: - env WORKSPACE_RESET_SESSION=1 on the container - POST /workspaces/:id/restart?reset=true (or {"reset": true} body) Plumbed via new ResetClaudeSession field on WorkspaceConfig + provisionWorkspaceOpts helper so the flag stays request-scoped (not persisted on CreateWorkspacePayload). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:45:30 -07:00
Hongming Wang	dcf8a07887	docs: sync documentation with 2026-04-14 tick-3 merges (#53, #54, #55) - docs/edit-history/2026-04-14.md: append tick-3 section covering the admin test-token route (#53), the prior-tick doc-sync PR (#54), and the hermes required_env alignment (#55). Record measured test counts (Go +4 for the TestAdminTestToken_* quartet). - CLAUDE.md: bump Go test count 695 → 699 with a note pointing at the new quartet. Route-table row and env-var mentions for the admin route already landed with #53; verified on main. - .env.example: add MOLECULE_ENABLE_TEST_TOKENS with a comment about the prod-hidden default. Closes the code-review doc-sync flag from #53 (var was in CLAUDE.md but missing from .env.example). No PLAN.md / README.md / README.zh-CN.md update needed — none of the three merges expose a user-visible surface. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:37:42 -07:00
Hongming Wang	639c32045d	Merge pull request #53 from Molecule-AI/feat/issue-6-admin-test-token feat(platform): GET /admin/workspaces/:id/test-token for E2E (#6)	2026-04-14 10:33:59 -07:00
Hongming Wang	0485585031	Merge pull request #55 from Molecule-AI/fix/hermes-config-env-mismatch fix(hermes): align config.yaml required_env with executor (HERMES_API_KEY)	2026-04-14 10:29:06 -07:00
Hongming Wang	c9f0a915c1	Merge pull request #54 from Molecule-AI/docs/sync-2026-04-14-tick-2 docs: sync documentation with 2026-04-14 tick-2 merges (#50, #52)	2026-04-14 10:28:43 -07:00
Hongming Wang	fd9e603f29	fix(hermes): align config.yaml required_env with executor (HERMES_API_KEY) The hermes config required NOUS_API_KEY but the executor (workspace-template/adapters/hermes/executor.py from PR #49) checks HERMES_API_KEY and OPENROUTER_API_KEY. A workspace created from this template would have the provisioner block on a missing NOUS_API_KEY even when HERMES_API_KEY was set, or pass provisioning but fail at executor init. .env.example already documents HERMES_API_KEY. Fix: rename the required_env entry to HERMES_API_KEY and update the comments to match the executor's actual fallback order (HERMES_API_KEY first, OPENROUTER_API_KEY second). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 10:19:55 -07:00
Hongming Wang	35aa945164	docs: sync documentation with 2026-04-14 tick-2 merges (#50, #52) Two template-only merges this tick, both editing org-templates/molecule-dev/org.yaml: - #50 PM system prompt — audit summaries are dispatch triggers - #52 UIUX Designer cron installs playwright-chromium (closes #23) No code / env / API / test-count drift. Only docs/edit-history/2026-04-14.md created. CLAUDE.md, PLAN.md, README.md, README.zh-CN.md intentionally untouched. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 09:37:24 -07:00
Hongming Wang	0832f997f0	feat(platform): GET /admin/workspaces/:id/test-token for E2E (#6) Adds a gated admin endpoint that mints a fresh workspace bearer token on demand, eliminating the register-race currently used by test_comprehensive_e2e.sh (PR #5 follow-up). - New handler admin_test_token.go: returns 404 unless MOLECULE_ENV != production or MOLECULE_ENABLE_TEST_TOKENS=1. Hides route existence in prod (404 not 403). - Mints via wsauth.IssueToken; logs at INFO without the token itself. - Verifies workspace exists before minting (missing -> 404, never 500). - Tests cover prod-hidden, enable-flag-overrides-prod, missing workspace, and happy-path + token-validates round trip. - tests/e2e/_lib.sh gains e2e_mint_test_token helper for downstream adoption. - CLAUDE.md updated with route + env vars. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 09:35:26 -07:00
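The gating logic in 0832f997f0 reduces to two checks, sketched here in Python. The function names and the idea of returning a bare status code are illustrative assumptions; the real handler is `admin_test_token.go` and mints the token via `wsauth.IssueToken`.

```python
def tokens_route_enabled(env):
    """Route is live outside production, or when the enable flag is set."""
    return (
        env.get("MOLECULE_ENV") != "production"
        or env.get("MOLECULE_ENABLE_TEST_TOKENS") == "1"
    )


def admin_test_token_status(env, workspace_exists):
    """Return the HTTP status the endpoint would answer with (sketch only)."""
    if not tokens_route_enabled(env):
        return 404  # hide route existence in prod: 404, not 403
    if not workspace_exists:
        return 404  # missing workspace is a 404, never a 500
    return 200      # token would be minted here
```

Both failure paths return 404, so a caller in production cannot tell a disabled route from a missing workspace.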
Hongming Wang	347faab6df	Merge pull request #52 from Molecule-AI/chore/template-uiux-chromium-recipe closes #23	2026-04-14 09:32:16 -07:00
Hongming Wang	14fc30f87d	Merge pull request #50 from Molecule-AI/chore/template-pm-dispatcher chore(template): PM system prompt — treat audit summaries as dispatch triggers, not FYIs	2026-04-14 09:32:08 -07:00
rabbitblood	40158c3753	chore(template): bake working Chromium recipe into UIUX Designer cron (closes #23) UIUX Designer figured out at runtime (Run 6, 2026-04-14) how to get Playwright working without a Dockerfile change: LD_LIBRARY_PATH="/home/agent/.cache/ms-playwright/firefox-1509/firefox" node script.cjs Using @sparticuz/chromium + puppeteer-core, and borrowing the NSS/NSPR libs bundled with Playwright's Firefox binary. This resolves every missing lib on the container without needing apt-get or image rebuild. Agent memory persists the trick across restarts, but a fresh org-template import (new user) would have to rediscover it. Baking the recipe into the cron prompt so every clone inherits day-one screenshot capability. Evidence it works (from Run 6 memory): - 14 screenshots captured and vision-analysed - Found 2 new criticals (C4 onboarding-guide a11y, C5 settings panel white refresh button confirmed in production) that only surface via live DOM - Full user-flow coverage: home → create → settings → help → templates → mobile 375 → responsive 1280 Replaces the previous "best-effort + fall back to HTML" wording with a specific, proven command path. Falls back on HTML only if the browser genuinely won't launch (e.g. host.docker.internal:3000 down). Template-level fix; the general platform-level path would be to ship these libs in the workspace-template image directly (future Dockerfile change — out of scope here).	2026-04-14 09:01:03 -07:00
Hongming Wang	a2ea1b183b	Merge pull request #49 from Molecule-AI/feat/hermes-pr2 feat(hermes): implement create_executor() with HERMES_API_KEY / OPENROUTER_API_KEY fallback + smoke tests	2026-04-14 08:16:15 -07:00
rabbitblood	3beb09df03	chore(template): PM system prompt — treat audit summaries as dispatch triggers, not FYIs Observed 2026-04-14 morning: audit crons (Security, UIUX, QA) were flowing messages into PM per the PR #26 contract, but PM stopped sub-delegating to Dev Lead ~10 hours ago. Meanwhile audits started opening PRs directly (bypassing Dev Lead), and Dev Lead / BE / FE / DevOps / QA sat idle for 17+ maintenance cycles despite PRs continuing to land. Root cause: PM's system prompt defined delegation behavior for "tasks from CEO" but didn't explicitly treat audit summaries as tasks. PM was reading "audit of SHA X, filed issue #N, top recommendation: fix Y" as a status report and committing it to memory without triggering the dispatch chain. Adds a dedicated "Audit Routing" section to PM's prompt that: - Treats every audit summary with open issue numbers as a dispatch trigger - Specifies routing by category (security→BE, ui→FE, infra→DevOps, qa→QA) - Requires parallel `delegate_task_async` when issues span categories - Makes clean-cycle acks the only no-op case This turns PM from a receptionist into a dispatcher — which was the original intent of the audit-routing contract in #26. Aligns with the north-star goal (keep the team running 24/7): dead idle windows when audits had live issue numbers is a defect in orchestration, not a quiet period.	2026-04-14 08:13:42 -07:00
Hongming Wang	cc9f181e8d	Merge pull request #48 from Molecule-AI/fix/issue-17-rogue-restart-loop fix(provisioner): stop rogue config-missing restart loop (#17)	2026-04-14 08:12:30 -07:00
Hongming Wang	56068a7698	docs(hermes): document HERMES_API_KEY env var and runtime-table row Adds HERMES_API_KEY to .env.example with a cross-reference to the OPENROUTER_API_KEY fallback, and adds the hermes runtime row to the CLAUDE.md runtime table so the new adapter is discoverable alongside its siblings (langgraph, claude-code, openclaw, crewai, autogen, deepagents). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 08:11:37 -07:00
Hongming Wang	af54fe89de	Merge pull request #47 from Molecule-AI/fix/issue-13-workspace-chown fix(workspace): chown /workspace when root-owned bind mount (#13)	2026-04-14 08:10:58 -07:00
Hongming Wang	f7683e3adf	fix(provisioner): stop rogue config-missing restart loop (#17) Resolves #17. Part A: scripts/cleanup-rogue-workspaces.sh deletes workspaces whose id or name starts with known test placeholder prefixes (aaaaaaaa-, etc.) and force-removes the paired Docker container. Documented in tests/README.md. Part B: add a pre-flight check in provisionWorkspace() — when neither a template path nor in-memory configFiles supplies config.yaml, probe the existing named volume via a throwaway alpine container. If the volume lacks config.yaml, mark the workspace status='failed' with a clear last_sample_error instead of handing it to Docker's unless-stopped restart policy (which otherwise loops forever on FileNotFoundError). New pure helper provisioner.ValidateConfigSource + unit tests. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 07:32:58 -07:00
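The Part B pre-flight check in f7683e3adf amounts to a pure decision over three possible config sources. The sketch below is a Python approximation under stated assumptions: the signature, the parameter names, and the error text are hypothetical, and the volume probe (a throwaway alpine container in the real code) is represented by a boolean the caller supplies.

```python
def validate_config_source(template_path, config_files, volume_has_config):
    """Return None when some source supplies config.yaml, else an error string.

    template_path:     path to a template directory, or "" / None
    config_files:      in-memory mapping of filename -> contents, or None
    volume_has_config: result of probing the existing named volume
    """
    if template_path:
        return None
    if config_files and "config.yaml" in config_files:
        return None
    if volume_has_config:
        return None
    return "config.yaml missing: no template, no in-memory config, none in volume"
```

A non-`None` result is the point at which the workspace would be marked `failed` with a `last_sample_error`, rather than being started and left to the restart policy.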
Hongming Wang	cb47e89aa8	fix(workspace): recursive chown when /workspace bind mount is root-owned (#13) On Docker Desktop (macOS/Windows), host-path bind mounts often appear root-owned inside the container. The previous entrypoint only chowned /workspace top-level, so agents (uid 1000) still couldn't write to /workspace/repo/* — git clone, pip install, and file edits failed with EACCES and fell back to /tmp. Detect the root-owned-contents case by sampling the first entry; if it's root-owned, recursively chown the tree. On normal Linux Docker with matching uids this is a no-op, so the fast-startup path is preserved for the common case. Part B of the issue (private-repo initial_prompt clone) was addressed by PR #20. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-14 07:29:30 -07:00
Hongming Wang	5ab75532d0	Merge pull request #43 from Molecule-AI/fix/reduced-motion fix(a11y): prefers-reduced-motion WCAG 2.3.3 compliance	2026-04-14 07:20:19 -07:00
Hongming Wang	652fc31d9b	Merge pull request #45 from Molecule-AI/feat/zoom-to-team-shortcut feat(canvas): Z shortcut + help entry for double-click zoom-to-team	2026-04-14 07:19:23 -07:00
Hongming Wang	cfe1912997	Merge pull request #46 from Molecule-AI/fix/a2a-client-auth-headers fix(security): complete Phase 30.6 auth headers in a2a_client — fixes post-deploy break in get_peers	2026-04-14 07:18:16 -07:00
Dev Lead Agent	b99497cd3f	fix(security): complete Phase 30.6 auth headers in a2a_client get_peers and discover_peer get_peers() was sending no auth headers to /registry/:id/peers — this would return 401 for every workspace agent after PR #31 (WorkspaceAuth middleware) deploys, breaking peer discovery entirely. discover_peer() had X-Workspace-ID but was missing the bearer token, also required by Phase 30.6 for /registry/discover/:id. Both functions now send {"X-Workspace-ID": WORKSPACE_ID, **auth_headers()}. get_workspace_info() was already correct (auth_headers() present since PR #39). Adds test_request_sends_workspace_id_header to TestGetPeers; hardens the discover_peer header assertion to use presence-check rather than exact equality. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 13:23:44 +00:00
Dev Lead Agent	7c336c680d	feat(canvas): Z shortcut + help entry for double-click zoom-to-team Adds Z as a keyboard equivalent for the existing double-click zoom-to-team gesture (WCAG 2.1.1). When a team node is selected, pressing Z dispatches molecule:zoom-to-team, which fitBounds to the parent and all children. Input elements are guarded so Z still types normally in text fields. Adds a 6th help panel entry documenting the Dbl-click / Z gesture. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 11:36:41 +00:00
Hongming Wang	36ae95f6c2	Merge pull request #42 from Molecule-AI/fix/a11y-audit-11 fix: ARIA tablist for side panel, Radix Dialog for create modal, aria-live for loading states (audit 11)	2026-04-14 04:27:35 -07:00
Dev Lead Agent	95abca2f4f	fix(a11y): prefers-reduced-motion WCAG 2.3.3 compliance globals.css: append @media (prefers-reduced-motion: reduce) block that zeroes animation/transition durations, disables .animate-in/.slide-in-from-* entry animations (Toaster, ApprovalBanner, SidePanel slide), strips dashdraw and node-appear keyframes from React Flow elements. Components: replace all bare animate-pulse (13 occurrences across WorkspaceNode, StatusDot, Toolbar, SidePanel, Legend, SearchDialog, TerminalTab, TemplatePalette) with motion-safe:animate-pulse so status indicator pulsing stops for users with vestibular disorders. Replace 3 animate-bounce occurrences in ChatTab typing indicator with motion-safe:animate-bounce. Tests: new canvas/src/__tests__/reduced-motion.test.ts (12 tests) verifies the @media block is present in globals.css and that every component file uses the motion-safe: variant rather than bare animation classes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 11:25:23 +00:00
Dev Lead Agent	9fe334779f	fix: Radix Dialog for create modal, ARIA tablist for side panel, aria-live for loading states (audit 11) - CreateWorkspaceDialog: replace plain div modal with @radix-ui/react-dialog (focus-trap, Escape-to-close, aria-labelledby auto-wired); tier selector uses role=radiogroup/radio + aria-checked; error uses role=alert; required fields annotate with sr-only "(required)" - SidePanel: WAI-ARIA tablist pattern — role=tablist + aria-label, role=tab + aria-selected + aria-controls + id, roving tabIndex (0/−1), ArrowRight/Left/Home/End keyboard nav with wrap, role=tabpanel + id + aria-labelledby on content area, tab icons are aria-hidden - TemplatePalette: loading and empty-state divs gain role=status + aria-live=polite - Canvas: sr-only role=status live region announces workspace count to screen readers - Tests: 7 new a11y tests for CreateWorkspaceDialog (Radix role=dialog, aria-labelledby, data-state, Cancel close, role=alert validation, role=radio tier); 12 new tab tests for SidePanel (tablist, 12 tabs, aria-selected, roving tabIndex, aria-controls, tabpanel, ArrowRight/Left/Home/End) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 10:31:34 +00:00
Hongming Wang	a81ae1a0a3	Merge pull request #40 from Molecule-AI/fix/keyboard-a11y fix: keyboard navigation — ContextMenu ARIA menu pattern + SearchDialog combobox (WCAG 2.1.1)	2026-04-14 03:26:27 -07:00
Hongming Wang	b5eb14e40d	Merge pull request #41 from Molecule-AI/fix/security-h3-m4 noteworthy: secrets-handling — H3 github_pat_ redaction + M4 atomic 0600 token write. 7-gate verification PASS.	2026-04-14 03:21:49 -07:00
Dev Lead Agent	1440bd732e	fix(security): H3 github_pat_ redaction + M4 atomic token write (audit cycle 10) H3 (compliance.py): GitHub fine-grained PATs use the github_pat_ prefix with an 82-character alphanumeric+underscore suffix — different from classic tokens (36 chars). Add the missing pattern to _PII_PATTERNS so fine-grained PATs are redacted in compliance logs alongside classic tokens. M4 (platform_auth.py): Replace write_text()+chmod() in save_token() with os.open(O_WRONLY|O_CREAT|O_TRUNC, 0o600) + os.write(). The old approach had a TOCTOU window where a concurrent reader could access the token file before chmod restricted permissions. os.open with explicit mode creates the file with 0600 permissions atomically in a single syscall. H2 (a2a_client.py): Already fixed in commit `bea0e96` (Cycle 5); no-op. Tests: 1136 passed, 2 skipped (workspace-template pytest suite) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 09:34:27 +00:00
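The two fixes in the commit above (H3 and M4) can be sketched as below. This is a minimal illustration under stated assumptions, not the contents of compliance.py or platform_auth.py: the pattern name `FINE_GRAINED_PAT`, the `[REDACTED]` replacement string, the `redact` helper, and the `save_token(path, token)` signature are hypothetical. Only the `github_pat_` prefix, the 82-character suffix, and the `os.open(..., 0o600)` plus `os.write()` approach come from the commit message. POSIX file permissions are assumed.

```python
import os
import re

# H3 sketch: fine-grained PATs, "github_pat_" followed by 82 characters
# drawn from letters, digits, and underscore (per the commit message).
FINE_GRAINED_PAT = re.compile(r"github_pat_[A-Za-z0-9_]{82}")


def redact(text: str) -> str:
    """Replace any fine-grained PAT in text with a fixed marker (hypothetical)."""
    return FINE_GRAINED_PAT.sub("[REDACTED]", text)


def save_token(path: str, token: str) -> None:
    """M4 sketch: create the token file with mode 0600 from the start.

    The mode is passed to os.open(), so a newly created file is never
    visible with wider permissions, unlike write_text() followed by chmod().
    """
    flags = os.O_WRONLY | os.O_CREAT | os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    try:
        data = token.encode("utf-8")
        offset = 0
        # os.write() may write fewer bytes than requested, so loop until done.
        while offset < len(data):
            offset += os.write(fd, data[offset:])
    finally:
        os.close(fd)
```

Note that the mode argument only takes effect when `os.open()` actually creates the file; a file that already exists keeps its current permissions, so a caller that cares about pre-existing files would need to handle that case separately.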
Dev Lead Agent	0725a818e7	fix: keyboard navigation for ContextMenu (WCAG 2.1.1) and SearchDialog combobox pattern - ContextMenu: role=menu/menuitem/separator, aria-label, aria-disabled, focus-visible ring, auto-focus first enabled item on open, ArrowDown/Up roving focus (wrapping), Escape + Tab dismiss, aria-hidden on decorative icons/status dot - SearchDialog: role=dialog+aria-modal, combobox pattern on input (role=combobox, aria-expanded, aria-autocomplete, aria-controls, aria-activedescendant), focusedIndex state, ArrowDown/Up/Enter keyboard navigation, role=listbox+option, aria-selected, role=status + aria-live=polite on empty state, footer hints updated with ↑↓ - Add 10 ContextMenu keyboard tests (role, aria-label, menuitem, separator, Escape, Tab, ArrowDown, wrap, ArrowUp wrap, null guard) - Add 13 SearchDialog keyboard tests (dialog, aria-modal, combobox, listbox, option, ArrowDown, double-ArrowDown, clamp, ArrowUp-clamp, Enter select, Enter noop, query reset, activedescendant) Tests: 406 passed (383 existing + 23 new) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>	2026-04-14 09:28:10 +00:00
Hongming Wang	264d490e06	Merge pull request #39 from Molecule-AI/fix/n1-python-auth-headers fix(security): N1 — Python callers missing auth headers for /workspaces/* routes	2026-04-14 02:25:36 -07:00

126 Commits