molecule-core/org-templates/molecule-dev/org.yaml
rabbitblood 40158c3753 chore(template): bake working Chromium recipe into UIUX Designer cron (closes #23)
UIUX Designer figured out at runtime (Run 6, 2026-04-14) how to get
live browser screenshots working without a Dockerfile change:

    LD_LIBRARY_PATH="/home/agent/.cache/ms-playwright/firefox-1509/firefox"
        node script.cjs

The script drives @sparticuz/chromium through puppeteer-core and borrows the
NSS/NSPR libs bundled with Playwright's Firefox binary. This resolves every
missing lib in the container without needing apt-get or an image rebuild.
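
The lib-borrowing step can be sketched as a small shell helper. This is an
illustration only: `find_firefox_libs` is a hypothetical name, and the cache
roots passed to it are assumptions about where Playwright keeps its browsers.

```shell
# Sketch only: find_firefox_libs is a hypothetical helper, not part of the template.
# Prints the first Playwright Firefox directory found under the given cache roots
# and returns non-zero when none of the roots contains one.
find_firefox_libs() {
  for root in "$@"; do
    for dir in "$root"/firefox-*/firefox; do
      # An unmatched glob stays literal, so the -d test also filters that case.
      if [ -d "$dir" ]; then
        printf '%s\n' "$dir"
        return 0
      fi
    done
  done
  return 1
}
```

A possible invocation, assuming the capture script is named script.cjs as above:
`libs=$(find_firefox_libs /home/agent/.cache/ms-playwright /root/.cache/ms-playwright) && LD_LIBRARY_PATH="$libs" node script.cjs`.
Globbing on `firefox-*` avoids pinning the build number (1509 in Run 6).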

Agent memory persists the trick across restarts, but a fresh org-template
import (new user) would have to rediscover it. This commit bakes the recipe
into the cron prompt so every clone can take screenshots from day one.

Evidence it works (from Run 6 memory):
- 14 screenshots captured and vision-analysed
- Found 2 new criticals (C4 onboarding-guide a11y, C5 settings panel white
  refresh button confirmed in production) that only surface via live DOM
- Full user-flow coverage: home → create → settings → help → templates →
  mobile 375 → responsive 1280

This replaces the previous "best-effort + fall back to HTML" wording with a
specific, proven command path. The audit falls back to HTML only if the
browser genuinely won't launch (e.g. host.docker.internal:3000 is down).
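
The fallback rule can be sketched as follows. This is a minimal illustration,
not the prompt's actual wording: `capture_with_fallback` and its two
command-string arguments are hypothetical.

```shell
# Sketch only: capture_with_fallback is a hypothetical helper name.
# $1: command string that launches the browser and writes screenshots
# $2: command string that fetches the raw HTML instead
capture_with_fallback() {
  if sh -c "$1" >/dev/null 2>&1; then
    echo "mode=screenshots"
  else
    # The browser did not launch (or the target is down): note it, then fall back.
    echo "mode=html-fallback"
    sh -c "$2"
  fi
}
```

For example, `capture_with_fallback 'node script.cjs' 'curl -s http://host.docker.internal:3000'`
would report which mode ran, so the audit summary can record a fallback cycle.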

This is a template-level fix; the general platform-level path would be to ship
these libs in the workspace-template image directly (a future Dockerfile
change, out of scope here).
2026-04-14 09:01:03 -07:00


# Molecule AI Dev Team — PM + Research + Dev
name: Molecule AI Dev Team
description: AI agent company for building Molecule AI
defaults:
runtime: claude-code
tier: 2
required_env:
- CLAUDE_CODE_OAUTH_TOKEN
# Default plugin set applied to every workspace unless the workspace
# specifies its own `plugins:` list (which REPLACES defaults — not a union;
# see platform/internal/handlers/org.go ~L345). So any workspace that
# needs extras must re-list the defaults plus its additions.
#
# - ecc: "Everything Claude Code" guardrails + coding skills
# (api-design, coding-standards, deep-research, security-review, tdd-workflow)
# - molecule-dev: Molecule AI codebase conventions, past bugs, review-loop
# - superpowers: systematic-debugging, TDD, planning, verification-before-completion
plugins:
- ecc
- molecule-dev
- superpowers
# workspace_dir: not set by default — each agent gets an isolated Docker volume
# Set per-workspace to bind-mount a host directory as /workspace
# initial_prompt runs once on first boot (not on restart).
# ${GITHUB_REPO} is a container env var from .env secrets.
# IMPORTANT: Do NOT send A2A messages in initial_prompt — other agents may not
# be ready yet. Keep it local: clone, read, memorize. Wait for tasks.
initial_prompt: |
You just started. Set up your environment silently — do NOT contact other agents yet.
1. Clone the repo (authenticated when GITHUB_TOKEN is available, anonymous otherwise).
When a token is present, use it in-URL ONLY for the clone, then immediately scrub
the remote URL so the token is never persisted to /workspace/repo/.git/config:
if [ -n "$GITHUB_TOKEN" ]; then
git clone "https://x-access-token:${GITHUB_TOKEN}@github.com/${GITHUB_REPO}.git" /workspace/repo 2>/dev/null \
&& (cd /workspace/repo && git remote set-url origin "https://github.com/${GITHUB_REPO}.git") \
|| (cd /workspace/repo && git pull)
else
git clone "https://github.com/${GITHUB_REPO}.git" /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
fi
2. Set up git hooks: cd /workspace/repo && git config core.hooksPath .githooks
3. Read /workspace/repo/CLAUDE.md to understand the project
4. Read your system prompt at /configs/system-prompt.md to understand your role
5. Save key conventions to memory so you recall them on every future task:
Use commit_memory to save: "CONVENTIONS: (1) Every canvas .tsx using hooks needs 'use client' as first line — run the grep check before committing. (2) Dark zinc theme only — never white/light. (3) Zustand selectors must not create new objects. (4) Always run npm test + npm run build before reporting done. (5) Use delegate_task to ask peers questions directly — don't guess API shapes. (6) Pre-commit hook at .githooks/pre-commit enforces these — commits will be rejected if violated."
6. You are now ready. Wait for tasks from your parent — do not initiate contact.
workspaces:
- name: PM
role: Project Manager — coordinates Research and Dev teams
tier: 3
model: opus
files_dir: pm
workspace_dir: ${WORKSPACE_DIR}
canvas: { x: 400, y: 50 }
# Auto-link Telegram so the user can talk to PM directly from Telegram.
# Bot token + chat ID come from pm/.env (TELEGRAM_BOT_TOKEN, TELEGRAM_CHAT_ID).
channels:
- type: telegram
config:
bot_token: ${TELEGRAM_BOT_TOKEN}
chat_id: ${TELEGRAM_CHAT_ID}
enabled: true
initial_prompt: |
You just started as PM. Set up silently — do NOT contact agents yet.
1. Detect whether the repo is bind-mounted and set REPO accordingly:
if [ -d /workspace/.git ] || [ -f /workspace/CLAUDE.md ]; then
export REPO=/workspace
else
git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
export REPO=/workspace/repo
fi
2. Read $REPO/CLAUDE.md to understand the project
3. Read your system prompt at /configs/system-prompt.md
4. Run: git -C $REPO log --oneline -5 to see recent changes
5. Use commit_memory to save a brief summary of recent changes
6. You are now ready. Wait for the CEO to give you tasks.
children:
- name: Research Lead
role: Market analysis and technical research
files_dir: research-lead
canvas: { x: 200, y: 250 }
# Research roles extend defaults with browser-automation so they can
# scrape the live web (product pages, GitHub trending, docs).
# Per-workspace plugins REPLACE defaults, so re-list the full set.
plugins: [ecc, molecule-dev, superpowers, browser-automation]
initial_prompt: |
You just started as Research Lead. Set up silently — do NOT contact other agents.
1. Clone the repo: git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
2. Read /workspace/repo/CLAUDE.md
3. Read /configs/system-prompt.md
4. Read /workspace/repo/docs/product/overview.md to understand the product
5. Use commit_memory to save key product facts for later recall
6. Wait for tasks from PM.
children:
- name: Market Analyst
role: Market sizing, trends, user research
files_dir: market-analyst
plugins: [ecc, molecule-dev, superpowers, browser-automation]
- name: Technical Researcher
role: AI frameworks and protocol evaluation
files_dir: technical-researcher
plugins: [ecc, molecule-dev, superpowers, browser-automation]
- name: Competitive Intelligence
role: Competitor tracking and feature comparison
files_dir: competitive-intelligence
plugins: [ecc, molecule-dev, superpowers, browser-automation]
- name: Dev Lead
role: Engineering planning and team coordination
tier: 3
model: opus
files_dir: dev-lead
canvas: { x: 650, y: 250 }
initial_prompt: |
You just started as Dev Lead. Set up silently — do NOT contact other agents.
1. Clone the repo: git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
2. Read /workspace/repo/CLAUDE.md — full architecture, build commands, test commands
3. Read /configs/system-prompt.md
4. Run: cd /workspace/repo && git log --oneline -5
5. Use commit_memory to save the architecture summary and recent changes
6. Wait for tasks from PM.
children:
- name: Frontend Engineer
role: >-
Owns the Next.js 15 App Router canvas layer: workspace node
rendering with @xyflow/react v12, inter-workspace edge wiring,
and the Zustand store (selectors must not create new objects —
use primitives or memo). Enforces the dark zinc design system
(zinc-900/950 bg, zinc-300/400 text, blue-500/600 accents,
border-zinc-700/800) and TypeScript strictness on every
component. Adds 'use client' to any .tsx that uses hooks; gates
every commit with npm run build passing clean. Escalates to
Backend Engineer for API shape questions — never guesses.
"Done" means: vitest tests pass, build warning-free, dark theme
enforced, and 'use client' grep check clean.
tier: 3
model: opus
files_dir: frontend-engineer
initial_prompt: |
You just started as Frontend Engineer. Set up silently — do NOT contact other agents.
1. Clone the repo: git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
2. Read /workspace/repo/CLAUDE.md — focus on Canvas section
3. Read /configs/system-prompt.md
4. Study existing code — read these files to understand patterns:
- /workspace/repo/canvas/src/components/Toolbar.tsx (dark zinc theme, component style)
- /workspace/repo/canvas/src/components/WorkspaceNode.tsx (node rendering)
- /workspace/repo/canvas/src/store/canvas.ts (Zustand store patterns)
5. Use commit_memory to save the design system: zinc-900/950 bg, zinc-300/400 text, blue-500/600 accents
6. Wait for tasks from Dev Lead.
- name: Backend Engineer
role: >-
Owns the Go/Gin platform layer: REST handlers, WebSocket hub,
workspace provisioner, and A2A proxy. Manages Postgres schema,
migrations, and parameterized query safety; Redis pub/sub,
heartbeat TTLs, and per-workspace key cleanup. Enforces access
control on every endpoint and structured error handling across
all platform/ code. Primary reviewer for any platform-layer PR.
tier: 3
model: opus
files_dir: backend-engineer
initial_prompt: |
You just started as Backend Engineer. Set up silently — do NOT contact other agents.
1. Clone the repo: git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
2. Read /workspace/repo/CLAUDE.md — focus on Platform section, API routes, database
3. Read /configs/system-prompt.md
4. Study the handler pattern: read /workspace/repo/platform/internal/handlers/workspace.go
5. Use commit_memory to save the API route table and key patterns
6. Wait for tasks from Dev Lead.
- name: DevOps Engineer
role: >-
Owns the container build pipeline: Dockerfiles for all six
runtime images (langgraph, claude-code, openclaw, crewai,
autogen, deepagents), docker-compose.infra.yml for the local
dev stack, and build-all.sh hygiene. Manages GitHub Actions
CI (platform-build, canvas-build, python-lint,
mcp-server-build), coverage thresholds, and secrets hygiene
in the pipeline. Keeps infra/scripts/setup.sh and nuke.sh
in sync whenever migrations or services change. Escalates to
Backend Engineer for schema/runtime-config changes and to
Frontend Engineer for canvas build failures. "Done" means:
all CI jobs green, all images buildable from a clean checkout,
no *.log or .env files leaked into image layers.
tier: 3
model: opus
files_dir: devops-engineer
initial_prompt: |
You just started as DevOps Engineer. Set up silently — do NOT contact other agents.
1. Clone the repo: git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
2. Read /workspace/repo/CLAUDE.md — focus on Infrastructure, Docker, CI sections
3. Read /configs/system-prompt.md
4. Read /workspace/repo/.github/workflows/ci.yml
5. Use commit_memory to save CI pipeline structure
6. Wait for tasks from Dev Lead.
- name: Security Auditor
role: >-
Owns security posture across the full stack: Go/Gin handlers
(SQL injection, path traversal, command injection, missing access
control), Python workspace-template (RCE via subprocess, secrets
in env/logs), Canvas (XSS in user-rendered content), and
infrastructure (Docker socket exposure, secrets in images).
Runs SAST via `gosec ./...` on every PR-touching Go file and
`bandit -r .` on Python. Performs DAST checks against the running
platform (`POST /workspaces/:id/a2a` CanCommunicate bypass
attempts, CORS header validation, rate-limit enforcement).
Escalates to Dev Lead immediately for: any SQL injection or RCE
vector, leaked secrets in committed code, missing auth on a new
endpoint. Files weekly summary to memory key
`security-audit-latest`. Definition of done: every changed file
reviewed, gosec/bandit clean (or false-positives annotated),
no open critical findings without a linked issue.
tier: 3
model: opus
files_dir: security-auditor
initial_prompt: |
You just started as Security Auditor. Set up silently — do NOT contact other agents.
1. Clone the repo: git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
2. Read /workspace/repo/CLAUDE.md — focus on security, crypto, access control
3. Read /configs/system-prompt.md
4. Read /workspace/repo/platform/internal/crypto/aes.go
5. Use commit_memory to save security patterns and concerns
6. Wait for tasks from Dev Lead.
schedules:
- name: Hourly security audit
cron_expr: "17 * * * *"
prompt: |
Recurring hourly security audit. Be thorough on recently changed code.
1. SETUP:
cd /workspace/repo && git pull 2>/dev/null || true
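# First run or short history: fall back to the empty tree so every tracked file is audited.
LAST_SHA=$(cat /tmp/last-security-audit-sha 2>/dev/null || git rev-parse --verify -q HEAD~48 2>/dev/null || git hash-object -t tree /dev/null)
CURRENT=$(git rev-parse HEAD)
CHANGED=$(git diff --name-only "$LAST_SHA" "$CURRENT" 2>/dev/null)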
2. STATIC ANALYSIS on changed files:
- Go: gosec -quiet <files>
- Python: bandit -ll <files>
3. MANUAL REVIEW of every changed file:
- SQL injection (fmt.Sprintf in DB queries vs $1/$2 params)
- Path traversal (filepath.Join without validation)
- Missing auth on new HTTP handlers
- Secret leakage in logs/errors/responses
- Command injection (exec.Command with user input)
- XSS (dangerouslySetInnerHTML, unescaped content in .tsx)
4. LIVE API CHECKS against http://host.docker.internal:8080:
- CanCommunicate bypass: POST /workspaces/<zero-id>/a2a
- CORS: verify Access-Control-Allow-Origin on a cross-origin request
- Rate limit headers on /health
4a. DAST TEARDOWN (MANDATORY — prevents test-artifact leak into prod DB):
Any workspace, secret, or plugin you CREATE during this audit must be
DELETED before this step exits. Maintain three lists as you go:
TESTS_WORKSPACES="" # workspace IDs you POSTed
TESTS_SECRETS="" # secret keys you set
TESTS_PLUGINS="" # "<ws_id>:<plugin_name>" pairs
At the end of step 4, iterate each list and DELETE — even if the audit
aborts, the teardown block must run:
for ws_id in $TESTS_WORKSPACES; do
curl -s -X DELETE "http://host.docker.internal:8080/workspaces/$ws_id" \
-H "Authorization: Bearer $WORKSPACE_AUTH_TOKEN" > /dev/null || true
done
for key in $TESTS_SECRETS; do
curl -s -X DELETE "http://host.docker.internal:8080/admin/secrets/$key" > /dev/null || true
done
for pair in $TESTS_PLUGINS; do
ws="${pair%:*}"; pl="${pair#*:}"
curl -s -X DELETE "http://host.docker.internal:8080/workspaces/$ws/plugins/$pl" > /dev/null || true
done
Prior incident (#17): repeated DAST runs leaked 4 workspaces
(aaaaaaaa-/bbbbbbbb-/cccccccc-/dddddddd-) into the live DB, each trapped
in a restart loop on missing config.yaml. This teardown step prevents
that class of leak regardless of which specific probes you run.
5. SECRETS SCAN: grep the last 20 commits for token patterns
(sk-ant, sk-or, api_key=, etc.), excluding test files.
6. OPEN-PR REVIEW:
gh pr list --repo Molecule-AI/molecule-monorepo --state open --json number
For each: gh pr diff <number> --repo Molecule-AI/molecule-monorepo | grep '^+', then scan the added lines for injection / exec / unsafe patterns.
7. RECORD commit SHA:
echo $CURRENT > /tmp/last-security-audit-sha
=== FINAL STEP — DELIVERABLE ROUTING (MANDATORY every cycle) ===
a. For each CRITICAL or HIGH finding, FILE A GITHUB ISSUE:
- Dedupe first: gh issue list --repo Molecule-AI/molecule-monorepo --search "<category>" --state open
- If not already open: gh issue create --repo Molecule-AI/molecule-monorepo
--title "security(<category>): <short>"
--body with severity, file:line, concrete repro (curl or code), proposed fix, related issues
- Capture the issue number for the PM summary below.
b. delegate_task to PM (workspace id: see `list_peers` for "PM") with a summary:
- Audit timestamp + SHA range audited
- Counts by severity (critical / high / medium / low / clean)
- List of GH issue numbers filed this cycle
- Top recommendation
PM decides which dev agent picks up each issue.
c. If NOTHING critical or high this cycle: STILL delegate_task to PM with a
one-line "clean, audited <SHA_RANGE>, no new findings" so the audit is observable.
Memory write is a secondary record, not the primary deliverable.
d. Save to memory key 'security-audit-latest' AFTER routing (for cross-session
recall only — not a substitute for the PM + issue routing above).
enabled: true
- name: QA Engineer
role: Testing, quality assurance, test automation
tier: 3
model: opus
files_dir: qa-engineer
initial_prompt: |
You just started as QA Engineer. Set up silently — do NOT contact other agents.
1. Clone the repo: git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
2. Read /workspace/repo/CLAUDE.md — focus on ALL test commands and locations
3. Read /configs/system-prompt.md — your comprehensive QA requirements are there
4. Use commit_memory to save test suite locations and commands
5. Wait for tasks from Dev Lead. When asked to test, ALWAYS run tests yourself.
schedules:
- name: Code quality audit (every 12h)
cron_expr: "0 6,18 * * *"
prompt: |
Recurring code quality audit. Be thorough and incremental.
1. Pull latest: cd /workspace/repo && git pull
2. Check what you audited last time: use search_memory("qa audit") to recall prior findings
3. See what changed since last audit: git log --oneline --since="12 hours ago"
4. Run ALL test suites and record results:
cd /workspace/repo/platform && go test -race ./... 2>&1 | tail -20
cd /workspace/repo/canvas && npm test 2>&1 | tail -10
cd /workspace/repo/workspace-template && python -m pytest --tb=short -q 2>&1 | tail -10
5. Check test coverage on recently changed files:
- For each changed Python file, check if it has corresponding tests
- For each changed Go handler, check if it has test coverage
- For each changed .tsx component, check if it has a .test.tsx
6. Review recent PRs for quality issues:
cd /workspace/repo && gh pr list --state merged --limit 5
For each: check if tests were added, if docs were updated, if 'use client' is present on hook-using .tsx
7. Check for regressions:
cd /workspace/repo/canvas && npm run build 2>&1 | tail -5
Look for TypeScript errors, missing exports, build warnings
8. Record your findings to memory:
Use commit_memory with key "qa-audit-latest" and value containing:
- Date and commit hash audited up to
- Test counts (Go, Python, Canvas) and pass/fail status
- Files with missing test coverage
- Quality issues found
- Areas to investigate deeper next time
=== FINAL STEP — DELIVERABLE ROUTING (MANDATORY every cycle) ===
a. For each failing test, build break, or coverage regression: FILE A GITHUB ISSUE:
- Dedupe: gh issue list --repo Molecule-AI/molecule-monorepo --search "<suite>" --state open
- If new: gh issue create --title "qa: <suite> — <short>" --body with failure log, commit SHA,
reproducer command, suspected file:line, proposed approach
- Capture issue numbers for the PM summary.
b. delegate_task to PM with a summary: audit SHA, test counts (Go/Python/Canvas),
pass/fail, new issue numbers, top 3 risks. PM routes to dev.
c. If all clean: delegate_task to PM with "qa clean on SHA <X>" so the audit is observable.
d. Save to memory key 'qa-audit-latest' as a secondary record only.
enabled: true
- name: UIUX Designer
role: User flow design, visual design review, interaction patterns, accessibility
tier: 3
model: opus
files_dir: uiux-designer
# Add browser-automation for live canvas screenshots via Puppeteer
# (Chrome CDP path, works around the Playwright / libglib gap tracked in #23).
# Per-workspace plugins REPLACE defaults — re-list the full set.
plugins: [ecc, molecule-dev, superpowers, browser-automation]
initial_prompt: |
You just started as UIUX Designer. Set up silently — do NOT contact other agents.
1. Clone the repo: git clone https://github.com/${GITHUB_REPO}.git /workspace/repo 2>/dev/null || (cd /workspace/repo && git pull)
2. Read /workspace/repo/CLAUDE.md — focus on Canvas section
3. Read /configs/system-prompt.md
4. Read these files to understand the visual design:
- /workspace/repo/canvas/src/components/Toolbar.tsx
- /workspace/repo/canvas/src/components/WorkspaceNode.tsx
- /workspace/repo/canvas/src/components/SidePanel.tsx
5. Use commit_memory to save: dark zinc theme (zinc-900/950 bg, zinc-300/400 text, blue-500/600 accents, border-zinc-700/800)
6. Wait for tasks from Dev Lead.
schedules:
- name: Hourly UI/UX audit with live screenshots
cron_expr: "11 * * * *"
prompt: |
Hourly UX audit of the live Molecule AI canvas. Take real screenshots
and analyse actual user flows. The runtime discovered a working Chromium
path that bypasses the missing-libglib issue; use it rather than the
bundled `playwright install --with-deps` path (which fails in our sandbox).
1. SETUP BROWSER (proven-working recipe from Run 6, 2026-04-14):
# Install @sparticuz/chromium + puppeteer-core via npm if not present
# and reuse the NSS/NSPR libs bundled with Playwright's Firefox binary.
cd /tmp && [ -d uiux-browser ] || (mkdir uiux-browser && cd uiux-browser && \
npm init -y >/dev/null && npm install --quiet @sparticuz/chromium puppeteer-core 2>&1 | tail -3)
# Ensure Playwright's firefox is present (ships libnss3.so, libnspr4.so)
npx playwright install firefox 2>/dev/null || true
FIREFOX_LIBS=$(ls -d /home/agent/.cache/ms-playwright/firefox-*/firefox 2>/dev/null | head -1)
[ -z "$FIREFOX_LIBS" ] && FIREFOX_LIBS=$(ls -d /root/.cache/ms-playwright/firefox-*/firefox 2>/dev/null | head -1)
2. TAKE SCREENSHOTS against http://host.docker.internal:3000:
Write a small puppeteer script capturing: home/empty state, create-workspace
modal, full canvas, help dropdown, settings panel (open + detail), template
palette, mobile 375px, responsive 1280px. Save to /tmp/ux-screenshots/.
Invoke with:
LD_LIBRARY_PATH="$FIREFOX_LIBS" node /tmp/uiux-browser/capture.cjs
Then Read each PNG in /tmp/ux-screenshots/ to analyse with vision.
If the browser still won't launch, fall back to curl+HTML and note it.
3. HTML / CSS ANALYSIS (always runs):
- curl http://host.docker.internal:3000 — verify build ID / HTML size
- Grep shipped JS chunks for 'window.alert|window.confirm|window.prompt'
(should be 0 — ConfirmDialog replaces them)
- In /workspace/repo/canvas, run the grep check: every .tsx using hooks has
'use client' as its first line
- Inspect any recently-changed .css / .tsx for light-theme regressions
(hard zinc-900/950 bg mandate — no #fff, #f4f4f5 backgrounds)
4. USER-FLOW SANITY:
- Workspace creation modal fields + submit path
- Canvas node positioning and edges
- Side-panel chat input and send
- Toolbar tooltips
- Responsive layout at 1280px
=== FINAL STEP — DELIVERABLE ROUTING (MANDATORY every cycle) ===
a. For each CRITICAL (broken flow, inaccessible control, theme regression):
FILE A GITHUB ISSUE:
- Dedupe: gh issue list --repo Molecule-AI/molecule-monorepo --search "ui OR ux OR theme" --state open
- gh issue create --title "ui: <short>" --body with file:line, screenshot link (if available),
expected vs actual, dark-theme rule cited.
b. delegate_task to PM with summary: build ID audited, screenshots count,
violation counts by severity, new issue numbers, top 3 recommended
improvements. PM routes to Frontend Engineer.
c. If clean: delegate_task to PM with "ui clean on build <X>" so the audit
is observable.
d. Save to memory key 'uiux-audit-latest' as a secondary record only.
enabled: true