# 2026-04-10 Session

## Summary

Documentation maintenance for the new long-form Molecule AI product and technical narratives: moved both repository-root drafts into the VitePress docs tree, added sidebar/homepage entry points so they are discoverable from the docs site, and linked them from the product overview for ongoing maintenance inside `docs/`.

Also brought the landing-page messaging report under docs maintenance by tracking `docs/product/landing-messaging-report.md` in git and adding it to the product navigation surface.

## Changes

### New Long-Form Docs Added To `docs/`

- Moved `MOLECULE_PRODUCT_DOC.md` into `docs/product/molecule-product-doc.md`
- Moved `MOLECULE_TECHNICAL_DOC.md` into `docs/architecture/molecule-technical-doc.md`
- Kept the full source content intact while relocating it into the maintained docs structure

### VitePress Navigation Updated

- `docs/.vitepress/config.ts`
  - Added `Product Narrative` under the Product sidebar group
  - Added `Landing Messaging Report` under the Product sidebar group
  - Added `Technical Documentation` under the Architecture sidebar group

### Docs Entry Points Updated

- `docs/index.md`
  - Added homepage recommended-reading links for the new product and technical documents

### Product Overview Cross-Links Updated

- `docs/product/overview.md`
  - Added direct links to the product narrative, landing messaging report, and comprehensive technical documentation

### Additional Product Doc Tracked

- Added `docs/product/landing-messaging-report.md` to version control under the Product docs section

## Files Changed

- `docs/.vitepress/config.ts`
- `docs/index.md`
- `docs/product/overview.md`
- `docs/product/landing-messaging-report.md`
- `docs/product/molecule-product-doc.md`
- `docs/architecture/molecule-technical-doc.md`
- `docs/edit-history/2026-04-10.md` (new)

---

## CEO Session — Infrastructure Audit + Chain Break Fix

### Infra Audit (fix/infra-audit-critical — PR #5)

Comprehensive codebase audit identified 19 issues across 4
priority levels. Critical fixes:

1. **Race condition in crypto/aes.go** — `encryptionKey` global accessed without sync. Fixed with `sync.Once`. Added `ResetForTesting()` for tests.
2. **Missing DB indexes** — Migration 014: `workspaces(parent_id)`, `workspaces(status)`, `canvas_layouts(workspace_id)`. Speeds up hierarchy queries, cascade deletes, list/get joins.
3. **N+1 cascade delete** — Replaced per-child `UPDATE`+`DELETE` loop with recursive CTE batch query. Docker stops still per-child.
4. **CI linting** — Added `golangci-lint` step (continue-on-error until codebase clean).

### Chain Break Root Cause + Fix

**Problem:** Delegation chain died after first result. PM delegated to Dev Lead + QA, results completed, heartbeat wrote results file — but PM was never woken again.

**Root cause:** Self-message cooldown was 5 minutes. First delegation triggered a self-message within the window. All subsequent completions were blocked by cooldown. PM never woke up to report.

**Fix:** Reduced `SELF_MESSAGE_COOLDOWN` from 300s to 60s. With 30s heartbeat cycles, new results trigger a self-message within 1-2 cycles. Results file dedup prevents double-processing.
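
The interaction between the cooldown, the 30s heartbeat, and the results dedup can be sketched as follows. This is a minimal illustration only: `SelfMessageGate`, `poll`, and the injected `clock` are hypothetical names, not taken from `workspace/heartbeat.py`, and the real heartbeat may track pending results differently.

```python
import time

SELF_MESSAGE_COOLDOWN = 60  # seconds; the session reduced this from 300


class SelfMessageGate:
    """Hypothetical helper: decides whether a heartbeat cycle may self-message."""

    def __init__(self, cooldown=SELF_MESSAGE_COOLDOWN, clock=time.monotonic):
        self.cooldown = cooldown
        self.clock = clock       # injectable so the logic can be tested without sleeping
        self.last_sent = None    # time of the most recent self-message
        self.seen = set()        # result IDs already reported (dedup)

    def poll(self, result_ids):
        """Return the new result IDs to report now, or [] if nothing is sent."""
        fresh = [r for r in result_ids if r not in self.seen]
        if not fresh:
            return []            # everything was already processed
        now = self.clock()
        if self.last_sent is not None and now - self.last_sent < self.cooldown:
            return []            # cooling down; results stay pending for a later cycle
        self.last_sent = now
        self.seen.update(fresh)
        return fresh
```

Under these assumptions, a result that arrives 30s after a self-message is held for one more heartbeat cycle and then delivered, whereas a 300s window would hold it for ten cycles.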
### Agent-Authored PRs Received

Agents autonomously created PRs while CEO did infra work:

- **PR #3** — Settings Panel (Frontend Engineer): 34 files, 279 tests, full UX spec implementation
- **PR #4** — Onboarding Interception (Frontend Engineer): 10 files, 1362 additions, deploy preflight + missing keys modal

### Monitoring

- 13/13 workspaces online throughout session
- Heartbeats active (Redis TTL refreshing)
- Frontend Engineer + QA Engineer were actively processing tasks
- No container crashes, no degraded workspaces

## Files Changed (CEO Session)

- `workspace-server/internal/crypto/aes.go` (sync.Once)
- `workspace-server/internal/crypto/aes_test.go` (ResetForTesting)
- `workspace-server/internal/handlers/workspace.go` (recursive CTE delete)
- `workspace-server/internal/handlers/workspace_test.go` (updated mocks)
- `workspace-server/migrations/014_indexes.sql` (new — 3 indexes)
- `.github/workflows/ci.yml` (golangci-lint)
- `workspace/heartbeat.py` (60s cooldown, parent reporting, cached lookup)
- `workspace-server/internal/handlers/plugins_test.go` (new — 16 tests)
- `CLAUDE.md` (test counts: Go 365+, Python 869, migration 14)
- `docs/api-protocol/registry-and-heartbeat.md` (delegation checking section)

### Delegation Chain — Last Mile Fix

**Problem:** PM received delegation results but never reported to CEO. The heartbeat self-message said "report back to them" without specifying who.

**Fix:** Heartbeat looks up parent workspace name (cached after first call) and includes explicit instruction: "Report these results back to your parent 'CEO'." This closes the full chain: CEO → PM → team → results → PM wakes → reports to CEO.

### Plugins Handler Tests (16 new)

Covered: ListRegistry (empty/nonexistent/with plugins), Install validation (missing name, path traversal, not found), Uninstall validation, validatePluginName (valid/slash/dotdot/backslash/empty), parseManifestYAML (valid/invalid/minimal).
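
The `validatePluginName` cases listed above (valid/slash/dotdot/backslash/empty) suggest a check along the following lines. This is a Python rendering for illustration only: the real helper is Go code in the plugins handler, its exact rules are not shown in this log, and `validate_plugin_name` is a hypothetical name that mirrors only the five listed test categories.

```python
def validate_plugin_name(name: str) -> bool:
    """Hypothetical check: reject names that could escape the plugins directory."""
    if not name:
        return False      # empty name
    if "/" in name or "\\" in name:
        return False      # path separators (slash / backslash cases)
    if ".." in name:
        return False      # parent-directory traversal (dotdot case)
    return True


# One input per test category named in the log.
cases = {"my-plugin": True, "a/b": False, "../etc": False, "a\\b": False, "": False}
results = {name: validate_plugin_name(name) for name in cases}
```

Rejecting separators and `..` outright, rather than normalizing the path, keeps the check simple to reason about when the name is later joined onto a directory.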
### Agent PRs Completed

Team autonomously completed test plan checklists:

- PR #3 (Settings Panel): 9/9 tasks ✅
- PR #4 (Onboarding): 10/10 tasks ✅

Chain worked: CEO → PM → Dev Lead → FE + QA → PRs updated → all checklists done → PM reported back.

### Root Scripts Cleanup

Deleted 4 dead scripts replaced by platform features:

- `setup-org.sh`, `setup_reno_stars.sh` → `POST /org/import`
- `import-ecc.sh` → plugin system
- `scripts/setup-default-org.sh` → `POST /org/import`

Moved utility scripts to `scripts/`: `import-agent.sh`, `bundle-compile.sh`

Moved 5 E2E test scripts to `tests/e2e/`: `test_api.sh` (62 tests), `test_a2a_e2e.sh` (22), `test_activity_e2e.sh` (25), `test_claude_code_e2e.sh`, `test_comprehensive_e2e.sh` (68). Updated CLAUDE.md paths.

### PR #3 + #4 Code Review Delegated

CEO reviewed both PRs and found 6 critical bugs + 9 warnings. Delegated fixes through PM → Dev Lead → FE. Both PRs updated at 4:50 with fixes in progress.

### Provisioner Stale Image Fix

**Root cause:** Docker's `unless-stopped` restart policy races with provisioner's Stop → Start sequence. Old container restarts before `ContainerRemove` completes, blocking `ContainerCreate`. Result: old image keeps running after rebuild.

**Fix:** Pre-emptive `ContainerRemove(force: true)` before `ContainerCreate` — kills any stale container from restart policy. Added image ID logging on create and start for immediate visibility of stale-image issues.

### PRs #3 + #4 Reverted

Agent-authored PRs had too many integration bugs (infinite re-renders, wrong API format, white theme on dark canvas). Reverted both via cherry-pick rebuild of main.

### Template Runtime Detection Bug

**Problem:** Deploying "Claude Code Agent" from the template palette started a `langgraph` container instead of `claude-code`. The agent error was `[Errno 2] No such file or directory: '/claude'`.

**Root cause:** `workspace.go:Create` defaulted `payload.Runtime` to `"langgraph"` (lines 50-52) **before** reading the template's `config.yaml`. The later detection block (line 142) checked `if payload.Runtime == ""` but it was already set, so the template's `runtime: claude-code` was never used.

**Fix:** Moved the runtime detection that reads the template's config.yaml to **before** the DB insert and before the default fallback. Removed the now-dead duplicate detection block in the provisioning section. Added a debug log for when the config.yaml read fails.

### Branding + License

- Replaced gradient "S" square in toolbar with actual Molecule AI flame icon (`/molecule-icon.png`)
- Added Molecule AI favicon (`canvas/src/app/icon.png`)
- Added BSL 1.1 LICENSE file — personal/non-commercial use OK, no competing SaaS, converts to Apache 2.0 on 2029-01-01
- Updated README badge and license section

### AutoGen Adapter `'kwargs'` Fix

**Problem:** Deploying AutoGen Agent from template palette resulted in `AutoGen error: 'kwargs'` on every message.

**Root cause:** `_langchain_to_autogen()` wrapped LangChain tools as `async def wrapper(**kwargs)`. AutoGen 0.7.5's `FunctionTool` introspects function signatures with type hints — `**kwargs` has no type annotation, causing `KeyError: 'kwargs'` in `_function_utils.py`.

**Fix:** Replaced `**kwargs` wrapper with typed `async def _invoke(input: str) -> str` and used `autogen_core.tools.FunctionTool` directly. JSON parsing bridges structured input for tools that expect dicts.

### Chat Duplicate Messages Fix

**Problem:** Sending a message showed the agent response twice in the chat.

**Root cause:** Two paths both added the response: (1) WebSocket `A2A_RESPONSE` handler in ChatTab, and (2) Zustand store's `pendingA2AResponse` effect. Both fired from the same event.

**Fix:** Removed the duplicate WebSocket handler in ChatTab — the store effect is the canonical path.
### Canvas Pan-to-Node on Deploy

New workspaces now appear near center and the canvas smoothly pans to them on deploy instead of placing them all at (0,0).

### Docs Cleanup

Deleted 6 UX spec files for reverted Phase 20 features (settings panel, onboarding interception, deploy interception) — no longer in codebase.

### Initial Prompt System

New feature: agents can auto-execute a configurable prompt on startup — before any user interaction.

**Architecture:**

- `config.py`: new `initial_prompt` field (string or `initial_prompt_file` reference)
- `main.py`: after server ready, sends initial_prompt as A2A `message/send` to self
- `org.go`: `InitialPrompt` on `OrgDefaults` and `OrgWorkspace` structs with JSON+YAML tags; injected into config.yaml as YAML block scalar during org import
- Org template: per-agent initial prompts instruct dev agents to clone repo, read CLAUDE.md, study codebase, and report ready

**Manual E2E verified:** 12 agents deployed, 11/11 non-PM agents cloned repo to `/workspace/repo/`, PM has repo at `/workspace` (bind-mounted). All 12 have codebase access.

### Runtime Change on Restart Fix

**Problem:** Comprehensive E2E test "Runtime change langgraph→deepagents on restart" failed — container kept using old image.

**Root cause:** `workspace_restart.go` read runtime from DB (`COALESCE(runtime, 'langgraph')`) but when the user changes `config.yaml` runtime, the DB is never updated. Also, `ExecRead` was called after `Stop()` (container already stopped).

**Fix:** Read config.yaml runtime from running container *before* stopping it. If runtime differs from DB, update DB. Use `configDirName(id)` for container name (not raw workspace ID).

### QA System Prompt Overhaul

Comprehensive rewrite: never trust self-reported results, must clone repo independently, run ALL test suites to 100% green, E2E tests required, visual style verification against dark zinc theme, red flags checklist.
### Org Struct JSON Tags

Added `json` tags to `OrgTemplate`, `OrgDefaults`, and `OrgWorkspace` structs — without them, JSON POST bodies couldn't populate `initial_prompt` and other snake_case fields.

## Files Changed

- `workspace-server/internal/handlers/workspace.go` — runtime detection before DB insert
- `workspace-server/internal/handlers/workspace_restart.go` — read runtime from container config before stop
- `workspace-server/internal/handlers/org.go` — InitialPrompt field, JSON tags, config.yaml injection
- `workspace-server/internal/handlers/org_test.go` — 5 new tests (YAML parsing, injection, special chars)
- `workspace/config.py` — initial_prompt field + file reference
- `workspace/main.py` — auto-send initial_prompt after server ready
- `workspace/tests/test_config.py` — 5 new tests (inline, file, precedence, default, missing)
- `workspace/cli_executor.py` — __del__ getattr guard
- `workspace/adapters/autogen/adapter.py` — FunctionTool wrapper
- `workspace/tests/test_common_setup.py` — autogen skipif + FunctionTool assertions
- `org-templates/molecule-dev/org.yaml` — per-agent initial prompts
- `org-templates/molecule-dev/qa-engineer/system-prompt.md` — comprehensive QA rewrite
- `canvas/src/components/Canvas.tsx` — pan-to-node on deploy
- `canvas/src/components/Toolbar.tsx` — Molecule AI icon
- `canvas/src/components/tabs/ChatTab.tsx` — remove duplicate A2A_RESPONSE handler
- `canvas/src/store/canvas-events.ts` — node position offset + pan event + window guard
- `canvas/src/store/__tests__/canvas.test.ts` — relaxed position assertion
- `canvas/src/lib/api/__tests__/secrets.test.ts` — match actual API format
- `canvas/src/app/icon.png` — favicon
- `tests/e2e/test_comprehensive_e2e.sh` — fix secrets test assumption
- `.gitignore` — test-results/, playwright-report/
- `LICENSE` — BSL 1.1
- `README.md` — license badge + section
- `CLAUDE.md` — template resolution docs, initial prompt section, test counts
- Deleted: `docs/ux-specs/*`,
`docs/onboarding-interception.md`

### Initial Prompt Cascade Loop Fix

**Problem:** 12 agents all executed initial prompts simultaneously on first boot. Each prompt ended with "report ready to parent" — sending A2A messages while other agents were still booting. Under load, containers died → ProxyA2A detected dead containers → triggered auto-restart → new container → initial prompt fired again → cascade loop.

**Root cause:** Two issues: (1) initial prompts instructed agents to send A2A messages during boot, (2) initial prompt re-executed on every restart (no idempotency guard).

**Fixes:**

- `main.py`: writes `.initial_prompt_done` marker file after first execution. Skips on restart.
- `org.yaml`: rewrote all 12 agent prompts — no outbound A2A, no test suite runs during boot. Agents clone repo, read docs, save to `commit_memory`, then wait for tasks.
- `workspace_restart.go`: fixed misleading "after secret change" log in `RestartByID` (called by multiple paths, not just secrets).

### Chat Separation: My Chat + Agent Comms

Refactored ChatTab into two sub-tabs:

- **My Chat**: user↔agent conversation only (`source=canvas` filter)
- **Agent Comms**: agent↔agent A2A traffic (`source=agent` filter), read-only, live WebSocket updates

**Backend:** Added `source` query param to `GET /workspaces/:id/activity` — `canvas` filters `source_id IS NULL`, `agent` filters `source_id IS NOT NULL`. Invalid values return 400.

**Initial prompt fix:** Routes through platform A2A proxy instead of self-send, so the prompt appears as a proper user message in chat history (logged with `source_id=NULL`). Removed `/notify` push code — proxy's `A2A_RESPONSE` broadcast handles delivery.

**Shared helper:** Extracted `extractRequestText()` into `message-parser.ts` — used by both ChatTab and AgentCommsPanel.
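
The `source` query-param behavior described under **Backend** can be sketched as a whitelist lookup. This is a Python illustration of the mapping only: the actual handler is Go code in `activity.go`, `activity_filter` is a hypothetical name, and treating an omitted parameter as "no filter" is an assumption that the log does not state.

```python
# Fixed mapping from accepted `source` values to SQL predicates.
SOURCE_FILTERS = {
    "canvas": "source_id IS NULL",      # user-to-agent messages
    "agent": "source_id IS NOT NULL",   # agent-to-agent A2A traffic
}


def activity_filter(source):
    """Return (http_status, where_clause) for a `source` query value.

    `None` models an omitted parameter (assumed to mean "no filter").
    """
    if source is None:
        return 200, ""
    clause = SOURCE_FILTERS.get(source)
    if clause is None:
        return 400, ""                  # invalid values are rejected
    return 200, clause
```

Because the predicate comes from a fixed table rather than from the request string, user input never reaches the SQL text.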
## Files Changed (Chat Separation)

- `workspace-server/internal/handlers/activity.go` — `source` query param + validation
- `workspace/main.py` — route initial prompt through proxy, remove /notify
- `canvas/src/components/tabs/ChatTab.tsx` — sub-tab container + MyChatPanel
- `canvas/src/components/tabs/chat/AgentCommsPanel.tsx` — new agent comms view
- `canvas/src/components/tabs/chat/message-parser.ts` — shared `extractRequestText()`

### Claude Code Adapter: CLI Subprocess → Claude Agent SDK Migration

Replaced the `claude-code` runtime's subprocess-based `CLIAgentExecutor` with a new `ClaudeSDKExecutor` that uses the official `claude-agent-sdk` Python package. The SDK wraps the same Claude Code engine, so plugins/skills/CLAUDE.md still work — but eliminates subprocess fragility (stdout buffering, zombie processes, session-ID parsing, ~500ms startup overhead).

**New files:**

- `workspace/claude_sdk_executor.py` — `ClaudeSDKExecutor` with asyncio.Lock serialization, cooperative cancel, `QueryResult` dataclass, session resume via SDK
- `workspace/executor_helpers.py` — shared helpers extracted from `cli_executor.py`: memory recall/commit, delegation results, heartbeat, system prompt, error sanitization (`sanitize_agent_error` + `classify_subprocess_error`), markdown-aware `brief_summary`, `extract_message_text`
- `workspace/tests/test_claude_sdk_executor.py` — 30 tests including concurrency (timestamp-ordered), cancel (GeneratorExit via async generator), session resume, error sanitization
- `workspace/tests/test_executor_helpers.py` — 73 tests for all shared helpers

**Modified files:**

- `workspace/adapters/claude_code/adapter.py` — `create_executor()` returns `ClaudeSDKExecutor`; removed `shutil.which` CLI check
- `workspace/adapters/claude_code/Dockerfile` — pre-installs SDK via `pip install -r requirements.txt`
- `workspace/adapters/claude_code/requirements.txt` — added `claude-agent-sdk>=0.1.58`
- `workspace/cli_executor.py` — removed `claude-code` from
`RUNTIME_PRESETS`, deleted all `self.runtime == "claude-code"` branches (JSON parsing, `--resume`, `--output-format json`, `_session_id`), calls shared helpers directly (no more one-line wrapper methods), uses `sys.executable` for MCP server, regex word-boundary error classification
- `workspace/tests/conftest.py` — session-wide `claude_agent_sdk` stub for test imports
- `.gitignore` — `.initial_prompt_done`, `.coverage*`

**Architecture decisions:**

- `asyncio.Lock` on the SDK executor serializes concurrent turns (matches old CLI behavior, keeps session_id race-free)
- `ResultMessage.result` preferred over concatenated `AssistantMessage` chunks (avoids doubled pre/post-tool text)
- Error sanitization unified: `sanitize_agent_error(exc=..., category=...)` serves both SDK exceptions and CLI subprocess stderr
- `classify_subprocess_error()` uses regex word boundaries to avoid false positives (`\brate\b` not `"rate" in`)

**Coverage:** 100% on `claude_sdk_executor.py` (110 stmts), `cli_executor.py` (179 stmts), `executor_helpers.py` (154 stmts). Total: 443 stmts, 0 misses.

**Live verification:** 12 workspaces restarted on new image. Echo, session resume, Bash tool, TodoWrite, PM→QA MCP delegation, and concurrent requests all verified. Rate-limited on quota (not a code bug).

**5 iterative code review passes** caught and fixed: the `_active_stream` race, dead claude-code branches, duplicated A2A instructions, raw-stderr leaks, deprecated `typing.AsyncIterator`, the `_install_fake_sdk` teardown leak, inconsistent error patterns, missing `encoding` args, and 7 other issues across successive rounds.

### Agent Quality Enforcement Stack

Built three layers of quality enforcement after observing that agents (same Claude Opus model) missed bugs like `'use client'` directives because they lacked institutional memory and system-level enforcement.

**Layer 1: Git pre-commit hook** (`.githooks/pre-commit`)

- Rejects commits missing `'use client'` on hook-using `.tsx` files
- Rejects light theme colors in canvas components
- Rejects SQL injection patterns in Go (`fmt.Sprintf` with SQL)
- Rejects leaked secrets (`sk-ant-`, `ghp_`, `AKIA`)
- System-enforced — agents cannot bypass

**Layer 2: Molecule AI-dev plugin** (`plugins/molecule-dev/`)

- `rules/codebase-conventions.md` — injected into every agent's CLAUDE.md with past bugs, patterns, self-check scripts
- `skills/review-loop/SKILL.md` — multi-round FE→QA→fix→re-verify workflow for Dev Lead

**Layer 3: Awareness memory via initial_prompt**

- Key conventions saved to `commit_memory` on first boot
- Agents recall them on every future task via memory system
- Builds institutional knowledge across sessions

**Also shipped:**

- SDK executor retry logic (exponential backoff: 5s→10s→20s for rate limits)
- Force-remove in `provisioner.Stop()` to prevent restart-policy zombie containers
- All 12 agent system prompts rewritten from checklists to senior-engineer expectations
- Dev Lead prompt requires UIUX + Security involvement for UI/credential work
- Repo made public — removed GITHUB_TOKEN from initial_prompt

### Cron Scheduling System (Phase 22)

New feature: users can set up recurring tasks that fire A2A messages to agents on a cron schedule.

**Backend:**

- `workspace-server/migrations/015_workspace_schedules.sql` — new table with cron_expr, timezone, prompt, enabled, last_run_at, next_run_at, run_count, last_status
- `workspace-server/internal/scheduler/scheduler.go` — goroutine polls every 30s, fires due schedules via proxyA2ARequest with `system:scheduler` caller, WaitGroup for completion, semaphore (max 10 concurrent)
- `workspace-server/internal/handlers/schedules.go` — 6 REST endpoints: list, create, update (COALESCE-based), delete, run-now, history
- `robfig/cron/v3` for cron expression parsing + next-run computation
- `proxyA2ARequest` exposed as public method for internal callers
- Dedicated `cron_run` activity log entries with schedule metadata for history queries

**Frontend:**

- `canvas/src/components/tabs/ScheduleTab.tsx` — CRUD UI with create/edit form, cron-to-English helper, status indicators, Run Now button, delete confirmation
- Wired into SidePanel as new "Schedule" tab (⏲ icon)

**Org template:**

- `OrgSchedule` struct in `org.go`, inserted during org import
- Example: Security Auditor daily scan in `org-templates/molecule-dev/org.yaml`

**E2E verified:** Created every-minute schedule, scheduler fired at next minute boundary, agent received and responded, schedule updated with status=ok + run_count=1.

### Volume Ownership: Root → Gosu Agent Pattern

Docker creates volume contents as root, but workspace containers run as UID 1000 (`agent`). This caused `PermissionError` when the adapter tried to write CLAUDE.md with plugin rules.

Initially fixed with scattered `chown` hacks in the provisioner and plugin handler, then properly fixed with the standard Docker pattern:

- `Dockerfile`: installs `gosu`, removes `USER agent` (entrypoint handles privilege drop)
- `entrypoint.sh`: starts as root → `chown -R agent:agent /configs /workspace` → `exec gosu agent` → `python3 main.py`
- Removed all band-aid chown calls from provisioner and plugin handler
- Verified: 12/12 containers, CLAUDE.md owned by `agent:agent`, plugin rules injected

### Comprehensive Code Review — 13 Issues Fixed + Test Coverage

Two-pass code review across the entire repo identified 24 issues. All 13 critical/warning items fixed:

**Critical (8):**

- `a2a_proxy.go`: add access control via `CanCommunicate` for agent-to-agent proxy requests (closing security boundary). Canvas requests (no `X-Workspace-ID`), self-calls, and system callers (`webhook:*`, `system:*`, `test:*`) bypass via explicit `isSystemCaller()` helper.
- `org.go`, `delegation.go`: replace `db.DB.Exec()` with `ExecContext` + error checks. Errors no longer silently dropped on inserts/updates.
- `activity.go`, `workspace.go`: add `rows.Err()` checks after iteration loops to catch DB iteration failures (was returning partial results).
- `ws/hub.go`: add `safeSend` with `recover` for race between Broadcast and Unregister (defensive fix for closed channel send).
- `workspace.go`: improve `canvas_layouts` insert error log (non-fatal).
- `ChatTab.tsx`, `AgentCommsPanel.tsx`: add WebSocket `onerror` handlers (orphaned connections on failure).
- `app/page.tsx`: log hydration errors instead of silent catch.
- `cli_executor.py`: guarantee `proc.wait()` after `kill` on timeout to prevent zombie processes; bounded 5s wait timeout.

**Warning (5):**

- `a2a_proxy.go`: cap `LogActivity` context with 30s timeout (was `WithoutCancel` = unbounded lifetime).
- `activity.go`: log JSON marshal failures in `LogActivity` instead of silently corrupting activity logs with nil bodies.
- `org.go`: replace 500ms `time.Sleep` with `workspaceCreatePacingMs = 50` constant (org of 12 was 6s+).
- `main.py`: stop heartbeat if `adapter.setup()` raises (resource leak).
- `Canvas.tsx`: document intentional `getState()` pattern in imperative event handlers.

**Test coverage added:**

- `a2a_proxy_test.go`: `mockCanCommunicate` helper + 4 access control tests (denied, self-exempt, system caller, canvas) + table-driven `TestIsSystemCaller` (7 cases)
- `test_cli_executor.py`: 2 zombie reap tests (verify `proc.wait()` called after `kill`; degraded path when `wait()` also times out)

**Verification:**

- Go: 6 packages, all tests pass
- Canvas Vitest: 344 tests pass
- Python pytest: 874 tests pass (was 872, +2 new)
- Playwright E2E: 13/13 pass (incl. 3 data-flow tests verifying real browser content)
- Comprehensive bash E2E: 68/68 pass
- Manual verification: 12-agent org deployed, initial prompts complete, chat shows messages
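
The zombie-reap fix listed for `cli_executor.py` (always `wait()` after `kill`, with a bounded 5s timeout and a degraded path if the wait itself times out) can be sketched as below. This is a minimal illustration using the synchronous `subprocess` module; `run_with_timeout` and `reap_timeout` are hypothetical names, and the real executor may use asyncio subprocesses instead.

```python
import subprocess
import sys


def run_with_timeout(cmd, timeout, reap_timeout=5.0):
    """Run cmd; on timeout, kill the child and reap it with a bounded wait.

    Returns (returncode, stdout), or (None, b"") if the command timed out.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.DEVNULL)
    try:
        out, _ = proc.communicate(timeout=timeout)
        return proc.returncode, out
    except subprocess.TimeoutExpired:
        proc.kill()
        try:
            # Always wait after kill so the child is reaped, not left as a zombie.
            proc.wait(timeout=reap_timeout)
        except subprocess.TimeoutExpired:
            # Degraded path: the reap also timed out; give up rather than hang.
            pass
        if proc.stdout is not None:
            proc.stdout.close()
        return None, b""


if __name__ == "__main__":
    # Usage: a fast command completes normally and returns its exit code and output.
    print(run_with_timeout([sys.executable, "-c", "print('ok')"], timeout=10))
```

Bounding the reap keeps a stuck child from blocking the caller indefinitely, at the cost of possibly leaving that one process unreaped.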