Renames:
- `platform/` → `workspace-server/` (Go module path stays as "platform" for external dep compat — will update after plugin module republish)
- `workspace-template/` → `workspace/`

Removed (moved to separate repos or deleted):
- `PLAN.md` — internal roadmap (move to private project board)
- `HANDOFF.md`, `AGENTS.md` — one-time internal session docs
- `.claude/` — gitignored entirely (local agent config)
- `infra/cloudflare-worker/` → Molecule-AI/molecule-tenant-proxy
- `org-templates/molecule-dev/` → standalone template repo
- `.mcp-eval/` → molecule-mcp-server repo
- `test-results/` — ephemeral, gitignored

Security scrubbing:
- Cloudflare account/zone/KV IDs → placeholders
- Real EC2 IPs → `<EC2_IP>` in all docs
- CF token prefix, Neon project ID, Fly app names → redacted
- Langfuse dev credentials → parameterized
- Personal runner username/machine name → generic

Community files:
- `CONTRIBUTING.md` — build, test, branch conventions
- `CODE_OF_CONDUCT.md` — Contributor Covenant 2.1

All Dockerfiles, CI workflows, docker-compose, railway.toml, render.yaml, README, and CLAUDE.md updated for the new directory names.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 Session
Summary
Documentation maintenance for the new long-form Molecule AI product and technical narratives: moved both repository-root drafts into the VitePress docs tree, added sidebar/homepage entry points so they are discoverable from the docs site, and linked them from the product overview for ongoing maintenance inside docs/.
Also brought the landing-page messaging report under docs maintenance by tracking `docs/product/landing-messaging-report.md` in git and adding it to the product navigation surface.
Changes
New Long-Form Docs Added To docs/
- Moved `MOLECULE_PRODUCT_DOC.md` into `docs/product/molecule-product-doc.md`
- Moved `MOLECULE_TECHNICAL_DOC.md` into `docs/architecture/molecule-technical-doc.md`
- Kept the full source content intact while relocating it into the maintained docs structure
VitePress Navigation Updated
`docs/.vitepress/config.ts`
- Added "Product Narrative" under the Product sidebar group
- Added "Landing Messaging Report" under the Product sidebar group
- Added "Technical Documentation" under the Architecture sidebar group
Docs Entry Points Updated
`docs/index.md`
- Added homepage recommended-reading links for the new product and technical documents
Product Overview Cross-Links Updated
`docs/product/overview.md`
- Added direct links to the product narrative, landing messaging report, and comprehensive technical documentation
Additional Product Doc Tracked
- Added `docs/product/landing-messaging-report.md` to version control under the Product docs section
Files Changed
- `docs/.vitepress/config.ts`
- `docs/index.md`
- `docs/product/overview.md`
- `docs/product/landing-messaging-report.md`
- `docs/product/molecule-product-doc.md`
- `docs/architecture/molecule-technical-doc.md`
- `docs/edit-history/2026-04-10.md` (new)
CEO Session — Infrastructure Audit + Chain Break Fix
Infra Audit (fix/infra-audit-critical — PR #5)
Comprehensive codebase audit identified 19 issues across 4 priority levels. Critical fixes:
- Race condition in `crypto/aes.go` — `encryptionKey` global accessed without sync. Fixed with `sync.Once`. Added `ResetForTesting()` for tests.
- Missing DB indexes — Migration 014: `workspaces(parent_id)`, `workspaces(status)`, `canvas_layouts(workspace_id)`. Speeds up hierarchy queries, cascade deletes, and list/get joins.
- N+1 cascade delete — Replaced the per-child `UPDATE`+`DELETE` loop with a recursive CTE batch query. Docker stops are still per-child.
- CI linting — Added a `golangci-lint` step (continue-on-error until the codebase is clean).
Chain Break Root Cause + Fix
Problem: Delegation chain died after first result. PM delegated to Dev Lead + QA, results completed, heartbeat wrote results file — but PM was never woken again.
Root cause: Self-message cooldown was 5 minutes. First delegation triggered a self-message within the window. All subsequent completions were blocked by cooldown. PM never woke up to report.
Fix: Reduced `SELF_MESSAGE_COOLDOWN` from 300s to 60s. With 30s heartbeat cycles, new results trigger a self-message within 1-2 cycles. Results-file dedup prevents double-processing.
Agent-Authored PRs Received
Agents autonomously created PRs while CEO did infra work:
- PR #3 — Settings Panel (Frontend Engineer): 34 files, 279 tests, full UX spec implementation
- PR #4 — Onboarding Interception (Frontend Engineer): 10 files, 1362 additions, deploy preflight + missing keys modal
Monitoring
- 13/13 workspaces online throughout session
- Heartbeats active (Redis TTL refreshing)
- Frontend Engineer + QA Engineer were actively processing tasks
- No container crashes, no degraded workspaces
Files Changed (CEO Session)
- `workspace-server/internal/crypto/aes.go` (sync.Once)
- `workspace-server/internal/crypto/aes_test.go` (ResetForTesting)
- `workspace-server/internal/handlers/workspace.go` (recursive CTE delete)
- `workspace-server/internal/handlers/workspace_test.go` (updated mocks)
- `workspace-server/migrations/014_indexes.sql` (new — 3 indexes)
- `.github/workflows/ci.yml` (golangci-lint)
- `workspace/heartbeat.py` (60s cooldown, parent reporting, cached lookup)
- `workspace-server/internal/handlers/plugins_test.go` (new — 16 tests)
- `CLAUDE.md` (test counts: Go 365+, Python 869, migration 14)
- `docs/api-protocol/registry-and-heartbeat.md` (delegation checking section)
Delegation Chain — Last Mile Fix
Problem: PM received delegation results but never reported to CEO. The heartbeat self-message said "report back to them" without specifying who.
Fix: Heartbeat looks up parent workspace name (cached after first call) and includes explicit instruction: "Report these results back to your parent 'CEO'." This closes the full chain: CEO → PM → team → results → PM wakes → reports to CEO.
Plugins Handler Tests (16 new)
Covered: `ListRegistry` (empty/nonexistent/with plugins), Install validation (missing name, path traversal, not found), Uninstall validation, `validatePluginName` (valid/slash/dotdot/backslash/empty), `parseManifestYAML` (valid/invalid/minimal).
Agent PRs Completed
Team autonomously completed test plan checklists:
- PR #3 (Settings Panel): 9/9 tasks ✅
- PR #4 (Onboarding): 10/10 tasks ✅
Chain worked: CEO → PM → Dev Lead → FE + QA → PRs updated → all checklists done → PM reported back.
Root Scripts Cleanup
Deleted 4 dead scripts replaced by platform features:
- `setup-org.sh`, `setup_reno_stars.sh` → `POST /org/import`
- `import-ecc.sh` → plugin system
- `scripts/setup-default-org.sh` → `POST /org/import`
Moved utility scripts to `scripts/`: `import-agent.sh`, `bundle-compile.sh`
Moved 5 E2E test scripts to `tests/e2e/`: `test_api.sh` (62 tests), `test_a2a_e2e.sh` (22), `test_activity_e2e.sh` (25), `test_claude_code_e2e.sh`, `test_comprehensive_e2e.sh` (68). Updated CLAUDE.md paths.
PR #3 + #4 Code Review Delegated
CEO reviewed both PRs and found 6 critical bugs + 9 warnings. Delegated fixes through PM → Dev Lead → FE. Both PRs updated at 4:50 with fixes in progress.
Provisioner Stale Image Fix
Root cause: Docker's `unless-stopped` restart policy races with the provisioner's Stop → Start sequence. The old container restarts before `ContainerRemove` completes, blocking `ContainerCreate`. Result: the old image keeps running after a rebuild.
Fix: Pre-emptive `ContainerRemove(force: true)` before `ContainerCreate` — kills any stale container left by the restart policy. Added image ID logging on create and start for immediate visibility of stale-image issues.
PRs #3 + #4 Reverted
Agent-authored PRs had too many integration bugs (infinite re-renders, wrong API format, white theme on dark canvas). Reverted both via cherry-pick rebuild of main.
Template Runtime Detection Bug
Problem: Deploying "Claude Code Agent" from the template palette started a langgraph container instead of claude-code. The agent error was `[Errno 2] No such file or directory: '/claude'`.
Root cause: `workspace.go:Create` defaulted `payload.Runtime` to "langgraph" (line 50-52) before reading the template's config.yaml. The later detection block (line 142) checked `if payload.Runtime == ""`, but it was already set, so the template's `runtime: claude-code` was never used.
Fix: Moved runtime detection from the template's config.yaml so it runs before the DB insert and before the default fallback. Removed the now-dead duplicate detection block in the provisioning section. Added a debug log when the config.yaml read fails.
Branding + License
- Replaced gradient "S" square in toolbar with the actual Molecule AI flame icon (`/molecule-icon.png`)
- Added Molecule AI favicon (`canvas/src/app/icon.png`)
- Added BSL 1.1 LICENSE file — personal/non-commercial use OK, no competing SaaS, converts to Apache 2.0 on 2029-01-01
- Updated README badge and license section
AutoGen Adapter 'kwargs' Fix
Problem: Deploying AutoGen Agent from template palette resulted in AutoGen error: 'kwargs' on every message.
Root cause: `_langchain_to_autogen()` wrapped LangChain tools as `async def wrapper(**kwargs)`. AutoGen 0.7.5's `FunctionTool` introspects function signatures with type hints — `**kwargs` has no type annotation, causing `KeyError: 'kwargs'` in `_function_utils.py`.
Fix: Replaced the `**kwargs` wrapper with a typed `async def _invoke(input: str) -> str` and used `autogen_core.tools.FunctionTool` directly. JSON parsing bridges structured input for tools that expect dicts.
Chat Duplicate Messages Fix
Problem: Sending a message showed the agent response twice in the chat.
Root cause: Two paths both added the response: (1) the WebSocket `A2A_RESPONSE` handler in ChatTab, and (2) the Zustand store's `pendingA2AResponse` effect. Both fired from the same event.
Fix: Removed the duplicate WebSocket handler in ChatTab — the store effect is the canonical path.
Canvas Pan-to-Node on Deploy
New workspaces now appear near center and the canvas smoothly pans to them on deploy instead of placing them all at (0,0).
Docs Cleanup
Deleted 6 UX spec files for reverted Phase 20 features (settings panel, onboarding interception, deploy interception) — no longer in codebase.
Initial Prompt System
New feature: agents can auto-execute a configurable prompt on startup — before any user interaction.
Architecture:
- `config.py`: new `initial_prompt` field (string, or `initial_prompt_file` reference)
- `main.py`: after server ready, sends the initial_prompt as an A2A `message/send` to self
- `org.go`: `InitialPrompt` on the `OrgDefaults` and `OrgWorkspace` structs with JSON+YAML tags; injected into config.yaml as a YAML block scalar during org import
- Org template: per-agent initial prompts instruct dev agents to clone the repo, read CLAUDE.md, study the codebase, and report ready
Manual E2E verified: 12 agents deployed, 11/11 non-PM agents cloned repo to /workspace/repo/, PM has repo at /workspace (bind-mounted). All 12 have codebase access.
Runtime Change on Restart Fix
Problem: Comprehensive E2E test "Runtime change langgraph→deepagents on restart" failed — container kept using old image.
Root cause: `workspace_restart.go` read the runtime from the DB (`COALESCE(runtime, 'langgraph')`), but when the user changes the config.yaml runtime, the DB is never updated. Also, `ExecRead` was called after `Stop()` (container already stopped).
Fix: Read the config.yaml runtime from the running container before stopping it. If the runtime differs from the DB, update the DB. Use `configDirName(id)` for the container name (not the raw workspace ID).
QA System Prompt Overhaul
Comprehensive rewrite: never trust self-reported results, must clone repo independently, run ALL test suites to 100% green, E2E tests required, visual style verification against dark zinc theme, red flags checklist.
Org Struct JSON Tags
Added json tags to the `OrgTemplate`, `OrgDefaults`, and `OrgWorkspace` structs — without them, JSON POST bodies couldn't populate `initial_prompt` and other snake_case fields.
Files Changed
- `workspace-server/internal/handlers/workspace.go` — runtime detection before DB insert
- `workspace-server/internal/handlers/workspace_restart.go` — read runtime from container config before stop
- `workspace-server/internal/handlers/org.go` — InitialPrompt field, JSON tags, config.yaml injection
- `workspace-server/internal/handlers/org_test.go` — 5 new tests (YAML parsing, injection, special chars)
- `workspace/config.py` — initial_prompt field + file reference
- `workspace/main.py` — auto-send initial_prompt after server ready
- `workspace/tests/test_config.py` — 5 new tests (inline, file, precedence, default, missing)
- `workspace/cli_executor.py` — del getattr guard
- `workspace/adapters/autogen/adapter.py` — FunctionTool wrapper
- `workspace/tests/test_common_setup.py` — autogen skipif + FunctionTool assertions
- `org-templates/molecule-dev/org.yaml` — per-agent initial prompts
- `org-templates/molecule-dev/qa-engineer/system-prompt.md` — comprehensive QA rewrite
- `canvas/src/components/Canvas.tsx` — pan-to-node on deploy
- `canvas/src/components/Toolbar.tsx` — Molecule AI icon
- `canvas/src/components/tabs/ChatTab.tsx` — remove duplicate A2A_RESPONSE handler
- `canvas/src/store/canvas-events.ts` — node position offset + pan event + window guard
- `canvas/src/store/__tests__/canvas.test.ts` — relaxed position assertion
- `canvas/src/lib/api/__tests__/secrets.test.ts` — match actual API format
- `canvas/src/app/icon.png` — favicon
- `tests/e2e/test_comprehensive_e2e.sh` — fix secrets test assumption
- `.gitignore` — test-results/, playwright-report/
- `LICENSE` — BSL 1.1
- `README.md` — license badge + section
- `CLAUDE.md` — template resolution docs, initial prompt section, test counts
- Deleted: `docs/ux-specs/*`, `docs/onboarding-interception.md`
Initial Prompt Cascade Loop Fix
Problem: 12 agents all executed initial prompts simultaneously on first boot. Each prompt ended with "report ready to parent" — sending A2A messages while other agents were still booting. Under load, containers died → ProxyA2A detected dead containers → triggered auto-restart → new container → initial prompt fired again → cascade loop.
Root cause: Two issues: (1) initial prompts instructed agents to send A2A messages during boot, (2) initial prompt re-executed on every restart (no idempotency guard).
Fixes:
- `main.py`: writes a `.initial_prompt_done` marker file after first execution. Skips on restart.
- `org.yaml`: rewrote all 12 agent prompts — no outbound A2A, no test suite runs during boot. Agents clone the repo, read docs, save to `commit_memory`, then wait for tasks.
- `workspace_restart.go`: fixed the misleading "after secret change" log in `RestartByID` (called by multiple paths, not just secrets).
Chat Separation: My Chat + Agent Comms
Refactored ChatTab into two sub-tabs:
- My Chat: user↔agent conversation only (`source=canvas` filter)
- Agent Comms: agent↔agent A2A traffic (`source=agent` filter), read-only, live WebSocket updates
Backend: Added a `source` query param to `GET /workspaces/:id/activity` — `canvas` filters `source_id IS NULL`, `agent` filters `source_id IS NOT NULL`. Invalid values return 400.
Initial prompt fix: Routes through the platform A2A proxy instead of self-send, so the prompt appears as a proper user message in chat history (logged with `source_id=NULL`). Removed the `/notify` push code — the proxy's `A2A_RESPONSE` broadcast handles delivery.
Shared helper: Extracted `extractRequestText()` into `message-parser.ts` — used by both ChatTab and AgentCommsPanel.
Files Changed (Chat Separation)
- `workspace-server/internal/handlers/activity.go` — `source` query param + validation
- `workspace/main.py` — route initial prompt through proxy, remove /notify
- `canvas/src/components/tabs/ChatTab.tsx` — sub-tab container + MyChatPanel
- `canvas/src/components/tabs/chat/AgentCommsPanel.tsx` — new agent comms view
- `canvas/src/components/tabs/chat/message-parser.ts` — shared `extractRequestText()`
Claude Code Adapter: CLI Subprocess → Claude Agent SDK Migration
Replaced the claude-code runtime's subprocess-based `CLIAgentExecutor` with a new `ClaudeSDKExecutor` that uses the official `claude-agent-sdk` Python package. The SDK wraps the same Claude Code engine, so plugins, skills, and CLAUDE.md still work — but it eliminates subprocess fragility (stdout buffering, zombie processes, session-ID parsing, ~500ms startup overhead).
New files:
- `workspace/claude_sdk_executor.py` — `ClaudeSDKExecutor` with asyncio.Lock serialization, cooperative cancel, `QueryResult` dataclass, session resume via SDK
- `workspace/executor_helpers.py` — shared helpers extracted from `cli_executor.py`: memory recall/commit, delegation results, heartbeat, system prompt, error sanitization (`sanitize_agent_error` + `classify_subprocess_error`), markdown-aware `brief_summary`, `extract_message_text`
- `workspace/tests/test_claude_sdk_executor.py` — 30 tests including concurrency (timestamp-ordered), cancel (GeneratorExit via async generator), session resume, error sanitization
- `workspace/tests/test_executor_helpers.py` — 73 tests for all shared helpers
Modified files:
- `workspace/adapters/claude_code/adapter.py` — `create_executor()` returns `ClaudeSDKExecutor`; removed the `shutil.which` CLI check
- `workspace/adapters/claude_code/Dockerfile` — pre-installs the SDK via `pip install -r requirements.txt`
- `workspace/adapters/claude_code/requirements.txt` — added `claude-agent-sdk>=0.1.58`
- `workspace/cli_executor.py` — removed `claude-code` from `RUNTIME_PRESETS`, deleted all `self.runtime == "claude-code"` branches (JSON parsing, `--resume`, `--output-format json`, `_session_id`), calls shared helpers directly (no more one-line wrapper methods), uses `sys.executable` for the MCP server, regex word-boundary error classification
- `workspace/tests/conftest.py` — session-wide `claude_agent_sdk` stub for test imports
- `.gitignore` — `.initial_prompt_done`, `.coverage*`
Architecture decisions:
- `asyncio.Lock` on the SDK executor serializes concurrent turns (matches old CLI behavior, keeps session_id race-free)
- `ResultMessage.result` preferred over concatenated `AssistantMessage` chunks (avoids doubled pre/post-tool text)
- Error sanitization unified: `sanitize_agent_error(exc=..., category=...)` serves both SDK exceptions and CLI subprocess stderr
- `classify_subprocess_error()` uses regex word boundaries to avoid false positives (`\brate\b`, not `"rate" in`)
Coverage: 100% on claude_sdk_executor.py (110 stmts), cli_executor.py (179 stmts), executor_helpers.py (154 stmts). Total: 443 stmts, 0 misses.
Live verification: 12 workspaces restarted on new image. Echo, session resume, Bash tool, TodoWrite, PM→QA MCP delegation, and concurrent requests all verified. Rate-limited on quota (not a code bug).
Five iterative code-review passes caught and fixed: the `_active_stream` race, dead claude-code branches, duplicated A2A instructions, raw-stderr leaks, deprecated `typing.AsyncIterator`, the `_install_fake_sdk` teardown leak, inconsistent error patterns, missing encoding args, and 7 other issues across successive rounds.
Agent Quality Enforcement Stack
Built three layers of quality enforcement after observing that agents (same Claude Opus model) missed bugs like 'use client' directives because they lacked institutional memory and system-level enforcement.
Layer 1: Git pre-commit hook (`.githooks/pre-commit`)
- Rejects commits missing `'use client'` on hook-using `.tsx` files
- Rejects light theme colors in canvas components
- Rejects SQL injection patterns in Go (`fmt.Sprintf` with SQL)
- Rejects leaked secrets (`sk-ant-`, `ghp_`, `AKIA`)
- System-enforced — agents cannot bypass it
Layer 2: Molecule AI-dev plugin (`plugins/molecule-dev/`)
- `rules/codebase-conventions.md` — injected into every agent's CLAUDE.md with past bugs, patterns, self-check scripts
- `skills/review-loop/SKILL.md` — multi-round FE→QA→fix→re-verify workflow for Dev Lead
Layer 3: Awareness memory via initial_prompt
- Key conventions saved to `commit_memory` on first boot
- Agents recall them on every future task via the memory system
- Builds institutional knowledge across sessions
Also shipped:
- SDK executor retry logic (exponential backoff: 5s→10s→20s for rate limits)
- Force-remove in `provisioner.Stop()` to prevent restart-policy zombie containers
- All 12 agent system prompts rewritten from checklists to senior-engineer expectations
- Dev Lead prompt requires UIUX + Security involvement for UI/credential work
- Repo made public — removed GITHUB_TOKEN from initial_prompt
Cron Scheduling System (Phase 22)
New feature: users can set up recurring tasks that fire A2A messages to agents on a cron schedule.
Backend:
- `workspace-server/migrations/015_workspace_schedules.sql` — new table with cron_expr, timezone, prompt, enabled, last_run_at, next_run_at, run_count, last_status
- `workspace-server/internal/scheduler/scheduler.go` — goroutine polls every 30s, fires due schedules via `proxyA2ARequest` with a `system:scheduler` caller, WaitGroup for completion, semaphore (max 10 concurrent)
- `workspace-server/internal/handlers/schedules.go` — 6 REST endpoints: list, create, update (COALESCE-based), delete, run-now, history
- `robfig/cron/v3` for cron expression parsing + next-run computation
- `proxyA2ARequest` exposed as a public method for internal callers
- Dedicated `cron_run` activity log entries with schedule metadata for history queries
Frontend:
- `canvas/src/components/tabs/ScheduleTab.tsx` — CRUD UI with create/edit form, cron-to-English helper, status indicators, Run Now button, delete confirmation
- Wired into SidePanel as a new "Schedule" tab (⏲ icon)
Org template:
- `OrgSchedule` struct in `org.go`, inserted during org import
- Example: Security Auditor daily scan in `org-templates/molecule-dev/org.yaml`
E2E verified: Created every-minute schedule, scheduler fired at next minute boundary, agent received and responded, schedule updated with status=ok + run_count=1.
Volume Ownership: Root → Gosu Agent Pattern
Docker creates volume contents as root, but workspace containers run as UID 1000 (agent). This caused PermissionError when the adapter tried to write CLAUDE.md with plugin rules. Initially fixed with scattered chown hacks in the provisioner and plugin handler, then properly fixed with the standard Docker pattern:
- Dockerfile: installs `gosu`, removes `USER agent` (the entrypoint handles the privilege drop)
- `entrypoint.sh`: starts as root → `chown -R agent:agent /configs /workspace` → `exec gosu agent` → `python3 main.py`
- Removed all band-aid chown calls from the provisioner and plugin handler
- Verified: 12/12 containers, CLAUDE.md owned by `agent:agent`, plugin rules injected
Comprehensive Code Review — 13 Issues Fixed + Test Coverage
A two-pass code review across the entire repo identified 24 issues; all 13 critical and warning items were fixed:
Critical (8):
- `a2a_proxy.go`: add access control via `CanCommunicate` for agent-to-agent proxy requests (closing a security boundary). Canvas requests (no `X-Workspace-ID`), self-calls, and system callers (`webhook:*`, `system:*`, `test:*`) bypass via an explicit `isSystemCaller()` helper.
- `org.go`, `delegation.go`: replace `db.DB.Exec()` with `ExecContext` + error checks. Errors no longer silently dropped on inserts/updates.
- `activity.go`, `workspace.go`: add `rows.Err()` checks after iteration loops to catch DB iteration failures (was returning partial results).
- `ws/hub.go`: add `safeSend` with `recover` for the race between Broadcast and Unregister (defensive fix for a closed-channel send).
- `workspace.go`: improve the `canvas_layouts` insert error log (non-fatal).
- `ChatTab.tsx`, `AgentCommsPanel.tsx`: add WebSocket `onerror` handlers (orphaned connections on failure).
- `app/page.tsx`: log hydration errors instead of a silent catch.
- `cli_executor.py`: guarantee `proc.wait()` after `kill` on timeout to prevent zombie processes; bounded 5s wait timeout.
Warning (5):
- `a2a_proxy.go`: cap the `LogActivity` context with a 30s timeout (was `WithoutCancel` = unbounded lifetime).
- `activity.go`: log JSON marshal failures in `LogActivity` instead of silently corrupting activity logs with nil bodies.
- `org.go`: replace the 500ms `time.Sleep` with a `workspaceCreatePacingMs = 50` constant (an org of 12 was 6s+).
- `main.py`: stop the heartbeat if `adapter.setup()` raises (resource leak).
- `Canvas.tsx`: document the intentional `getState()` pattern in imperative event handlers.
Test coverage added:
- `a2a_proxy_test.go`: `mockCanCommunicate` helper + 4 access-control tests (denied, self-exempt, system caller, canvas) + table-driven `TestIsSystemCaller` (7 cases)
- `test_cli_executor.py`: 2 zombie-reap tests (verify `proc.wait()` called after `kill`; degraded path when `wait()` also times out)
Verification:
- Go: 6 packages, all tests pass
- Canvas Vitest: 344 tests pass
- Python pytest: 874 tests pass (was 872, +2 new)
- Playwright E2E: 13/13 pass (incl. 3 data-flow tests verifying real browser content)
- Comprehensive bash E2E: 68/68 pass
- Manual verification: 12-agent org deployed, initial prompts complete, chat shows messages