Second-pass cleanup after the monolith split. Addresses every issue
from the code-review pass.
Core additions in src/api.ts:
- toMcpResult(data) + toMcpText(text): single source of truth for the
MCP text-content envelope (was ~87 duplicated literals)
- ApiError type + isApiError(v) guard: typed discriminated-union for
the error-by-value pattern; replaces open-coded shape checks
- apiCall<T = unknown>: generic so callers can document expected
response shape without unchecked "as" casts
Bulk cleanups across all 12 tools/*.ts:
- Every handler now returns toMcpResult(data) or toMcpText(text)
- Open-coded "typeof obj === 'object' && 'error' in obj" in
remote_agents.ts replaced with isApiError(v)
- Extracted initialCanvasPosition() helper out of
handleCreateWorkspace; explains why random seeding exists
- Added runtime/workspace_dir/workspace_access to create_workspace
zod schema (previously accepted by handler but hidden from clients)
src/index.ts:
- Replaced "export * from" with explicit named re-exports so the
public surface is auditable and future name collisions fail loudly
Tests:
- createServer() smoke test that records every srv.tool(...) call and
asserts 87 registered tools unique by name. Catches future PRs that
forget to wire a registerXxxTools(srv).
Docs:
- Fix broken relative links in sdk/python/molecule_agent/README.md
(was ../../examples/ from inside sdk/python/, should be ../examples/)
- Update stale "61 tools" -> "87 tools" in CLAUDE.md + main() log
Verification:
- npm run build clean
- npx jest -> 97/97 passed (was 96; +1 smoke test)
- grep "content: [{ type: \"text\" as const" src/tools/ -> 0 matches
- No file over 216 lines
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
23 KiB
CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
Project Overview
Molecule AI is a platform for orchestrating AI agent workspaces that form an organizational hierarchy. Workspaces register with a central platform, communicate via A2A protocol, and are visualized on a drag-and-drop canvas.
Ecosystem Context
Before research, strategy, or design work, skim docs/ecosystem-watch.md —
it catalogs adjacent agent projects (Holaboss, Hermes, gstack, …) with
overlap / differentiation / terminology-collision notes. Cross-referenced
from PLAN.md and README.md; it's the canonical starting point for
"what else is out there."
Architecture
Canvas (Next.js :3000) ←WebSocket→ Platform (Go :8080) ←HTTP→ Postgres + Redis
↑
Workspace A ←──A2A──→ Workspace B
(Python agents)
↑ register/heartbeat ↑
└───── Platform ─────┘
Four main components:
- Platform (
platform/): Go/Gin control plane — workspace CRUD, registry, discovery, WebSocket hub, liveness monitoring - Canvas (
canvas/): Next.js 15 + React Flow (@xyflow/react v12) + Zustand + Tailwind — visual workspace graph - Workspace Runtime (
workspace-template/): Unified Docker image with pluggable adapter system — supports LangGraph, Claude Code, OpenClaw, DeepAgents, CrewAI, AutoGen. Adapters inworkspace-template/adapters/. Deps installed at startup viaentrypoint.sh. - molecli (
platform/cmd/cli/): Go TUI dashboard (Bubbletea + Lipgloss) — real-time workspace monitoring, event log, health overview, delete/filter operations
Build & Run Commands
Infrastructure
./infra/scripts/setup.sh # Start Postgres, Redis, Langfuse; run migrations
./infra/scripts/nuke.sh # Tear down everything, remove volumes
Platform (Go)
cd platform
go build ./cmd/server # Build server
go run ./cmd/server # Run server (requires Postgres + Redis running)
go build -o molecli ./cmd/cli # Build TUI dashboard
./molecli # Run TUI dashboard (requires platform running)
Must run from platform/ directory (not repo root). Env vars: DATABASE_URL, REDIS_URL, PORT, PLATFORM_URL (default http://host.docker.internal:PORT — passed to agent containers so they can reach the platform), SECRETS_ENCRYPTION_KEY (optional AES-256, 32 bytes), CONFIGS_DIR (auto-discovered), PLUGINS_DIR (deprecated — plugins are now installed per-workspace via API; the plugins/ registry at repo root is auto-discovered), ACTIVITY_RETENTION_DAYS (default 7), ACTIVITY_CLEANUP_INTERVAL_HOURS (default 6), CORS_ORIGINS (comma-separated, default http://localhost:3000,http://localhost:3001), RATE_LIMIT (requests/min, default 600), WORKSPACE_DIR (optional — global fallback host path for /workspace bind-mount; overridden by per-workspace workspace_dir column in DB; if neither is set, each workspace gets an isolated Docker named volume), AWARENESS_URL (optional — if set, injected into workspace containers along with a deterministic AWARENESS_NAMESPACE derived from workspace ID), MOLECULE_IN_DOCKER (optional — set to 1 when the platform itself runs inside Docker so the A2A proxy rewrites 127.0.0.1:<port> URLs to container hostnames; auto-detected via /.dockerenv).
Plugin install safeguards (bound the cost of a single POST /workspaces/:id/plugins install so a slow/malicious source can't tie up a handler):
PLUGIN_INSTALL_BODY_MAX_BYTES— max request body size (default65536= 64 KiB)PLUGIN_INSTALL_FETCH_TIMEOUT— duration string; whole fetch+copy deadline (default5m)PLUGIN_INSTALL_MAX_DIR_BYTES— max staged-tree size (default104857600= 100 MiB)
See docs/plugins/sources.md for the two-axis source/shape plugin model.
molecli reads MOLECLI_URL (default http://localhost:8080) to locate the platform. Logs are written to molecli.log in the working directory (already covered by *.log in .gitignore).
Canvas (Next.js)
cd canvas
npm install
npm run dev # Dev server on :3000
npm run build && npm start # Production
Env vars: NEXT_PUBLIC_PLATFORM_URL (default http://localhost:8080), NEXT_PUBLIC_WS_URL (default ws://localhost:8080/ws).
Workspace Images
bash workspace-template/build-all.sh # Build base + ALL runtime images
bash workspace-template/build-all.sh claude-code # Build base + specific runtime only
Each runtime has its own Docker image extending workspace-template:base, with deps pre-installed for fast startup. The base Dockerfile (workspace-template/Dockerfile) builds :base, then each adapters/*/Dockerfile extends it (e.g. claude_code/Dockerfile installs the claude CLI). Always use build-all.sh — it builds base first, then all runtimes in order. No :latest tag — each runtime uses its own tag to avoid confusion.
| Runtime | Image Tag | Key Deps |
|---|---|---|
| langgraph | workspace-template:langgraph |
langchain-anthropic, langgraph |
| claude-code | workspace-template:claude-code |
claude-agent-sdk (pip), @anthropic-ai/claude-code (npm) |
| openclaw | workspace-template:openclaw |
openclaw deps |
| crewai | workspace-template:crewai |
crewai |
| autogen | workspace-template:autogen |
autogen |
| deepagents | workspace-template:deepagents |
deepagents |
Templates are framework presets in workspace-configs-templates/: claude-code-default, langgraph, openclaw, deepagents. Agent roles are configured after deployment via Config tab or API.
For Claude Code runtime, write your OAuth token to workspace-configs-templates/claude-code-default/.auth-token.
Pre-commit Hook
git config core.hooksPath .githooks # Install hooks (agents do this via initial_prompt)
Enforces: 'use client' on hook-using .tsx files, dark theme (no white/light), no SQL injection (fmt.Sprintf with SQL), no leaked secrets (sk-ant-, ghp_, AKIA). Commit is rejected until violations are fixed — agents cannot bypass this.
Plugins
Shared plugins in plugins/ are auto-loaded by every workspace:
molecule-dev: Codebase conventions (rules injected into CLAUDE.md) +review-loopskill for multi-round QA cyclessuperpowers:verification-before-completion,test-driven-development,systematic-debugging,writing-plansecc: General Claude Code guardrails
Scripts
bash scripts/setup-default-org.sh # Create PM + 3 teams (Marketing/Research/Dev) via API
OPENAI_API_KEY=... bash scripts/test-a2a-cross-runtime.sh # E2E: Claude Code ↔ OpenClaw A2A test
OPENAI_API_KEY=... bash scripts/test-team-e2e.sh # E2E: Multi-template team + A2A
Unit Tests
cd platform && go test -race ./... # 487 Go tests (handlers, registry, provisioner, CLI, delegation, org, channels, wsauth — sqlmock + miniredis)
cd canvas && npm test # 352 Vitest tests (store, components, hydration, buildTree, secrets API, org template import)
cd workspace-template && python -m pytest -v # 1078 pytest tests (adds platform_auth token store for Phase 30.1)
cd sdk/python && python -m pytest -v # 87 SDK tests (agentskills.io spec validator, CLI, AgentskillsAdaptor round-trip, workspace/org/channel validators)
Integration Tests
bash tests/e2e/test_api.sh # 62 API tests against localhost:8080
bash tests/e2e/test_a2a_e2e.sh # 22 A2A end-to-end tests (requires 2 online agents)
bash tests/e2e/test_activity_e2e.sh # 25 activity/task E2E tests (requires 1 online agent)
bash tests/e2e/test_comprehensive_e2e.sh # 68 checks — ALL endpoints, memory, runtime, bundles, approvals
test_api.sh requires platform running. Tests full CRUD, registry, heartbeat, discovery, peers, access control, events, degraded/recovery lifecycle, activity logging, current task tracking, bundle round-trip (export → delete → import → verify).
test_a2a_e2e.sh requires platform + two provisioned agents (Echo Agent, SEO Agent) running with a valid OPENROUTER_API_KEY. Tests message/send, JSON-RPC wrapping, error handling, peer discovery, agent cards, heartbeat. Timeout configurable via A2A_TIMEOUT env var (default 120s).
test_activity_e2e.sh requires platform + one online agent. Tests A2A communication logging (request/response capture, duration, method), agent self-reported activity, type filtering, current task visibility via heartbeat, cross-workspace activity isolation, edge cases.
MCP Server
cd mcp-server
npm install && npm run build # Build MCP server
node dist/index.js # Run (stdio transport)
Exposes 87 tools for managing Molecule AI from Claude Code, Cursor, Codex, or any MCP client. Includes workspace CRUD, async delegation, plugins (install/uninstall/list), global secrets, pause/resume, org import, A2A chat, approvals, memory, files, config, discovery, bundles, templates, traces, activity logs, and social channels (add/update/remove/send/test). Configured in .mcp.json. Env: MOLECULE_URL (default http://localhost:8080).
CI Pipeline
GitHub Actions (.github/workflows/ci.yml) runs on push to main and PRs:
- platform-build: Go build, vet,
go test -racewith coverage profiling (25% baseline threshold) - canvas-build: npm build,
vitest run(no--passWithNoTests-- tests must exist and pass) - mcp-server-build: npm build
- python-lint:
pytest --cov=. --cov-report=term-missing(pytest-cov enabled)
Docker Compose
docker compose -f docker-compose.infra.yml up -d # Infra only
docker compose up # Full stack
Key Architectural Patterns
Import Cycle Prevention
The platform uses function injection to avoid Go import cycles between ws, registry, and events packages:
ws.NewHub(canCommunicate AccessChecker)— Hub acceptsregistry.CanCommunicateas a functionregistry.StartLivenessMonitor(ctx, onOffline OfflineHandler)— Liveness accepts broadcaster callbackregistry.StartHealthSweep(ctx, checker ContainerChecker, interval, onOffline)— Health sweep accepts Docker checker interface- Wiring happens in
platform/cmd/server/main.go— init order:wh → onWorkspaceOffline → liveness/healthSweep → router
Container Health Detection
Three layers detect dead containers (e.g. Docker Desktop crash):
- Passive (Redis TTL): 60s heartbeat key expires → liveness monitor → auto-restart
- Proactive (Health Sweep):
registry.StartHealthSweeppolls Docker API every 15s → catches dead containers faster - Reactive (A2A Proxy): On connection error, checks
provisioner.IsRunning()→ immediate offline + restart
All three call onWorkspaceOffline which broadcasts WORKSPACE_OFFLINE + go wh.RestartByID(). Redis cleanup uses shared db.ClearWorkspaceKeys().
Template Resolution (Create)
Runtime detection happens before DB insert: if payload.Runtime is empty and a template is specified, the handler reads runtime: from configsDir/template/config.yaml first. If still empty, defaults to "langgraph". This ensures the correct runtime (e.g. claude-code) is persisted in the DB and used for container image selection.
When a workspace specifies a template that doesn't exist, the Create handler falls back:
- Check
os.Stat(configsDir/template)— use if exists - Try
{runtime}-defaulttemplate (e.g.claude-code-default/) - Generate default config via
ensureDefaultConfig()(includes.auth-tokencopy for CLI runtimes)
Communication Rules (registry/access.go)
CanCommunicate(callerID, targetID) determines if two workspaces can talk:
- Same workspace → allowed
- Siblings (same parent_id) → allowed
- Root-level siblings (both parent_id IS NULL) → allowed
- Parent ↔ child → allowed
- Everything else → denied
The A2A proxy (POST /workspaces/:id/a2a) enforces this for agent-to-agent calls. Canvas requests (no X-Workspace-ID), self-calls, and system callers (webhook:*, system:*, test:* prefixes via isSystemCaller() in a2a_proxy.go) bypass the check.
JSONB Gotcha
When inserting Go []byte (from json.Marshal) into Postgres JSONB columns, you must:
- Convert to
string()first - Use
::jsonbcast in SQL
lib/pq treats []byte as bytea, not JSONB.
WebSocket Events Flow
- Action occurs (register, heartbeat, etc.)
broadcaster.RecordAndBroadcast()inserts intostructure_eventstable + publishes to Redis pub/sub- Redis subscriber relays to WebSocket hub
- Hub broadcasts to canvas clients (all events) and workspace clients (filtered by CanCommunicate)
Canvas State Management
- Initial load: HTTP fetch from
GET /workspaces→ Zustand hydrate - Real-time updates: WebSocket events →
applyEvent()in Zustand store - Position persistence:
onNodeDragStop→PATCH /workspaces/:idwith{x, y} - Embedded sub-workspaces:
nestNodesetshidden: !!targetIdon child nodes; children render as recursiveTeamMemberChipcomponents inside parent (up to 3 levels), not as separate canvas nodes. Usen.data.parentId(not React Flow'sn.parentId) for hierarchy lookups. - Chat: two sub-tabs — "My Chat" (user↔agent,
source=canvas) and "Agent Comms" (agent↔agent A2A traffic,source=agent). History loaded fromGET /activitywith source filter. Real-time viaA2A_RESPONSE+AGENT_MESSAGEWebSocket events. Conversation history (last 20 messages) sent viaparams.metadata.historyin A2Amessage/sendrequests. - Config save: "Save & Restart" writes config.yaml and auto-restarts the workspace. "Save" writes only (shows restart banner). Secrets POST/DELETE auto-restart on the platform side.
Initial Prompt
Agents can auto-execute a prompt on startup before any user interaction. Configure via initial_prompt (inline string) or initial_prompt_file (path relative to config dir) in config.yaml. After the A2A server is ready, main.py sends the prompt as a message/send to self. A .initial_prompt_done marker file prevents re-execution on restart. Org templates support initial_prompt on both defaults (all agents) and per-workspace (overrides default).
Important: Initial prompts must NOT send A2A messages (delegate_task, send_message_to_user) — other agents may not be ready. Keep them local: clone repo, read docs, save to memory, wait for tasks.
Workspace Lifecycle
provisioning → online (on register) → degraded (error_rate > 0.5) → online (recovered) → offline (Redis TTL expired OR health sweep detects dead container) → auto-restart → provisioning → ... → removed (deleted). Any state → paused (user pauses) → provisioning (user resumes). Paused workspaces skip health sweep, liveness monitor, and auto-restart.
Platform API Routes
| Method | Path | Handler |
|---|---|---|
| GET | /health | inline |
| GET | /metrics | metrics.Handler() — Prometheus text format (v0.0.4); no auth, scrape-safe |
| POST/GET/PATCH/DELETE | /workspaces[/:id] | workspace.go |
| GET/PATCH | /workspaces/:id/config | workspace.go |
| GET/POST | /workspaces/:id/memory | workspace.go |
| DELETE | /workspaces/:id/memory/:key | workspace.go |
| POST/PATCH/DELETE | /workspaces/:id/agent | agent.go |
| POST | /workspaces/:id/agent/move | agent.go |
| GET/POST/PUT | /workspaces/:id/secrets | secrets.go (POST/PUT auto-restarts workspace) |
| DELETE | /workspaces/:id/secrets/:key | secrets.go (DELETE auto-restarts workspace) |
| GET | /workspaces/:id/model | secrets.go |
| GET | /settings/secrets | secrets.go — list global secrets (keys only, values masked) |
| PUT/POST | /settings/secrets | secrets.go — set a global secret {key, value} |
| DELETE | /settings/secrets/:key | secrets.go — delete a global secret |
| GET/POST/DELETE | /admin/secrets[/:key] | secrets.go — legacy aliases for /settings/secrets |
| WS | /workspaces/:id/terminal | terminal.go |
| POST | /workspaces/:id/expand | team.go |
| POST | /workspaces/:id/collapse | team.go |
| POST/GET | /workspaces/:id/approvals | approvals.go |
| POST | /workspaces/:id/approvals/:id/decide | approvals.go |
| GET | /approvals/pending | approvals.go |
| POST/GET | /workspaces/:id/memories | memories.go |
| DELETE | /workspaces/:id/memories/:id | memories.go |
| GET | /workspaces/:id/traces | traces.go |
| GET/POST | /workspaces/:id/activity | activity.go |
| POST | /workspaces/:id/notify | activity.go (agent→user push message via WS) |
| POST | /workspaces/:id/restart | workspace.go |
| POST | /workspaces/:id/pause | workspace.go (stops container, status→paused) |
| POST | /workspaces/:id/resume | workspace.go (re-provisions paused workspace) |
| POST | /workspaces/:id/a2a | workspace.go |
| POST | /workspaces/:id/delegate | delegation.go (async fire-and-forget) |
| GET | /workspaces/:id/delegations | delegation.go (list delegation status) |
| GET/POST | /workspaces/:id/schedules | schedules.go (cron CRUD) |
| PATCH/DELETE | /workspaces/:id/schedules/:scheduleId | schedules.go |
| POST | /workspaces/:id/schedules/:scheduleId/run | schedules.go (manual trigger) |
| GET | /workspaces/:id/schedules/:scheduleId/history | schedules.go (past runs) |
| GET/POST | /workspaces/:id/channels | channels.go (social channel CRUD) |
| PATCH/DELETE | /workspaces/:id/channels/:channelId | channels.go |
| POST | /workspaces/:id/channels/:channelId/send | channels.go (outbound message) |
| POST | /workspaces/:id/channels/:channelId/test | channels.go (test connection) |
| GET | /channels/adapters | channels.go (list available platforms) |
| POST | /channels/discover | channels.go (auto-detect chats for a bot token) |
| POST | /webhooks/:type | channels.go (incoming social webhook) |
| GET | /workspaces/:id/shared-context | templates.go |
| GET/PUT/DELETE | /workspaces/:id/files[/*path] | templates.go |
| GET/PUT | /canvas/viewport | viewport.go |
| GET | /templates | templates.go |
| POST | /templates/import | templates.go |
| POST | /registry/register | registry.go |
| POST | /registry/heartbeat | registry.go |
| POST | /registry/update-card | registry.go |
| GET | /registry/discover/:id | discovery.go |
| GET | /registry/:id/peers | discovery.go |
| POST | /registry/check-access | discovery.go |
| GET | /plugins | plugins.go (list registry; supports ?runtime= filter) |
| GET | /plugins/sources | plugins.go (list registered install-source schemes) |
| GET/POST/DELETE | /workspaces/:id/plugins[/:name] | plugins.go — list, install ({"source":"scheme://spec"}), uninstall per-workspace |
| GET | /workspaces/:id/plugins/available | plugins.go (filtered by workspace runtime) |
| GET | /workspaces/:id/plugins/compatibility?runtime=X | plugins.go (preflight runtime-change check) |
| GET | /bundles/export/:id | bundle.go |
| POST | /bundles/import | bundle.go |
| GET | /org/templates | org.go (list available org templates) |
| POST | /org/import | org.go (import entire org hierarchy from YAML) |
| GET | /ws | socket.go |
Database
16 migration files in platform/migrations/. Key tables: workspaces (core entity with status, runtime, agent_card JSONB, heartbeat columns, current_task, awareness_namespace, workspace_dir), canvas_layouts (x/y position), structure_events (append-only event log), activity_logs (A2A communications, task updates, agent logs, errors), workspace_schedules (cron tasks with expression, timezone, prompt, run history), workspace_channels (social channel integrations — Telegram, Slack, etc., with JSONB config and allowlist), agents, workspace_secrets, global_secrets, agent_memories (HMA scoped memory), approvals.
The platform auto-discovers and runs migrations on startup from several candidate paths.
Project Memory (Awareness MCP)
IMPORTANT: These instructions override default behavior. You must follow them exactly.
Awareness Memory Integration (MANDATORY)
awareness_* = cross-session persistent memory (past decisions, knowledge, tasks). Other tools = current codebase navigation (file search, code index). Use BOTH - they serve different purposes.
STEP 1 - SESSION START: Call awareness_init(source="claude-code") -> get session_id, review context. If active_skills[] is returned: skill = reusable procedure done 2+ times; summary = injectable instruction, methods = steps. Apply matching skills to tasks.
STEP 2 - RECALL BEFORE WORK (progressive disclosure):
- awareness_recall(semantic_query=..., keyword_query=..., detail='summary') → lightweight index.
- Review summaries/scores, pick relevant IDs.
- awareness_recall(detail='full', ids=[...]) → expand only what you need.
STEP 3 - RECORD EVERY CHANGE: After EVERY code edit, decision, or bug fix: awareness_record(content=, insights={knowledge_cards:[...], action_items:[...], risks:[...]}) Content should be RICH and DETAILED — include reasoning, key code snippets, user quotes, alternatives considered, and files changed. Do NOT compress into a single-line summary. The content IS the memory — more detail = better recall. Include insights to create searchable knowledge in ONE step (recommended). Skipping = permanent data loss.
STEP 4 - CATEGORY GUIDE (for insights.knowledge_cards):
- decision = choice made between alternatives.
- problem_solution = bug/problem plus the fix that resolved it.
- workflow = process, setup, or configuration steps only.
- pitfall = blocker, warning, or limitation without a fix yet.
- insight = reusable pattern or general learning.
- skill = reusable procedure done 2+ times; summary = injectable instruction, methods = steps.
- key_point = important technical fact when nothing else fits. Never default everything to workflow.
STEP 5 - SESSION END: awareness_record(content=[step1, step2, ...], insights={...}) with final summary.
BACKFILL (if applicable): If MCP connected late: awareness_record(content=)
RULES VERSION: Pass rules_version="2" to awareness_init so the server knows you have these rules. If the server returns _setup_action, the rules have been updated — follow the instruction to re-sync.
NOTE: memory_id from X-Awareness-Memory-Id header. source/actor/event_type auto-inferred.
Compliance Check
Before responding to ANY user request:
-
Have you called awareness_init yet this session? If not, call it NOW.
-
Did you just edit a file? Call awareness_record(content=, insights={...}) IMMEDIATELY.
-
Is the user asking about past work? Call awareness_recall FIRST.