molecule-ai/molecule-core

hongming be9190e57a

docs: #1753 sweep awareness namespace mentions across narrative docs (#1758 )

CTO-bypass merge 2026-05-24: CI/all-required green at 05:00:25Z, persona acks in place.

2026-05-24 05:20:43 +00:00

Molecule AI — Comprehensive Technical Documentation

Definitive technical reference for the Molecule AI Agent Team platform. Based on a full non-invasive scan of the molecule-monorepo repository.

1. Executive Summary
2. Product Positioning
3. System Architecture
4. Database Schema
5. Workspace Lifecycle
6. Communication Rules
7. Platform API Routes
8. A2A Protocol
9. Hierarchical Memory Architecture
10. Runtime Tier System
11. Provisioning & Container Lifecycle
12. Workspace Runtime
13. Skills System
14. Bundle System
15. Canvas UI
16. Tools & Capabilities
17. Coordinator Pattern
18. Codebase Structure
19. Key Design Patterns
20. Docker Compose Orchestration
21. Environment Variables
22. Recent Feature Highlights
23. Known Gaps & Backlog
24. Licensing & Commercialization Path
25. OSS Growth Research
26. Technical Debt & Constraints
27. Production Deployment
28. MCP Server & Integrations
29. Summary Statistics
30. Vision: From Agent Teams to Robot Teams

1. Executive Summary

Molecule AI is an org-native control plane for heterogeneous AI agent teams. It is not a workflow builder or a replacement for agent frameworks; rather, it is the operational and organizational layer that sits above multiple runtime frameworks and provides:

Workspace-as-role abstraction (not task nodes)
Hierarchy-driven topology (org chart = communication paths)
Hierarchical Memory Architecture (HMA) with LOCAL/TEAM/GLOBAL scopes
A2A (Agent-to-Agent) direct inter-workspace communication via JSON-RPC 2.0
Canvas-based visual team building with drag-to-nest hierarchy
Comprehensive control plane operations — registry, heartbeats, lifecycle, approvals, secrets, traces, bundles

Six runtime adapters ship production-ready on main: LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, OpenClaw.

2. Product Positioning

Core Narrative

One-liner: "Molecule AI is the org-native control plane for heterogeneous AI agent teams."

Five Key Differentiators

#	Principle	Implication
1	Workspace = role, not task	The underlying model can be swapped, but organizational identity persists across model and framework changes
2	Org chart = topology	Hierarchy determines communication boundaries — no manual edge wiring needed
3	Heterogeneous runtime support	6 adapters shipped; teams choose freely without forced standardization
4	Memory follows org boundaries	HMA prevents over-sharing, aligns data isolation with organizational structure
5	Skill evolution loop	memory → signal → skill → hot-reload → operational improvement (self-improving flywheel)

What Molecule AI Is NOT

Category	Examples	How Molecule AI Differs
Workflow builders	n8n, Windmill, Temporal	Molecule AI models roles, not tasks
Agent frameworks	LangGraph, CrewAI, AutoGen	Molecule AI sits above frameworks as runtime adapters
Coding agents	Claude Code, Cursor, Codex	Molecule AI runs coding agents as workspace roles alongside other types
Chat UIs	ChatGPT, Claude.ai	Molecule AI is operational infrastructure, not a conversation interface

3. System Architecture

System Boundary Diagram

┌─────────────────────────────────────────────────────────────┐
│ Canvas (Next.js 15 · port 3000)                             │
│ React Flow + Zustand + WebSocket                            │
│ Visual drag-to-nest org chart · 10-tab ops panel            │
└──────────────────┬──────────────────────────────────────────┘
                   │ HTTP + WebSocket
┌──────────────────▼──────────────────────────────────────────┐
│ Platform (Go / Gin · port 8080)                             │
│ Control plane: workspace CRUD, registry, discovery,         │
│ A2A proxy, activity, memory APIs, secrets, approvals        │
└─────────┬────────────────────────────────────┬──────────────┘
          │                                    │
    Postgres 16                             Redis 7
    (internal: 5432)                        (internal: 6379)

┌─────────────────────────────────────────────────────────────┐
│ Workspace Runtime (Python 3.11+ Docker image)               │
│ Pluggable adapters: LangGraph, DeepAgents, Claude Code,     │
│ CrewAI, AutoGen, OpenClaw                                   │
│ A2A protocol server · heartbeat · skills · HMA memory       │
└─────────────────────────────────────────────────────────────┘
          │
┌─────────────────────────────────────────────────────────────┐
│ Langfuse (self-hosted · ClickHouse + Postgres backend)      │
│ OpenTelemetry traces for every LLM call                     │
└─────────────────────────────────────────────────────────────┘

Network Model

Path	Protocol	Purpose
Canvas ↔ Platform	HTTP REST + WebSocket	UI operations + real-time event fanout
Platform ↔ Postgres	TCP	Source of truth for all durable state
Platform ↔ Redis	TCP	Ephemeral state (liveness TTL), caching, pub/sub
Workspace ↔ Workspace	HTTP (A2A JSON-RPC 2.0)	Direct peer-to-peer, platform not in data path
Workspace → Langfuse	HTTP	Automatic OpenTelemetry tracing
Docker Network	`molecule-core-net`	Internal-only by default, no exposed DB/Redis ports

Core Components

1. Canvas (Next.js 15)

React Flow for visual workspace graph
Zustand for state management
WebSocket for real-time updates
10-tab side panel: Chat, Activity, Details, Skills, Terminal, Config, Files, Memory, Traces, Events
Drag-to-nest team building
Bundle import/export via drag-and-drop
Empty state with template palette + onboarding wizard

2. Platform (Go 1.25+ / Gin)

Gin-based REST API + WebSocket hub
Workspace lifecycle management (CRUD + pause/resume/restart)
Registry and heartbeat system (30s default)
Hierarchy-aware access control (CanCommunicate())
A2A proxy for browser-safe inter-workspace communication
Event broadcasting (Redis pub/sub → WebSocket fanout)
Docker provisioner with T1–T4 tier enforcement
Activity logging with configurable retention (default 7 days)
Secrets management (AES-256-GCM encryption)
File, terminal, bundle, template, traces APIs
Langfuse integration
Prometheus metrics endpoint

3. Workspace Runtime (Python 3.11+)

Unified workspace/ Docker image
Adapter-driven execution (6 adapters)
A2A server via Uvicorn
Heartbeat loop (30s default)
Skill hot-reload system (~3 second propagation)
Memory tools with HMA scope support
Approval/human-in-the-loop integration
Activity reporting
Awareness namespace integration (optional)
Plugin-mounted shared rules and skills

4. Infrastructure

Postgres 16: Source of truth (workspaces, events, activity, secrets, memories)
Redis 7: Ephemeral state (liveness TTL 60s), URL caching, pub/sub
Langfuse 2.x: LLM tracing and observability (self-hosted, ClickHouse backend)
Docker: Workspace provisioning with T1–T4 tier system
LiteLLM proxy (optional): Unified API for multiple model providers
Ollama (optional): Local LLM models

4. Database Schema

11 migration files in workspace-server/migrations/.

Core Tables

Table	Purpose	Key Columns
`workspaces`	Current state registry	`id`, `name`, `role`, `tier` (1-4), `status`, `parent_id`, `agent_card` (JSONB), `url`, `forwarded_to`, `last_heartbeat_at`, `last_error_rate`, `active_tasks`, `uptime_seconds`, `current_task`, `runtime`
`agents`	Agent assignment history	`workspace_id`, `model`, `status`, `removed_at`, `removal_reason`
`workspace_secrets`	Encrypted credentials	`workspace_id`, `key`, `encrypted_value` (BYTEA, AES-256-GCM)
`agent_memories`	HMA-scoped memory	`workspace_id`, `content`, `scope` (LOCAL/TEAM/GLOBAL)
`structure_events`	Immutable event log (APPEND-ONLY, never UPDATE/DELETE)	`event_type`, `workspace_id`, `agent_id`, `target_id`, `payload` (JSONB)
`activity_logs`	Operational activity with retention	`workspace_id`, `activity_type`, `source_id`, `target_id`, `method`, `request_body`, `response_body`, `duration_ms`, `status`, `error_detail`
`canvas_layouts`	Node visual positions	`workspace_id`, `x`, `y`, `collapsed`
`canvas_viewport`	Canvas pan/zoom state	Single row, upserted

Redis Key Patterns

Key Pattern	Value	TTL	Purpose
`ws:{id}`	`"online"`	60s	Liveness detection (heartbeat refreshes)
`ws:{id}:url`	Host-mapped URL	5min	URL cache for external discovery
`ws:{id}:internal_url`	Docker-internal URL	—	Container-to-container discovery
`events:broadcast`	pub/sub channel	—	Event fanout to WebSocket hub

5. Workspace Lifecycle

State Machine

provisioning → online ↔ degraded
   ↓             ↓         ↓
 failed       offline    offline
   ↓             ↓
 retry      (auto-restart)

↓ (any state)
paused → (user resumes) → provisioning

↓ (any state)
removed

Status Definitions

Status	Meaning	Canvas Indicator
`provisioning`	Waiting for first heartbeat	Spinner
`online`	Heartbeat received, reachable	Green dot
`degraded`	Online but `error_rate ≥ 0.5`	Yellow node with warning
`offline`	Heartbeat TTL expired, unreachable	Gray node
`paused`	User paused, container stopped, config preserved	Indigo badge
`failed`	Provisioning timeout or launch error	Red node + retry button
`removed`	Deleted, kept for event log history	Node removed from Canvas

Health Detection (Three Layers)

Layer	Mechanism	Interval	Trigger
Passive	Redis TTL expiry	60s heartbeat key	Liveness monitor callback
Proactive	Docker API poll	Every 15s	Health sweep goroutine
Reactive	A2A proxy connection error	On-demand	`provisioner.IsRunning()` check

All three layers call onWorkspaceOffline(), which broadcasts WORKSPACE_OFFLINE and triggers an auto-restart.
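
The status definitions and the passive liveness layer can be sketched as a pure function. This is a minimal illustration, not the platform's Go implementation: `derive_status`, its parameters, and the constant names are hypothetical. Only the 60s TTL and the 0.5 error-rate threshold come from the tables above.

```python
from typing import Optional

LIVENESS_TTL_SECONDS = 60   # TTL of the Redis `ws:{id}` liveness key
DEGRADED_ERROR_RATE = 0.5   # threshold from the status table


def derive_status(seconds_since_heartbeat: Optional[float], error_rate: float) -> str:
    """Map heartbeat age and error rate to a status (hypothetical helper)."""
    if seconds_since_heartbeat is None:
        return "provisioning"   # still waiting for the first heartbeat
    if seconds_since_heartbeat >= LIVENESS_TTL_SECONDS:
        return "offline"        # the liveness key would have expired
    if error_rate >= DEGRADED_ERROR_RATE:
        return "degraded"       # reachable, but failing too often
    return "online"
```

With a 30s heartbeat and a 60s TTL, a workspace in this sketch can miss one heartbeat before it is treated as offline.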

Cascade Behavior

Pause: Pausing a parent cascades to all children. Children of a paused parent cannot be individually resumed.
Delete: Removes container, cleans memory (DB rows, Redis keys). Structure events and Agent Card history are never deleted.
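
The pause cascade can be sketched as a breadth-first walk over the parent/child tree. This is an assumption-laden illustration: `descendants`, `pause_with_cascade`, and the in-memory `parent_of` / `status` dictionaries are hypothetical stand-ins for the `workspaces` table, and the hierarchy is assumed to be a tree with no cycles.

```python
from collections import defaultdict, deque
from typing import Dict, List, Optional


def descendants(parent_of: Dict[str, Optional[str]], root: str) -> List[str]:
    """Return every workspace below `root`, breadth-first (hypothetical helper)."""
    children = defaultdict(list)
    for ws_id, parent_id in parent_of.items():
        if parent_id is not None:
            children[parent_id].append(ws_id)
    found: List[str] = []
    queue = deque([root])
    while queue:
        for child in children[queue.popleft()]:
            found.append(child)
            queue.append(child)
    return found


def pause_with_cascade(parent_of: Dict[str, Optional[str]],
                       status: Dict[str, str], root: str) -> List[str]:
    """Mark `root` and all of its descendants as paused; return the affected IDs."""
    affected = [root] + descendants(parent_of, root)
    for ws_id in affected:
        status[ws_id] = "paused"
    return affected
```

Workspaces outside the paused subtree keep their current status.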

6. Communication Rules

Hierarchy = Topology

The CanCommunicate() function is the single source of truth for all access control.

Direction	Allowed	Example
Sibling ↔ Sibling	YES	Marketing Agent ↔ Developer PM
Parent → Child	YES	Developer PM → Frontend Agent
Child → Parent	YES	Frontend Agent → Developer PM
Skip levels	NO	Frontend Agent → Business Core (403)
Cross-team	NO	Frontend Agent → Operations Team (403)

Access Check Algorithm

IF caller.parent_id == target.parent_id → ALLOW (siblings)
ELIF both parent_id IS NULL → ALLOW (root-level siblings)
ELIF caller.ID == target.parent_id → ALLOW (parent→child)
ELIF target.ID == caller.parent_id → ALLOW (child→parent)
ELSE → DENY (403 Forbidden)

This same logic governs: A2A delegation, memory scope enforcement, activity visibility, approval routing, and WebSocket event fanout.
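
The access check can be sketched in Python as follows. This mirrors the five rules listed above and is not the Go `CanCommunicate()` source; the `Workspace` dataclass and the function name are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Workspace:
    id: str
    parent_id: Optional[str] = None


def can_communicate(caller: Workspace, target: Workspace) -> bool:
    """Return True for ALLOW and False for DENY (403), following the rules above."""
    if caller.parent_id is not None and caller.parent_id == target.parent_id:
        return True   # siblings under the same parent
    if caller.parent_id is None and target.parent_id is None:
        return True   # root-level siblings
    if caller.id == target.parent_id:
        return True   # parent -> child
    if target.id == caller.parent_id:
        return True   # child -> parent
    return False      # skip-level and cross-team calls are denied
```

Using the example names from the table: a Frontend Agent under Developer PM can reach Developer PM, but it cannot reach Business Core (skip level) or a workspace inside the Operations Team (cross-team).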

7. Platform API Routes

Workspace Lifecycle (8 endpoints)

Method	Endpoint	Purpose
`POST`	`/workspaces`	Create and provision new workspace
`GET`	`/workspaces`	List all with canvas layout data
`GET`	`/workspaces/:id`	Get single workspace
`PATCH`	`/workspaces/:id`	Update name, role, tier, runtime, parent
`DELETE`	`/workspaces/:id`	Remove workspace
`POST`	`/workspaces/:id/restart`	Restart container
`POST`	`/workspaces/:id/pause`	Pause with cascade to children
`POST`	`/workspaces/:id/resume`	Resume from pause

Registry & Discovery (5 endpoints)

Method	Endpoint	Purpose
`POST`	`/registry/register`	Workspace self-registration on startup
`POST`	`/registry/heartbeat`	Liveness + current task + error rate (30s interval)
`POST`	`/registry/update-card`	Push Agent Card updates (skills, capabilities)
`GET`	`/registry/discover/:id`	Resolve workspace URL (hierarchy-validated)
`GET`	`/registry/:id/peers`	List reachable peers for workspace

Memory (6 endpoints)

Method	Endpoint	Purpose
`POST`	`/workspaces/:id/memories`	Commit HMA-scoped memory (LOCAL/TEAM/GLOBAL)
`GET`	`/workspaces/:id/memories`	Search scoped memories with query
`DELETE`	`/workspaces/:id/memories/:memoryId`	Delete specific memory entry
`GET`	`/workspaces/:id/memory`	List key/value workspace memory
`POST`	`/workspaces/:id/memory`	Upsert key/value pair (optional TTL)
`DELETE`	`/workspaces/:id/memory/:key`	Delete key/value entry

Secrets & Config (5 endpoints)

Method	Endpoint	Purpose
`GET`	`/workspaces/:id/secrets`	Get merged workspace + global secrets
`POST`	`/workspaces/:id/secrets`	Upsert workspace secret (triggers auto-restart)
`DELETE`	`/workspaces/:id/secrets/:key`	Delete secret (triggers auto-restart)
`GET`	`/workspaces/:id/config`	Get workspace config.yaml
`PATCH`	`/workspaces/:id/config`	Update workspace config

A2A & Activity (5 endpoints)

Method	Endpoint	Purpose
`POST`	`/workspaces/:id/a2a`	Proxy A2A request to target workspace
`GET`	`/workspaces/:id/activity`	List activity rows (filterable)
`POST`	`/workspaces/:id/activity`	Report activity from workspace
`POST`	`/workspaces/:id/notify`	Emit user-facing notification
`GET`	`/workspaces/:id/session-search`	Search recent activity + memory for contextual recall

Team & Hierarchy (2 endpoints)

Method	Endpoint	Purpose

Files, Terminal, Templates, Bundles (8 endpoints)

Method	Endpoint	Purpose
`GET`	`/workspaces/:id/files[/*path]`	List or read files
`PUT`	`/workspaces/:id/files/*path`	Write file
`DELETE`	`/workspaces/:id/files/*path`	Delete file
`WS`	`/workspaces/:id/terminal`	WebSocket terminal session into container
`GET`	`/bundles/export/:id`	Export workspace as `.bundle.json`
`POST`	`/bundles/import`	Import workspace bundle (recursive)
`GET`	`/templates`	List available workspace templates
`POST`	`/templates/import`	Import custom template folder

Observability & Real-Time (5 endpoints)

Method	Endpoint	Purpose
`GET`	`/health`	Health check
`GET`	`/metrics`	Prometheus metrics (v0.0.4 format)
`GET`	`/workspaces/:id/traces`	Langfuse trace links
`GET`	`/events[/:workspaceId]`	Event stream (SSE)
`WS`	`/ws`	WebSocket hub for real-time event fanout

8. A2A Protocol

Message Format (JSON-RPC 2.0 over HTTP)

{
  "jsonrpc": "2.0",
  "id": "task-123",
  "method": "message/send",
  "params": {
    "message": {
      "role": "user",
      "parts": [{"kind": "text", "text": "Build the login feature"}],
      "messageId": "msg-456"
    }
  }
}
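
A request of this shape can be assembled with the standard library alone. The helper name `build_message_send` and the generated `msg-` prefix are assumptions made for illustration; only the field layout comes from the example above.

```python
import json
import uuid
from typing import Optional


def build_message_send(text: str, request_id: str,
                       message_id: Optional[str] = None) -> dict:
    """Build a `message/send` JSON-RPC 2.0 request shaped like the example above."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "message/send",
        "params": {
            "message": {
                "role": "user",
                "parts": [{"kind": "text", "text": text}],
                # Fall back to a generated ID when the caller does not supply one.
                "messageId": message_id or f"msg-{uuid.uuid4().hex}",
            }
        },
    }


payload = json.dumps(build_message_send("Build the login feature", "task-123", "msg-456"))
```

The resulting `payload` string is what a caller would send as the HTTP request body.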

Two Call Modes

Mode	Method	Use Case
Synchronous	`message/send`	Short tasks, immediate response
Streaming	`message/sendSubscribe`	Long tasks, SSE progress updates

Discovery Flow

1. The caller queries GET /registry/discover/:targetId with the X-Workspace-ID header
2. The platform validates CanCommunicate(caller, target) and returns 403 if denied
3. The platform returns the Docker-internal URL (workspace caller) or the host-mapped URL (Canvas/external caller)
4. The caller sends the A2A message directly to the target (peer-to-peer)
5. The target processes the task and responds

Task State Machine

submitted → working → completed
                   → failed
                   → canceled
         → input-required → working (caller provides input)
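
The task lifecycle can be expressed as a transition table. This sketch assumes that `input-required` is entered from `working`, because the diagram's alignment leaves that ambiguous. The names `TRANSITIONS` and `advance` are hypothetical.

```python
# Allowed next states for each task state; terminal states map to an empty set.
TRANSITIONS = {
    "submitted": {"working"},
    "working": {"completed", "failed", "canceled", "input-required"},
    "input-required": {"working"},   # the caller provides input and work resumes
    "completed": set(),
    "failed": set(),
    "canceled": set(),
}


def advance(current: str, new: str) -> str:
    """Return `new` if the transition is allowed, otherwise raise ValueError."""
    if new not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal task transition: {current} -> {new}")
    return new
```

A validator like this rejects attempts to reopen a task that has already reached a terminal state.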

Authentication

MVP (current): Discovery-time validation only. Direct A2A calls are unauthenticated (acceptable for self-hosted Docker network isolation).
Post-MVP: Platform-issued short-lived signed tokens scoped to caller/target pair.

9. Hierarchical Memory Architecture

Three Scopes

Scope	Visibility	Write Access	Use Case
LOCAL	This workspace only	Self	Private scratch facts, reasoning, working state
TEAM	Parent + children + siblings	Self	Handoffs, coordination, team-level knowledge
GLOBAL	Readable by all workspaces	Root only	Org-wide policies, standards, institutional knowledge
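
The scope table can be read as a visibility predicate. The sketch below is an interpretation, not the platform's enforcement code. `can_read_memory`, `can_write_global`, and the `parent_of` mapping are hypothetical. Root-level workspaces are assumed to count as siblings, matching the access check in Section 6.

```python
from typing import Dict, Optional


def can_read_memory(parent_of: Dict[str, Optional[str]],
                    reader: str, owner: str, scope: str) -> bool:
    """Decide whether `reader` may see a memory written by `owner` in `scope`."""
    if scope == "GLOBAL":
        return True                            # readable by all workspaces
    if reader == owner:
        return True                            # a writer always sees its own memory
    if scope == "TEAM":
        reader_parent = parent_of[reader]
        owner_parent = parent_of[owner]
        return (
            reader_parent == owner             # reader is a child of the owner
            or owner_parent == reader          # reader is the owner's parent
            or reader_parent == owner_parent   # siblings (including root-level)
        )
    return False                               # LOCAL: owner only


def can_write_global(parent_of: Dict[str, Optional[str]], writer: str) -> bool:
    """GLOBAL writes are 'Root only': allowed when the writer has no parent."""
    return parent_of[writer] is None
```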

Four Memory Surfaces

Surface	Storage	Endpoint	Purpose
Memory v2 plugin (SSOT)	`memory_plugin.memory_records` table via RFC #2728 HTTP plugin	`POST /workspaces/:id/v2/memories`, MCP tools `commit_memory` / `commit_memory_v2` / `commit_summary`	Production memory backend — agent reads/writes route through here exclusively
Key/value workspace memory	`workspace_memory` table	`POST /workspaces/:id/memory`	Simple structured state, UI-visible, optional TTL — separate from agent memory
Activity recall	`activity_logs` + `agent_memories` (legacy read-only)	`GET /workspaces/:id/session-search`	"What just happened?" contextual recall
Legacy `agent_memories`	`agent_memories` table	`POST /workspaces/:id/memories` (REST)	Frozen post-A1 — kept only for the REST canvas-side path; the workspace-create `seedInitialMemories` writer routes through the v2 plugin once #1755 (PR #1759) lands. Scheduled for drop in Phase A3 (#1733).

Memory → Skill Compounding Flywheel

Task execution
  → Durable insight captured in LOCAL/TEAM memory
  → Repeated success patterns detected (repetition signal)
  → Memory row promoted → SKILL.md package created
  → Hot-reload (~3 seconds) → skill injected into live runtime
  → Agent Card updated → broadcast to peers via WebSocket
  → Future tasks use promoted skill → faster + more reliable
  → Organization becomes more capable over time

Key property: promotion events are visible in the activity logs, and promoted skills are inspectable in the Canvas Skills tab. This is not hidden prompt inflation.

10. Runtime Tier System

Tier	Name	Container Flags	Use Case
T1	Sandboxed	Read-only rootfs, tmpfs /tmp, 512 MiB, no `/workspace` mount	Untrusted code, text-only analysis
T2	Standard (default)	Read-write, 512 MiB, 1 CPU, `/workspace` mount	Most agent workloads
T3	Privileged	`--privileged`, `--pid=host`, Docker network access	Internal tooling, elevated operations
T4	Full Access	T3 + `--network=host` + Docker socket mount	System-level orchestration, DevOps

Unknown tier values default to T2 for safety. Tier flags are applied via provisioner.ApplyTierConfig() during container creation.
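
The fallback rule can be illustrated with a small normalizer. This is a sketch of the documented behavior, not a port of `provisioner.ApplyTierConfig()`; `normalize_tier`, `DEFAULT_TIER`, and `TIER_NAMES` are hypothetical names.

```python
DEFAULT_TIER = 2   # T2 "Standard" is the documented fallback

TIER_NAMES = {1: "Sandboxed", 2: "Standard", 3: "Privileged", 4: "Full Access"}


def normalize_tier(value: object) -> int:
    """Coerce a raw tier value to 1-4, falling back to T2 for anything unknown."""
    try:
        tier = int(value)
    except (TypeError, ValueError):
        return DEFAULT_TIER        # not a number at all
    return tier if tier in TIER_NAMES else DEFAULT_TIER
```

Falling back to T2 rather than T3 or T4 means a malformed value never grants privileged or host-level access.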

11. Provisioning & Container Lifecycle

Docker Networking

All containers join molecule-core-net private network
Container naming: ws-{workspace_id[:12]}
Ephemeral host port binding: 127.0.0.1:0→8000/tcp

URL Resolution

Caller	URL Type	Example
Workspace (container)	Docker-internal	`http://ws-{id}:8000`
Canvas (browser)	Host-mapped	`http://127.0.0.1:{ephemeral_port}`
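
The naming and URL rules above can be sketched as two small helpers. The function names and the `caller_is_workspace` flag are hypothetical. The `ws-` prefix, the 12-character ID truncation, and port 8000 come from this section.

```python
A2A_PORT = 8000   # container-side port from the 127.0.0.1:0→8000/tcp binding


def container_name(workspace_id: str) -> str:
    """Container naming rule: ws-{workspace_id[:12]}."""
    return f"ws-{workspace_id[:12]}"


def resolve_url(workspace_id: str, caller_is_workspace: bool, ephemeral_port: int) -> str:
    """Pick the Docker-internal URL for containers, the host-mapped URL otherwise."""
    if caller_is_workspace:
        return f"http://{container_name(workspace_id)}:{A2A_PORT}"
    return f"http://127.0.0.1:{ephemeral_port}"
```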

Container Cleanup on Delete

Docker container stopped and removed
Memory cleaned (DB rows, Redis keys)
Status set to removed
WORKSPACE_REMOVED event written to structure_events
Structure events and Agent Card history never deleted (audit trail)

12. Workspace Runtime

Entry Point: `workspace/main.py`

Startup Sequence (10 steps):

1. Initialize telemetry (OpenTelemetry, no-op if packages absent)
2. Load config.yaml into WorkspaceConfig dataclass
3. Run preflight validation (model availability, skills, configs)
4. Create HeartbeatLoop for background task tracking
5. Resolve adapter from runtime field in config
6. Run adapter setup() and create_executor()
7. Build Agent Card from loaded skills + runtime capabilities
8. Register: POST /registry/register with workspace ID + Agent Card
9. Start heartbeat loop (30s interval) + skill hot-reload watcher
10. Serve A2A over Uvicorn on configured port

Runtime Configuration Schema (`config.yaml`)

name: "Workspace Name"
description: ""
version: "1.0.0"
tier: 2                                    # 1=sandboxed, 2=standard, 3=privileged, 4=full-host
model: "anthropic:claude-sonnet-4-6"       # provider:model syntax
runtime: "langgraph"                       # langgraph | deepagents | claude-code | crewai | autogen | openclaw
runtime_config:                            # Runtime-specific settings
  command: "claude"                        # For CLI runtimes
  args: []
  auth_token_file: ".auth-token"
  timeout: 0
  model: ""                                # Override model just for this runtime
skills: ["skill1", "skill2"]               # Folder names under skills/
tools: ["web_search", "filesystem"]        # Built-in tool names
prompt_files: ["system-prompt.md"]         # Additional prompt text files
# `shared_context` was removed; team-shared knowledge now lives in memory v2's
# team:<id> namespace (recall_memory MCP tool). See RFC #2789 for shared files.

a2a:
  port: 8000
  streaming: true
  push_notifications: true

delegation:
  retry_attempts: 3
  retry_delay: 5.0
  timeout: 120.0
  escalate: true

sandbox:
  backend: "subprocess"                    # subprocess | docker
  memory_limit: "256m"
  timeout: 30

rbac:
  roles: ["operator"]
  allowed_actions: {}

hitl:
  channels:
    - type: "dashboard"
  default_timeout: 300
  bypass_roles: []

governance:
  enabled: false
  policy_mode: "audit"                     # audit | permissive | strict
  policy_file: ""

security_scan:
  mode: "warn"                             # warn | block | off

compliance:
  mode: "owasp_agentic"
  prompt_injection: "detect"               # detect | block
  max_tool_calls_per_task: 50
  max_task_duration_seconds: 300
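
The `provider:model` syntax used by the `model` field can be split with a single `str.partition` call. `split_model` is a hypothetical helper, not part of `config.py`. Splitting on the first colon is an assumption, made so that model names which themselves contain a colon stay intact.

```python
from typing import Tuple


def split_model(spec: str) -> Tuple[str, str]:
    """Split 'provider:model' at the first colon; reject incomplete values."""
    provider, sep, model = spec.partition(":")
    if not sep or not provider or not model:
        raise ValueError(f"expected 'provider:model', got {spec!r}")
    return provider, model
```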

Six Runtime Adapters

Adapter	Core Strength	Image Tag
LangGraph	Graph-based state machine, tool use, streaming	`workspace-template:langgraph`
DeepAgents	Deep planning, multi-step task decomposition	`workspace-template:deepagents`
Claude Code	Native coding workflows, CLI continuity, OAuth auth	`workspace-template:claude-code`
CrewAI	Role-based crews, structured task orchestration	`workspace-template:crewai`
AutoGen	Multi-agent conversations, explicit strategies	`workspace-template:autogen`
OpenClaw	CLI-native runtime, own session model	`workspace-template:openclaw`

Branch-level WIP: NemoClaw (NVIDIA T4 + Docker socket) on feat/nemoclaw-t4-docker.

Each adapter implements setup() + create_executor(). The base adapter provides shared infrastructure: system prompt assembly, skill loading, tool registration, coordinator detection, plugin injection.

13. Skills System

Three Capability Sources

Workspace-local skills: skills/<skill-name>/SKILL.md + tools/ directory
Plugin-mounted rules: /plugins volume (read-only), shared across all workspaces
Built-in tools: delegation, approval, memory, sandbox, telemetry, audit, compliance, governance

Skill Directory Structure

skills/generate-seo-page/
├── SKILL.md              # YAML frontmatter + natural language instructions
├── tools/
│   ├── write_page.py     # @tool-decorated Python functions
│   └── check_gsc.py
├── examples/             # Few-shot examples
├── templates/            # Reference files (HTML, etc.)
└── links.yaml            # External resource URLs

SKILL.md Frontmatter

---
name: "Generate SEO Landing Page"
description: "Create SEO-optimized bilingual landing pages"
version: "1.0.0"
tags: ["seo", "content", "bilingual"]
examples: ["Create a Vancouver renovation page in EN/ZH"]
requires:
  env: ["GSC_CREDENTIALS"]
  bins: ["jq"]
---

# Agent instructions follow in natural language...

Hot-Reload Pipeline

Step	Action	Timing
1	File watcher detects change in `skills/`	2s debounce
2	Reload skill metadata + tool Python modules	Immediate
3	Rebuild agent tools and Agent Card	~100ms
4	Broadcast updated card via WebSocket	~50ms
5	Peer system prompts rebuilt with new awareness	~500ms
Total	End-to-end propagation	~3 seconds
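
The 2s debounce in step 1 can be sketched as a quiet-period check. This is an illustrative model, not the code in `skills/watcher.py`. The `Debouncer` class is hypothetical and takes timestamps as arguments so that it can be exercised without real file events or sleeps.

```python
from typing import Optional


class Debouncer:
    """Fire once after events stop arriving for `quiet_seconds` (hypothetical helper)."""

    def __init__(self, quiet_seconds: float = 2.0) -> None:
        self.quiet_seconds = quiet_seconds
        self._last_event: Optional[float] = None

    def record_event(self, now: float) -> None:
        """Remember the time of the most recent file change."""
        self._last_event = now

    def should_fire(self, now: float) -> bool:
        """Return True once per burst, after the quiet period has elapsed."""
        if self._last_event is None:
            return False
        if now - self._last_event < self.quiet_seconds:
            return False
        self._last_event = None   # consume the pending burst
        return True
```

Summing the per-step timings in the table (2s + ~100ms + ~50ms + ~500ms) gives roughly 2.65s, which is consistent with the stated total of about 3 seconds.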

14. Bundle System

Bundle Format (`.bundle.json`)

{
  "schema": "1.0",
  "id": "seo-agent-vancouver",
  "name": "Vancouver SEO Agent",
  "tier": 1,
  "model": "anthropic:claude-sonnet-4-6",
  "system_prompt": "...full prompt text...",
  "skills": [{
    "id": "generate-seo-page",
    "name": "Generate SEO Landing Page",
    "files": {
      "SKILL.md": "---\nname: ...",
      "tools/write_page.py": "def write_page(...):\n    ..."
    }
  }],
  "tools": [{"id": "web_search", "config": {}}],
  "prompts": {"prompts/page-generation.md": "..."},
  "sub_workspaces": [],
  "agent_card": {"...": "A2A card snapshot"},
  "author": "molecule",
  "version": "1.2.0"
}

Inclusion/Exclusion Rules

Included	Excluded
Full system prompt text	API keys / secrets
All skill files (inlined)	Memory / conversation history
Prompt templates + assets	Database data
Tool configurations	Runtime state
Sub-workspace bundles (recursive)
Agent Card snapshot

Workflow

Export: Right-click workspace → "Export as bundle" → .bundle.json download
Import: Drag .bundle.json onto Canvas → POST /bundles/import → recursive provisioning → new IDs → source_bundle_id traces lineage
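
The recursive import step can be sketched as a tree walk that assigns fresh IDs while recording lineage. This is an illustration under stated assumptions. `plan_import` and its record layout are hypothetical, and each entry of `sub_workspaces` is assumed to be a complete bundle object.

```python
import uuid
from typing import Dict, List, Optional


def plan_import(bundle: dict,
                parent_id: Optional[str] = None) -> List[Dict[str, Optional[str]]]:
    """Flatten a bundle tree into workspace records with new IDs (illustrative only)."""
    new_id = str(uuid.uuid4())
    records: List[Dict[str, Optional[str]]] = [{
        "id": new_id,                       # fresh ID for the imported workspace
        "name": bundle["name"],
        "parent_id": parent_id,
        "source_bundle_id": bundle["id"],   # lineage back to the exported bundle
    }]
    for child in bundle.get("sub_workspaces", []):
        records.extend(plan_import(child, parent_id=new_id))
    return records
```

Parents appear before their children in the result, so the records can be provisioned in order.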

15. Canvas UI

Tech Stack

Layer	Technology
Framework	Next.js 15 (App Router)
Graph	React Flow v12 (`@xyflow/react`)
State	Zustand
Styling	TailwindCSS v4
Real-time	Native WebSocket API

Core Interactions

Drag-to-Nest: Drag workspace over another → overlap detection → highlight → drop → update parent_id
Right-Click Menu: Open Details/Chat/Terminal, Restart, Duplicate, Export Bundle, Expand/Collapse Team, Extract from Team, Delete
Template Palette: Empty state shows up to 6 templates + "Create blank workspace"
Onboarding Wizard: 4-step guided setup tracked in localStorage

10-Tab Operations Panel

#	Tab	Function
1	Chat	A2A conversational interface with session history (last 20 messages)
2	Activity	Rich operation log — A2A messages, task updates, logs, skill promotions (filterable)
3	Details	Workspace metadata, runtime summary, status, Agent Card, restart/pause controls, peer list
4	Skills	Live skill display from Agent Card — metadata, tags, examples
5	Terminal	WebSocket shell into workspace container
6	Config	Structured YAML editor — runtime, skills, tools, A2A, delegation, sandbox settings
7	Files	File browser + editor for /configs, /workspace, /home, /plugins
8	Memory	Scoped memory view (LOCAL/TEAM/GLOBAL) + key/value workspace memory with TTL
9	Traces	Langfuse trace viewer — every LLM call with input/output/tokens/cost
10	Events	Structure event stream — real-time workspace change log

Real-Time Architecture

Phase	Mechanism
Initial load	`GET /workspaces` → Zustand store hydration
Live updates	WebSocket events → `applyEvent()` → instant Canvas re-render
Position persistence	`onNodeDragStop` → `PATCH /workspaces/:id` with x, y
Error recovery	Error boundary with reload button + hydration retry banner

16. Tools & Capabilities

Workspace Tools (`workspace/builtin_tools/`)

Tool File	Purpose	RBAC
`memory.py`	HMA memory `commit_memory()` / `search_memory()`	memory.write, memory.read
`delegation.py`	A2A delegation to peer workspaces with retry + tracing	delegate permission
`approval.py`	Human-in-the-loop approval flow with polling/WebSocket	approve permission
`audit.py`	RBAC enforcement + audit trail logging	audit enforcement
`compliance.py`	OWASP Agentic compliance checks	compliance check
`governance.py`	Microsoft Agent Governance Toolkit integration	policy evaluation
`hitl.py`	Multi-channel HITL (dashboard, Slack, email)	hitl.bypass_roles
`sandbox.py`	Code execution (subprocess or Docker backend)	sandbox access
`telemetry.py`	OpenTelemetry span creation and tracing	trace emission
`security_scan.py`	CVE and security scanning (pip-audit/Snyk)	security audit
`temporal_workflow.py`	Temporal.io workflow integration	workflow engine
`a2a_tools.py`	A2A delegation helpers and route resolution	delegate/receive

Built-In MCP Tools (from `.mcp.json`)

Server	Purpose
`molecule`	20+ platform management tools (workspace CRUD, chat, memory, teams, secrets, files, approvals) — includes `commit_memory` / `commit_memory_v2` / `search_memory` routed through the v2 plugin

17. Coordinator Pattern

How Team Expansion Works

Workspace "expands into team" → becomes coordinator (team lead)
Coordinator fetches children's Agent Cards to understand capabilities
For each incoming task: analyzes → selects best-suited child → delegates via A2A
Aggregates responses when tasks span multiple children
Falls back to self-handling only if no child suitable

Key Properties

Enforcement: Coordinators delegate execution to children rather than doing direct work; self-handling is only the fallback when no child is suitable
Recursive: Child workspaces can themselves expand into teams (unlimited depth)
Transparent: An upstream parent does not need to know whether a child is a single agent or a team of fifty
Detectable: coordinator.py checks get_children() — if children exist, coordinator mode activates

18. Codebase Structure

Python Runtime (95 files)

workspace/
├── main.py                    # Entry point (startup sequence)
├── config.py                  # Config parsing → dataclasses (120+ lines)
├── heartbeat.py               # 30s heartbeat loop
├── preflight.py               # Startup validation
├── plugins.py                 # Plugin rule/skill injection
├── coordinator.py             # Team lead routing
├── prompt.py                  # System prompt builder
├── adapters/
│   ├── __init__.py            # Adapter registry
│   ├── base.py                # BaseAdapter interface
│   ├── shared_runtime.py      # Shared execution logic
│   ├── langgraph/adapter.py
│   ├── deepagents/adapter.py
│   ├── claude_code/adapter.py
│   ├── crewai/adapter.py
│   ├── autogen/adapter.py
│   └── openclaw/adapter.py
├── tools/                     # 14 tool files
├── skills/
│   ├── loader.py              # SKILL.md parser + tool loader
│   └── watcher.py             # Hot-reload file watcher
└── tests/                     # 148 pytest tests

Go Platform (94 files)

workspace-server/
├── cmd/
│   ├── server/main.go         # Entry point + dependency injection
│   └── cli/                   # molecli TUI dashboard
├── internal/
│   ├── handlers/              # 26 HTTP handler files (26k+ lines)
│   ├── registry/              # 6 files — workspace registry + access control
│   ├── provisioner/           # 8 files — Docker provisioning + tier enforcement
│   ├── ws/                    # 4 files — WebSocket hub + fanout
│   ├── events/                # 3 files — event broadcasting + Postgres persistence
│   ├── router/                # 2 files — route definitions + middleware
│   ├── db/                    # 6 files — Postgres + Redis drivers, migrations
│   └── crypto/                # 2 files — AES-256-GCM secrets encryption
└── migrations/                # 11 SQL migration files

Canvas Frontend (62 TypeScript files)

canvas/
├── src/
│   ├── store/                 # Zustand store (workspaces, viewport, chat, activity)
│   ├── components/            # React Flow nodes, side panel tabs, context menus, modals
│   ├── hooks/                 # Custom hooks (WebSocket, resize, etc.)
│   └── lib/                   # API client, utilities
└── tests/                     # 188 Vitest tests

19. Key Design Patterns

1. Import Cycle Prevention (Go)

Function injection avoids circular imports between packages:

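// ws takes the access check as a function value, so it need not import registry.
hub := ws.NewHub(registry.CanCommunicate)
// The broadcaster is handed the hub it fans events out to.
broadcaster := events.NewBroadcaster(hub)
// registry reports offline workspaces through the injected callback.
registry.StartLivenessMonitor(ctx, onWorkspaceOffline)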

2. JSONB Handling

A Go []byte must be converted with string() before it is inserted into a JSONB column with a ::jsonb cast, because lib/pq sends []byte as bytea, not JSONB.

3. Event Sourcing

The structure_events table is append-only: never UPDATE, never DELETE. This provides a complete audit trail and event replay capability.
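A minimal sketch of the append-only idea, using an in-memory SQLite table as a stand-in for Postgres. The column names, the event type strings (`workspace_created`, `workspace_updated`, `workspace_deleted`), and the `replay` helper are assumptions made for illustration; they are not taken from the actual `structure_events` schema.

```python
import json
import sqlite3


def open_store() -> sqlite3.Connection:
    """Create an in-memory stand-in for structure_events (hypothetical columns)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE structure_events ("
        " id INTEGER PRIMARY KEY AUTOINCREMENT,"
        " event_type TEXT NOT NULL,"
        " workspace_id TEXT NOT NULL,"
        " payload TEXT NOT NULL)"
    )
    # Enforce append-only at the database level: reject UPDATE and DELETE.
    for op in ("UPDATE", "DELETE"):
        conn.execute(
            f"CREATE TRIGGER no_{op.lower()} BEFORE {op} ON structure_events "
            "BEGIN SELECT RAISE(ABORT, 'structure_events is append-only'); END"
        )
    return conn


def append_event(conn: sqlite3.Connection, event_type: str,
                 workspace_id: str, payload: dict) -> None:
    """The only write path: INSERT a new row."""
    conn.execute(
        "INSERT INTO structure_events (event_type, workspace_id, payload) "
        "VALUES (?, ?, ?)",
        (event_type, workspace_id, json.dumps(payload)),
    )


def replay(conn: sqlite3.Connection) -> dict:
    """Rebuild current state by folding over all events in insertion order."""
    state: dict = {}
    rows = conn.execute(
        "SELECT event_type, workspace_id, payload "
        "FROM structure_events ORDER BY id"
    )
    for event_type, ws_id, payload in rows:
        data = json.loads(payload)
        if event_type == "workspace_created":
            state[ws_id] = data
        elif event_type == "workspace_updated":
            state[ws_id] = {**state.get(ws_id, {}), **data}
        elif event_type == "workspace_deleted":
            state.pop(ws_id, None)
    return state
```

Because state is derived only from the ordered event rows, replaying the table from the start reproduces the same result every time, which is what makes the audit trail trustworthy.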

4. Template Resolution

On workspace create: (1) check template folder → (2) try {runtime}-default → (3) generate minimal config via ensureDefaultConfig().
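The three-step fallback can be sketched as below. This is an illustration in Python rather than the platform's Go code; `resolve_template`, its return shape, and the folder names used in `demo` are hypothetical. Only the order of the checks and the `{runtime}-default` naming come from the text above.

```python
import tempfile
from pathlib import Path
from typing import Optional, Tuple


def resolve_template(configs_dir: Path, template: Optional[str],
                     runtime: str) -> Tuple[str, Optional[Path]]:
    """Return (source, path) following the documented fallback order."""
    # (1) An explicitly requested template folder wins if it exists.
    if template and (configs_dir / template).is_dir():
        return ("template", configs_dir / template)
    # (2) Otherwise try the per-runtime default folder.
    runtime_default = configs_dir / f"{runtime}-default"
    if runtime_default.is_dir():
        return ("runtime-default", runtime_default)
    # (3) Nothing on disk: the caller generates a minimal config
    #     (the step the platform performs in ensureDefaultConfig()).
    return ("generated", None)


def demo() -> list:
    """Exercise all three branches against a temporary configs directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "research-team").mkdir()   # hypothetical template folder
        (root / "crewai-default").mkdir()  # per-runtime default
        return [
            resolve_template(root, "research-team", "crewai")[0],
            resolve_template(root, "missing", "crewai")[0],
            resolve_template(root, None, "autogen")[0],
        ]
```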

5. Hierarchy-Driven Everything

CanCommunicate() is the single source of truth. All operational concerns (communication, memory, access, approvals, event visibility) derive from the same hierarchy.

20. Docker Compose Orchestration

Full Stack (`docker-compose.yml`)

Service	Image	Port	Purpose
`postgres`	postgres:16	5432 (internal)	Primary database (`wal_level=logical`)
`redis`	redis:7	6379 (internal)	Cache + pub/sub (`notify-keyspace-events=KEA`)
`langfuse-clickhouse`	clickhouse/clickhouse-server	internal	Analytics backend
`langfuse-web`	langfuse/langfuse	3100	Observability UI
`platform`	Built from Go	8080	Control plane
`canvas`	Built from Next.js	3000	Frontend

Optional Profiles

Profile	Service	Purpose
`multi-provider`	LiteLLM proxy	Unified API for OpenAI, Anthropic, Google, etc.
`local-models`	Ollama	Local LLM inference

Infrastructure-Only (`docker-compose.infra.yml`)

Postgres + Redis + Langfuse only (for local development without containerized workspace-server/canvas).

21. Environment Variables

Platform (Go)

Variable	Default	Purpose
`DATABASE_URL`	`postgres://dev:dev@localhost:5432/molecule?sslmode=prefer`	Postgres connection
`REDIS_URL`	`redis://localhost:6379`	Redis connection
`PORT`	`8080`	Platform listen port
`PLATFORM_URL`	`http://host.docker.internal:8080`	Injected to workspace containers
`SECRETS_ENCRYPTION_KEY`	Optional	AES-256-GCM key (32 bytes) for tenant secret encryption. Provisioned at tenant boot by the control plane, which holds the master key in AWS KMS — see secrets-key-custody.md.
`CONFIGS_DIR`	`/configs`	Workspace config template directory
`PLUGINS_DIR`	`/plugins`	Shared plugin directory
`ACTIVITY_RETENTION_DAYS`	`7`	Activity log retention
`ACTIVITY_CLEANUP_INTERVAL_HOURS`	`6`	Cleanup frequency
`CORS_ORIGINS`	`http://localhost:3000,...`	CORS whitelist
`RATE_LIMIT`	`600`	Requests per minute
`WORKSPACE_DIR`	Optional	Shared workspace volume
`MEMORY_PLUGIN_URL`	Unset by default	v2 memory plugin sidecar address. Typically set externally — CP user-data injects `http://localhost:9100` on tenant EC2 boot, which `entrypoint-tenant.sh` reads as the signal to spawn the bundled `memory-plugin` sidecar on the matching loopback port. When unset, today (pre-#1747) the legacy `agent_memories` SQL path is used as silent fallback; after #1747 (RFC #1733 Phase A1) lands, memory MCP tools return a "plugin not configured" error instead.

Canvas (Next.js)

Variable	Default	Purpose
`NEXT_PUBLIC_PLATFORM_URL`	`http://localhost:8080`	Platform backend URL
`NEXT_PUBLIC_WS_URL`	`ws://localhost:8080/ws`	WebSocket URL
`PORT`	`3000`	Canvas listen port

Workspace Runtime (Python)

Variable	Default	Purpose
`WORKSPACE_ID`	`workspace-default`	Unique workspace identifier
`WORKSPACE_CONFIG_PATH`	`/configs`	Config directory mount
`PLATFORM_URL`	`http://platform:8080`	Platform connection
`PARENT_ID`	Empty	Parent workspace ID (set if nested)
`LANGFUSE_HOST`	`http://langfuse-web:3000`	Langfuse endpoint
`LANGFUSE_PUBLIC_KEY`	Optional	Langfuse auth
`LANGFUSE_SECRET_KEY`	Optional	Langfuse auth
`DEPLOYMENT_RETRY_ATTEMPTS`	`3`	Delegation retry count
`DELEGATION_TIMEOUT`	`120`	Delegation timeout (seconds)
`APPROVAL_TIMEOUT`	`300`	Approval wait timeout (seconds)
`AUDIT_LOG_PATH`	`/var/log/molecule/audit.jsonl`	Audit log file path
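As an illustration of how the workspace runtime variables and their defaults fit together, here is a small loader. The `RuntimeEnv` dataclass and `load_runtime_env` function are hypothetical names; the variable names and default values are copied from the table above, and only a subset is shown.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeEnv:
    workspace_id: str
    config_path: str
    platform_url: str
    parent_id: str
    delegation_timeout: int  # seconds
    approval_timeout: int    # seconds
    audit_log_path: str


def load_runtime_env(environ: Optional[Mapping[str, str]] = None) -> RuntimeEnv:
    """Read runtime settings, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    return RuntimeEnv(
        workspace_id=env.get("WORKSPACE_ID", "workspace-default"),
        config_path=env.get("WORKSPACE_CONFIG_PATH", "/configs"),
        platform_url=env.get("PLATFORM_URL", "http://platform:8080"),
        parent_id=env.get("PARENT_ID", ""),  # empty means top-level workspace
        delegation_timeout=int(env.get("DELEGATION_TIMEOUT", "120")),
        approval_timeout=int(env.get("APPROVAL_TIMEOUT", "300")),
        audit_log_path=env.get("AUDIT_LOG_PATH", "/var/log/molecule/audit.jsonl"),
    )
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the loader easy to test.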

22. Recent Feature Highlights

Feature	Description
A2A streaming response	Real-time task result delivery via SSE (`message/sendSubscribe`)
Onboarding wizard	4-step guided first-run experience in Canvas
Global API keys	Platform-wide secrets with per-workspace override + AES-256 encryption
Coordinator enforcement	Team leads cannot do work, only route and aggregate
Cascade pause/resume	Pausing a parent cascades to all children; paused children can't be individually resumed
Graceful A2A errors	`[A2A_ERROR]` sentinel + retry with exponential backoff + fallback
Canvas error boundary	React class component catches render errors, shows retry button
Hydration retry	Banner with "Retry" button + `PLATFORM_URL` hint on WebSocket stale state
Activity log retention	Configurable cleanup (default 7 days, `ACTIVITY_RETENTION_DAYS`)
Security hardening	Hub double-close race fix (`sync.Once`), A2A proxy timeout (5min canvas, ∞ workspace), Python JSON decode guards
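The "Graceful A2A errors" row combines a sentinel, retries with exponential backoff, and a fallback. A rough sketch of that combination follows. The function name, attempt count, base delay, and the choice to return a sentinel-prefixed string as the fallback are all assumptions; only the `[A2A_ERROR]` sentinel itself appears in the table.

```python
import time
from typing import Callable

A2A_ERROR = "[A2A_ERROR]"  # sentinel prefix named in the feature table


def call_with_backoff(send: Callable[[], str], attempts: int = 3,
                      base_delay: float = 0.5,
                      sleep: Callable[[float], None] = time.sleep) -> str:
    """Try `send` up to `attempts` times, doubling the wait between tries.

    On final failure, return a sentinel-prefixed string instead of raising,
    so the caller can detect the error and degrade gracefully.
    """
    last_error = "unknown error"
    for attempt in range(attempts):
        try:
            return send()
        except Exception as exc:  # broad on purpose in this sketch
            last_error = str(exc)
            if attempt < attempts - 1:
                # Waits of base_delay, 2*base_delay, 4*base_delay, ...
                sleep(base_delay * (2 ** attempt))
    return f"{A2A_ERROR} {last_error}"


def is_a2a_error(result: str) -> bool:
    """Callers check the sentinel rather than catching exceptions."""
    return result.startswith(A2A_ERROR)
```

Injecting `sleep` as a parameter lets tests record the delays without actually waiting.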

23. Known Gaps & Backlog

Test Coverage

18 of 26 Go handler files have zero unit tests, including a2a_proxy, workspace, templates, registry, discovery, and secrets. Current state: 278 Go tests, with a 25% coverage baseline enforced in CI.

Silent Failures

At least six locations perform fire-and-forget ExecContext DB writes (activity log inserts, event broadcasts) and need proper error handling.

Python Tool Error Handling

Tools call resp.json() without catching JSON decode errors. These calls should be wrapped in try/except so that malformed responses are handled instead of crashing the tool.
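One possible shape for such a guard is sketched below. `safe_json` and `FakeResponse` are hypothetical names. The sketch relies on the fact that the standard library's `json.JSONDecodeError` is a subclass of `ValueError`; which HTTP client the tools use is not stated here, so the response object is treated as anything with a `.json()` method.

```python
import json
from typing import Any


def safe_json(resp: Any, default: Any = None) -> Any:
    """Return the decoded JSON body, or `default` if the body is malformed."""
    try:
        return resp.json()
    except ValueError:
        # json.JSONDecodeError subclasses ValueError, so this covers
        # empty bodies, HTML error pages, and truncated payloads.
        return default


class FakeResponse:
    """Stand-in for an HTTP response object, used only for demonstration."""

    def __init__(self, text: str) -> None:
        self.text = text

    def json(self) -> Any:
        return json.loads(self.text)
```

A tool can then branch on the default value, for example returning an error message to the agent when `safe_json(resp)` is `None`.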

Branch-Level Work

Branch	Feature	Status
`feat/nemoclaw-t4-docker`	NemoClaw adapter (NVIDIA T4 support)	WIP
Backlog	Firecracker backend (faster cold starts)	Planned
Backlog	E2B backend (cloud-hosted code sandbox)	Planned
Backlog	pgvector semantic memory search	Planned
Backlog	Canvas search, batch operations, keyboard shortcuts	Planned

24. Licensing & Commercialization Path

Open Source (Current)

License: MIT
Strategy: Maximize adoption, zero friction
Model: Follows n8n Community Edition approach

SaaS Path (Future `molecule-cloud` repo)

Feature	Technology
Authentication	Clerk or Auth.js
Multi-tenancy	`org_id` column added to schema
Billing	Stripe integration
Managed infrastructure	ECS + Neon + Upstash
White-labeling	Custom Canvas branding

Key principle: No changes to the core open-source repo. The SaaS layer is purely additive.

25. OSS Growth Research

Analysis of 8 OSS agent projects (from oss-agent-growth-research.md):

Winning Launch Formula

[Viral Demo] + [HN Front Page] + [One Major Amplifier] + [Zero-Friction Install]
     ↓              ↓                   ↓                         ↓
  60s video     400+ upvotes      Karpathy / Altman /       docker compose up
  screen rec    top comment       Major YouTuber             3 commands max

Every Tier 1 launch (Open Interpreter, CrewAI) had all four elements.

Documentation Best Practice (Diataxis Model)

Type	Purpose	Example
Tutorials	Learning-oriented	"Build your first agent team in 5 minutes"
How-to guides	Task-oriented	"How to configure RBAC for production"
Explanation	Understanding-oriented	"Why memory follows org boundaries"
Reference	Information-oriented	API route tables, config schema

26. Technical Debt & Constraints

Hard Design Constraints

Platform never routes agent messages — A2A is strictly peer-to-peer
Postgres is fact source, Redis is cache — Redis loss is fully recoverable
structure_events is append-only — Never UPDATE, never DELETE
workspace-template has no business logic — Logic lives in workspace-configs-templates/
Bundles never include secrets — API keys forbidden from serialization
Hierarchy = topology — No manual edge wiring; all communication derived from parent_id

27. Production Deployment

Multi-Host Configuration

Docker-internal URLs (http://ws-{id}:8000) work directly between containers
Nginx on host handles TLS termination
For external HTTPS: proxy requests to host-mapped URLs

Volume Management

Mode	Configuration	Behavior
Default	No `WORKSPACE_DIR`	Each workspace gets isolated Docker volume `ws-{id}-workspace`
Shared	`WORKSPACE_DIR=/path`	All agents mount same host directory (read/write)
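The two modes reduce to a single branch on `WORKSPACE_DIR`. A sketch of that decision, written in Python for illustration (the platform's provisioner is Go): `workspace_mount` and the shape of the returned dictionary are hypothetical, while the volume name pattern `ws-{id}-workspace` and the read/write behavior come from the table.

```python
import os
from typing import Mapping, Optional


def workspace_mount(workspace_id: str,
                    environ: Optional[Mapping[str, str]] = None) -> dict:
    """Describe the mount a workspace container would receive."""
    env = os.environ if environ is None else environ
    shared_dir = env.get("WORKSPACE_DIR", "")
    if shared_dir:
        # Shared mode: every agent bind-mounts the same host directory.
        return {"type": "bind", "source": shared_dir, "read_only": False}
    # Default mode: one isolated named volume per workspace.
    return {
        "type": "volume",
        "source": f"ws-{workspace_id}-workspace",
        "read_only": False,
    }
```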

28. MCP Server & Integrations

Molecule AI MCP Server (`mcp-server/`)

20+ tools for Claude Code, Cursor, Codex, or any MCP client:

Workspace CRUD (list, create, get, delete, restart)
Agent communication (chat_with_agent)
Memory operations (commit_memory, search_memory)
Team management (expand_team, collapse_team)
Secrets management (set_secret, list_secrets)
File operations (read_file, write_file, delete_file)
Approvals (list_pending_approvals, decide_approval)
Config updates (update_workspace)
Templates (list_templates)

Transport: stdio (local CLI integration)

{
  "mcpServers": {
    "molecule": {
      "type": "stdio",
      "command": "node",
      "args": ["./mcp-server/dist/index.js"],
      "env": {"MOLECULE_URL": "http://localhost:8080"}
    }
  }
}

29. Summary Statistics

Metric	Value
Python runtime files	95
Go platform files	94
TypeScript/JS canvas files	62
Runtime adapter implementations	6
Go handler files	26
Postgres migrations	11
Core workspace tools	14
Platform API endpoints	40+
MCP tools	20+
Go tests	278 (with `-race` flag)
Canvas Vitest tests	188
Python pytest tests	148
Total tests	614
Activity retention	7 days (configurable)
Heartbeat interval	30s (default)
Redis liveness TTL	60s
Health sweep interval	15s (proactive)
Skill hot-reload propagation	~3 seconds
Coverage baseline (Go)	25% enforced in CI

30. Vision: From Agent Teams to Robot Teams

Molecule AI's workspace abstraction is runtime-agnostic by design. A workspace is a role with an A2A interface — not an LLM with a prompt. The same hierarchy, memory boundaries, approval chains, and governance that organize AI agents in containers today can organize any autonomous system that speaks A2A.

Phase	Era	Systems	Status
NOW	Software Agent Teams	LLM agents in Docker, 6 adapters, HMA, Langfuse, A2A	LIVE on main
NEXT	Terminal + Device Agents	Terminal bots, browser agents, IoT controllers, CI/CD agents	BUILDING
HORIZON	Embodied Robot Teams	Warehouse robots, autonomous vehicles, manufacturing cells, field inspection	HORIZON

The workspace is the role. The protocol is A2A. The boundary between digital and physical disappears — the organizational layer remains.
