Part 1 of 4 in the scalability refactor. Each role can now keep its
initial_prompt / idle_prompt / schedule prompts as sibling .md files
under files_dir/; inline YAML literals still work for backwards-compat.
## What changes
**Platform (org.go importer):**
- `OrgWorkspace` gains `InitialPromptFile`, `IdlePrompt`, `IdlePromptFile`,
`IdleIntervalSeconds`. The idle_* fields were previously dropped by the
org importer entirely — struct didn't declare them — which is why
engineer idle_prompts never propagated from org.yaml to live /configs
(I've been manually docker-cp'ing them in every maintenance cron).
- `OrgSchedule` gains `PromptFile`. Hourly/weekly cron prompts are the
largest bodies in org.yaml (1-5 KB each) and get resolved at import
time just like initial_prompt.
- `OrgDefaults` gains the same idle_* + *_file fields for org-wide fallback.
- New `resolvePromptRef(inline, fileRef, orgBaseDir, filesDir)` helper —
the single chokepoint for inline-vs-file resolution. Inline wins when
both are set. File refs route through `resolveInsideRoot` so a crafted
ref can't escape the org template directory (same traversal defense as
files_dir).
- `createWorkspaceTree` now injects idle_prompt + idle_interval_seconds
into the workspace's config.yaml (previously missing — that's the
second half of the idle-prompt propagation bug).
**Tests:**
- `org_prompt_ref_test.go` — 10 cases: inline-wins, file-read-when-empty,
both-empty, defaults-level resolution, inline-template mode errors,
traversal rejection (via file ref AND via files_dir), missing-file
errors, and YAML-unmarshal parsing for each new field.
**Proof migration:**
- Documentation Specialist (biggest role at 6.9 KB of prompts) moves from
inline YAML to `documentation-specialist/{initial-prompt.md,
schedules/daily-docs-sync.md, schedules/weekly-terminology-audit.md}`.
- org.yaml drops 1801 → 1687 lines (-6.3%) from just this one role.
## Why this matters
org.yaml is 108 KB, of which 67 KB (62%) is prompt text. At the current
12-role template size that's already unreadable; the marketing +
triage-operator additions pushed it to 1801 lines. The 4-phase refactor plan:
- **Phase 1 (this PR):** platform support + 1 role proof.
- **Phase 2:** migrate remaining ~20 roles to file refs. Target: org.yaml
at ~600 lines of pure structural scaffolding.
- **Phase 3:** YAML `!include` preprocessor — split org.yaml into
teams/{research,dev,marketing,ops}.yaml shards.
- **Phase 4:** per-workspace atomization — each role gets its own
workspace.yaml manifest; org.yaml composes them.
## Backwards compatibility
- Inline `initial_prompt: |` / `prompt: |` / `idle_prompt: |` all still work.
- Missing `prompt_file` refs are logged and the schedule is skipped (not
fatal). The log line is loud so bugs surface during deployment instead of
the schedule being silently dropped.
- Inline-template mode (POST /org/import with raw JSON body, no `dir`)
errors cleanly when a file ref is used — can't resolve files without a
base dir, surface that rather than guessing.
## Test plan
- [x] `go build ./...` clean
- [x] `go test -run 'TestResolvePromptRef|TestOrgYAML' ./internal/handlers/`
— 10 tests pass
- [x] `python -c "yaml.safe_load(...)"` on the edited org.yaml — parses
- [ ] Post-merge: deploy platform rebuild, run `POST /org/import` against
a fresh workspace, verify Documentation Specialist's /configs/config.yaml
contains the initial_prompt body and workspace_schedules rows contain
the cron prompts (phantom-success check: grep the actual content, not
just the row count).
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The Org-Native Control Plane For Heterogeneous AI Agent Teams
The world's most powerful governance platform for AI agent teams.
Visual Canvas • Runtime Compatibility • Hierarchical Memory • Skill Evolution • Operational Guardrails
Docs Home • Quick Start • Architecture • Platform API • Workspace Runtime
The Pitch
Molecule AI is the most powerful way to govern an AI agent organization in production.
It combines the parts that are usually scattered across demos, internal glue code, and framework-specific tooling into one product:
- one org-native control plane for teams, roles, hierarchy, and lifecycle
- one runtime layer that lets LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, and OpenClaw run side by side
- one memory model that keeps recall, sharing, and skill evolution aligned with organizational boundaries
- one operational surface for observing, pausing, restarting, inspecting, and improving live workspaces
Most teams can build a workflow, a strong single agent, a coding agent, or a custom multi-agent graph.
Very few teams can run all of that as a governed organization with clear structure, durable memory boundaries, and production operations.
That is the gap Molecule AI closes.
Why Molecule AI Feels Different
1. The node is a role, not a task
In Molecule AI, a workspace is an organizational role. That role can begin as one agent, later expand into a sub-team, and still keep the same external identity, hierarchy position, memory boundary, and A2A interface.
2. The org chart is the topology
You do not wire collaboration paths by hand. Hierarchy defines the default communication surface. The structure is not decorative UI. It is part of the operating model.
3. Runtime choice stops being a dead-end decision
LangGraph, DeepAgents, Claude Code, CrewAI, AutoGen, and OpenClaw can all plug into the same workspace abstraction. Teams can standardize governance without forcing every group onto one runtime.
4. Memory is treated like infrastructure
Molecule AI's HMA approach is designed around organizational boundaries, not just “store more context somewhere.” Durable recall, scoped sharing, awareness namespaces, and skill promotion are all part of one coherent system.
5. It comes with a real control plane
Registry, heartbeats, restart, pause/resume, activity logs, approvals, terminal access, files, traces, bundles, templates, and WebSocket fanout are not afterthoughts. They are first-class parts of the platform.
The Category Gap Molecule AI Fills
| Category | What it does well | Where it breaks | What Molecule AI adds |
|---|---|---|---|
| Workflow builders | Visual task automation | Nodes are tasks, not durable organizational roles | Role-native workspaces, hierarchy, long-lived teams |
| Agent frameworks | Strong runtime semantics | Weak control plane and weak org-level operations | Unified lifecycle, canvas, registry, policies, observability |
| Coding agents | Excellent local execution | Usually not designed as team infrastructure | Workspace abstraction, A2A collaboration, platform ops |
| Custom multi-agent graphs | Full flexibility | Brittle topology and governance sprawl | Standardized operating model without losing runtime freedom |
What Makes Molecule AI Defensible
| Advantage | Why it matters in practice |
|---|---|
| Role-native workspace abstraction | Your org structure survives model swaps, framework changes, and team expansion |
| Fractal team expansion | A single specialist can become a managed department without breaking upstream integrations |
| Heterogeneous runtime compatibility | Different teams can keep their preferred agent architecture while sharing one control plane |
| HMA + awareness namespaces | Memory sharing follows hierarchy instead of leaking across the whole system |
| Skill evolution loop | Durable successful workflows can graduate from memory into reusable, hot-reloadable skills |
| WebSocket-first operational UX | The canvas reflects task state, structure changes, and A2A responses in near real time |
| Global secrets with local override | Centralize provider access, then override only where a workspace needs specialized credentials |
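
As an illustration of "global secrets with local override", the sketch below merges two maps so that a workspace value replaces the global one for the same key. `effectiveSecrets` is a hypothetical name, and the platform's actual storage and lookup are not shown in this README.

```go
package main

import "fmt"

// effectiveSecrets returns the secrets a workspace would see: every global
// secret, with workspace-level values taking precedence on key collisions.
// Neither input map is modified.
func effectiveSecrets(global, workspace map[string]string) map[string]string {
	out := make(map[string]string, len(global)+len(workspace))
	for k, v := range global {
		out[k] = v
	}
	for k, v := range workspace {
		out[k] = v // local override wins
	}
	return out
}

func main() {
	global := map[string]string{
		"OPENAI_API_KEY":    "org-wide-key",
		"ANTHROPIC_API_KEY": "org-wide-anthropic-key",
	}
	workspace := map[string]string{
		"OPENAI_API_KEY": "research-team-key",
	}

	eff := effectiveSecrets(global, workspace)
	fmt.Println("OPENAI_API_KEY    =", eff["OPENAI_API_KEY"])
	fmt.Println("ANTHROPIC_API_KEY =", eff["ANTHROPIC_API_KEY"])
}
```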
Runtime Compatibility, Compared
Molecule AI is not trying to replace the frameworks below. It is the system that makes them easier to run together.
| Runtime / architecture | Status in current repo | Native strength | What Molecule AI adds |
|---|---|---|---|
| LangGraph | Shipping on main | Graph control, tool use, Python extensibility | Canvas orchestration, hierarchy routing, A2A, memory scopes, operational lifecycle |
| DeepAgents | Shipping on main | Deeper planning and decomposition | Same workspace contract, team topology, activity stream, restart behavior |
| Claude Code | Shipping on main | Real coding workflows, CLI-native continuity | Secure workspace abstraction, A2A delegation, org boundaries, shared control plane |
| CrewAI | Shipping on main | Role-based crews | Persistent workspace identity, policy consistency, shared canvas and registry |
| AutoGen | Shipping on main | Assistant/tool orchestration | Standardized deployment, hierarchy-aware collaboration, shared ops plane |
| OpenClaw | Shipping on main | CLI-native runtime with its own session model | Workspace lifecycle, templates, activity logs, topology-aware collaboration |
| NemoClaw | WIP on feat/nemoclaw-t4-docker | NVIDIA-oriented runtime path | Planned to join the same abstraction once merged; not yet part of main |
This is the key idea: many agent runtimes, one organizational operating system.
Why The Memory Architecture Compounds
Most projects stop at “we added memory.” Molecule AI pushes further:
| Conventional memory setup | Molecule AI |
|---|---|
| Flat store or weak namespaces | Hierarchy-aligned LOCAL, TEAM, GLOBAL scopes |
| Sharing is easy to overexpose | Sharing is explicit and structure-aware |
| Memory and procedure get mixed together | Memory stores durable facts; skills store repeatable procedure |
| Every agent can become over-privileged | Workspace awareness namespaces reduce blast radius |
| UI memory and runtime memory blur together | Separate surfaces for scoped agent memory, key/value workspace memory, and recall |
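
One way to picture hierarchy-aligned scopes is a read check keyed on the org tree. The sketch below is an assumption-heavy illustration, not the platform's HMA implementation: it assumes `LOCAL` means the owning workspace only, `TEAM` means workspaces under the same parent (plus that parent), and `GLOBAL` means everyone. The `Workspace` type, `teamOf`, and `canRead` are hypothetical.

```go
package main

import "fmt"

// Scope mirrors the three scope names from the table above.
type Scope string

const (
	ScopeLocal  Scope = "LOCAL"
	ScopeTeam   Scope = "TEAM"
	ScopeGlobal Scope = "GLOBAL"
)

// Workspace is a hypothetical, minimal node in the org tree.
type Workspace struct {
	ID       string
	ParentID string // empty for a top-level workspace
}

// teamOf returns the team key: the parent's ID, or the workspace's own ID
// when it has no parent (assumed single-level simplification).
func teamOf(w Workspace) string {
	if w.ParentID != "" {
		return w.ParentID
	}
	return w.ID
}

// canRead reports whether reader may see a memory that owner wrote at scope.
func canRead(reader, owner Workspace, scope Scope) bool {
	switch scope {
	case ScopeGlobal:
		return true
	case ScopeTeam:
		return teamOf(reader) == teamOf(owner)
	case ScopeLocal:
		return reader.ID == owner.ID
	default:
		return false // unknown scopes are denied
	}
}

func main() {
	lead := Workspace{ID: "dev-lead"}
	qa := Workspace{ID: "qa", ParentID: "dev-lead"}
	marketer := Workspace{ID: "marketer", ParentID: "marketing-lead"}

	fmt.Println("qa reads lead TEAM memory:      ", canRead(qa, lead, ScopeTeam))
	fmt.Println("marketer reads lead TEAM memory:", canRead(marketer, lead, ScopeTeam))
	fmt.Println("qa reads lead LOCAL memory:     ", canRead(qa, lead, ScopeLocal))
	fmt.Println("marketer reads GLOBAL memory:   ", canRead(marketer, lead, ScopeGlobal))
}
```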
The flywheel
Task execution
-> durable insight captured in memory
-> repeated success becomes a signal
-> workflow promoted into a reusable skill
-> skill hot-reloads into the runtime
-> future work gets faster and more reliable
This is one of Molecule AI's strongest long-term advantages: the system can get more operationally capable without turning into one giant hidden prompt.
Self-Improving Agent Teams, Built Into Molecule AI
Most agent systems stop at "a smart runtime." Molecule AI pushes further: it gives teams a way to capture what worked, promote repeatable procedure into skills, reload those improvements into live workspaces, and keep the whole loop visible at the platform level.
| Positioning lens | Conventional self-improving agent pattern | Molecule AI |
|---|---|---|
| Unit of improvement | A single agent session or runtime | A workspace, a team, and eventually the whole org graph |
| Operational surface | Mostly hidden inside the agent loop | Visible in the platform, Canvas, activity stream, memory surfaces, and runtime controls |
| Strategic outcome | A smarter agent | A compounding organization with durable knowledge and governed reusable skills |
Where that shows up in Molecule AI
| Core mechanism | Molecule AI module(s) | Why it matters |
|---|---|---|
| Durable memory that survives sessions | `workspace-template/builtin_tools/memory.py`, `workspace-template/builtin_tools/awareness_client.py`, `platform/internal/handlers/memories.go` | Memory is not just durable, it is workspace-scoped and can route into awareness namespaces tied to the org structure |
| Cross-session recall | `platform/internal/handlers/activity.go` (`/workspaces/:id/session-search`) | Recall spans both activity history and memory rows, so the system can search what happened and what was learned without inventing a separate hidden store |
| Skills built from experience | `workspace-template/builtin_tools/memory.py` (`_maybe_log_skill_promotion`) | Promotion from memory into a skill candidate is surfaced as an explicit platform activity, not a silent internal side effect |
| Skill improvement during use | `workspace-template/skill_loader/watcher.py`, `workspace-template/skill_loader/loader.py`, `workspace-template/main.py` | Skills hot-reload into the live runtime, so improvements become available on the next A2A task without restarting the workspace |
| Persistent skill lifecycle | `platform/cmd/cli/cmd_agent_skill.go`, `workspace-template/plugins.py` | Skills are not just generated once; they can be audited, installed, published, shared, mounted by plugins, and governed as reusable operational assets |
Why this matters in Molecule AI
- The learning loop is org-aware, not just session-aware. Memory can live at `LOCAL`, `TEAM`, or `GLOBAL` scope, and awareness namespaces give each workspace a durable identity boundary.
- The learning loop is visible to operators. Promotion events, activity logs, current-task updates, traces, and WebSocket fanout mean self-improvement is part of the control plane, not a hidden black box.
- The learning loop compounds across teams, not just one agent. A workflow learned by one workspace can become a governed skill, reload into the runtime, appear in the Agent Card, and become usable inside a larger organizational hierarchy.
The result is not just “an agent that learns.” It is an organization that gets more capable as its workspaces accumulate durable memory and reusable procedure.
What Ships In main
Canvas
- Next.js 15 + React Flow + Zustand
- drag-to-nest team building
- empty-state deployment + onboarding wizard
- template palette
- bundle import/export
- 10-tab side panel for chat, activity, details, skills, terminal, config, files, memory, traces, and events
Platform
- Go/Gin control plane
- workspace CRUD and provisioning
- registry and heartbeats
- browser-safe A2A proxy
- team expansion/collapse
- activity logs and approvals
- secrets and global secrets
- files API, terminal, bundles, templates, viewport persistence
Runtime
- unified `workspace-template/` image
- adapter-driven execution
- Agent Card registration
- awareness-backed memory integration
- plugin-mounted shared rules/skills
- hot-reloadable local skills
- coordinator-only delegation path
Ops
- Langfuse traces
- current-task reporting
- pause/resume/restart flows
- activity streaming
- runtime tiers
- direct workspace inspection through terminal and files
Built For Teams That Need More Than A Demo
Molecule AI is especially strong when you need to run:
- AI engineering teams with PM / Dev Lead / QA / Research / Ops roles
- mixed runtime organizations where one team prefers LangGraph and another prefers Claude Code
- long-lived agent organizations that need memory boundaries and reusable procedures
- internal platforms that want to expose agent teams as structured infrastructure, not ad hoc scripts
Architecture
Canvas (Next.js :3000) <--HTTP / WS--> Platform (Go :8080) <---> Postgres + Redis
| |
| +--> Docker provisioner / bundles / templates / secrets
|
+-------------------- shows --------------------> workspaces, teams, tasks, traces, events
Workspace Runtime (Python image with adapters)
- LangGraph / DeepAgents / Claude Code / CrewAI / AutoGen / OpenClaw
- Agent Card + A2A server
- heartbeat + activity + awareness-backed memory
- skills + plugins + hot reload
Quick Start
git clone https://github.com/Molecule-AI/molecule-monorepo.git
cd molecule-monorepo
./infra/scripts/setup.sh
# Boots Postgres (:5432), Redis (:6379), Langfuse (:3001),
# and Temporal (:7233 gRPC, :8233 UI) on the shared
# `molecule-monorepo-net` Docker network. Temporal runs with
# no auth on localhost — dev-only; production must gate it.
cd platform
go run ./cmd/server
cd ../canvas
npm install
npm run dev
Then open http://localhost:3000:
- Deploy a template or create a blank workspace from the empty state.
- Follow the onboarding guide into `Config`.
- Add a provider key in `Secrets & API Keys`.
- Open `Chat` and send the first task.
Documentation Map
- Docs Home
- Quick Start
- Product Overview
- System Architecture
- Memory Architecture
- Platform API
- Workspace Runtime
- Canvas UI
- Local Development
- Ecosystem Watch — adjacent projects we track (Holaboss, Hermes, gstack, …)
- Glossary — how we use "harness", "workspace", "plugin", "flow" vs. ecosystem neighbors
Current Scope
The current main branch already includes the core platform, canvas, memory model, six production adapters, skill lifecycle, and operational surfaces. Adjacent runtime work such as NemoClaw remains branch-level until merged, and this README keeps that distinction explicit on purpose.
License
Business Source License 1.1 — copyright © 2025 Molecule AI.
Personal, internal, and non-commercial use is permitted without restriction. You may not use the Licensed Work to offer a competing product or service. On January 1, 2029, the license converts to Apache 2.0.
