molecule-ai/molecule-core

claude-ceo-assistant f7e2976324

chore: retire unmaintained workspace runtimes

2026-05-23 23:45:09 -07:00


Agent Runtime Adapters

Overview

The workspace runtime uses a pluggable adapter architecture — each maintained agent infrastructure (Claude Code, Codex, Hermes, OpenClaw) has its own adapter that bridges the A2A protocol to the infra's native interface.

Adapters live in workspace/adapters/<runtime>/ and are auto-discovered at startup. Each adapter implements BaseAdapter (from adapters/base.py) with setup() and create_executor() methods.

The runtime is selected via config.yaml:

runtime: claude-code    # or: codex, hermes, openclaw
runtime_config:
  model: sonnet
  auth_token_file: .auth-token
  timeout: 0

How It Works

The unified runtime checks the runtime field in config.yaml, discovers the matching adapter, calls adapter.setup(config) then adapter.create_executor(config) to get an AgentExecutor that handles A2A requests.

A2A request arrives
      |
      v
AgentExecutor.execute(context, event_queue)
      |  - extracts user message from A2A parts
      |  - extracts conversation history from params.metadata.history
      |  - sets current_task on heartbeat (shows on canvas card)
      |  - invokes the runtime adapter
      v
Response → A2A event queue → JSON-RPC response
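The selection step described above can be sketched as follows. This is a minimal illustration, not the code in main.py: the registry dict, the EchoAdapter and EchoExecutor classes, the select_executor function, and the AdapterConfig fields shown here are all hypothetical stand-ins, and the real setup() may well be asynchronous.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class AdapterConfig:
    """Hypothetical stand-in; the real AdapterConfig fields may differ."""
    runtime: str
    runtime_config: Dict[str, Any] = field(default_factory=dict)


class EchoExecutor:
    """Stand-in for an AgentExecutor; it only remembers the chosen model."""

    def __init__(self, model: str) -> None:
        self.model = model


class EchoAdapter:
    """Toy adapter that records the order in which its methods are called."""

    def __init__(self) -> None:
        self.calls = []

    def setup(self, config: AdapterConfig) -> None:
        self.calls.append("setup")

    def create_executor(self, config: AdapterConfig) -> EchoExecutor:
        self.calls.append("create_executor")
        return EchoExecutor(config.runtime_config.get("model", ""))


def select_executor(config: AdapterConfig, registry: Dict[str, type]) -> Any:
    """Look up the adapter for config.runtime, then call setup() before create_executor()."""
    if config.runtime not in registry:
        known = ", ".join(sorted(registry))
        raise ValueError(f"unknown runtime {config.runtime!r} (known: {known})")
    adapter = registry[config.runtime]()
    adapter.setup(config)
    return adapter.create_executor(config)
```

Failing fast on an unknown runtime name keeps a typo in config.yaml from surfacing later as a confusing executor error.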

Conversation History

Chat sessions in the Canvas UI send prior messages (up to 20) via params.metadata.history in each A2A message/send request. Executors can extract this history, but the maintained runtimes rely on their own native session state instead:

Claude Code: Uses --resume <session_id> for native session continuity (history not needed)
Codex: Uses the Codex runtime's native session state
Hermes: Uses Hermes' agent runtime session handling
OpenClaw: Uses --session-id for native session continuity
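For a runtime without native sessions, reading the history out of the request might look like the sketch below. The function name extract_history and the assumption that each history entry is a dict are illustrative; the actual _extract_history() in a2a_executor.py may behave differently.

```python
from typing import Any, Dict, List

MAX_HISTORY = 20  # the Canvas UI sends at most 20 prior messages


def extract_history(params: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Return the most recent history entries from params.metadata.history.

    Missing or malformed fields yield an empty list instead of raising.
    """
    metadata = params.get("metadata") or {}
    if not isinstance(metadata, dict):
        return []
    history = metadata.get("history") or []
    if not isinstance(history, list):
        return []
    entries = [item for item in history if isinstance(item, dict)]
    return entries[-MAX_HISTORY:]
```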

Current Task Reporting

All executors update the workspace's current_task via the heartbeat during execution. This shows an amber banner on the canvas card. The shared set_current_task(heartbeat, task) function in a2a_executor.py handles this for all runtimes.

Built-in Adapters

Claude Code (`runtime: claude-code`)

runtime: claude-code
runtime_config:
  model: sonnet          # or opus, haiku
  auth_token_file: .auth-token   # OAuth token file in /configs/

Uses the Claude Agent SDK (claude-agent-sdk Python package) to invoke the Claude Code engine programmatically via ClaudeSDKExecutor. This replaced the earlier subprocess-based approach (claude --print ...) to eliminate stdout buffering, zombie processes, session-ID parsing fragility, and ~500ms per-message startup overhead.

The SDK uses the same Claude Code engine under the hood — plugins, CLAUDE.md discovery, hooks, auto-memory, and skills all work identically. The @anthropic-ai/claude-code npm package is still installed in the image because the SDK wraps it internally.

Auth: Uses the CLAUDE_CODE_OAUTH_TOKEN env var — the OAuth token is read from /configs/.auth-token and picked up by the SDK automatically.

Concurrency: Turns are serialized per-executor via an asyncio.Lock so session state stays race-free. Cancellation is cooperative: the executor calls aclose() on the SDK's async generator.
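The per-executor serialization can be illustrated with a small sketch. SerializedExecutor and run_turn are hypothetical names, and the sleep stands in for an SDK call; the point is only that a second turn cannot start until the first has released the lock.

```python
import asyncio
from typing import List


class SerializedExecutor:
    """Toy executor that lets only one turn run at a time."""

    def __init__(self) -> None:
        self._lock = asyncio.Lock()
        self.log: List[str] = []

    async def run_turn(self, name: str, delay: float = 0.01) -> None:
        # Holding the lock for the whole turn keeps session state race-free.
        async with self._lock:
            self.log.append(f"start {name}")
            await asyncio.sleep(delay)  # stands in for the SDK call
            self.log.append(f"end {name}")


async def demo() -> List[str]:
    executor = SerializedExecutor()
    # Both turns are scheduled concurrently, yet they execute one after the other.
    await asyncio.gather(executor.run_turn("a"), executor.run_turn("b"))
    return executor.log
```

Running asyncio.run(demo()) yields a log in which turn "a" finishes before turn "b" starts.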

Important: Claude Code refuses to run as root with --dangerously-skip-permissions. The Dockerfile creates a non-root agent user.

Codex (`runtime: codex`)

runtime: codex
model: openai/gpt-5.3-codex

Hermes (`runtime: hermes`)

runtime: hermes
model: openai/gpt-4o

OpenClaw (`runtime: openclaw`)

Proxies A2A messages to OpenClaw via an openclaw agent CLI subprocess. OpenClaw handles its own session continuity via --session-id.

runtime: openclaw

Auth: Uses OpenClaw's own authentication (configured during openclaw setup).

Session Continuity (Claude Code)

Claude Code workspaces maintain conversation state across messages using the SDK's resume option:

First message: the SDK's ResultMessage returns a session_id
Subsequent messages: the SDK is called with resume=<session_id> to continue the same conversation
System prompt: only injected on the first message — resumed sessions already have it
Memories: recalled from the platform API on the first turn only; subsequent turns already have context

Session state is stored inside the container at ~/.claude/ and persists across messages but resets on container restart.
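The first-turn versus resumed-turn logic can be sketched as below. Only the resume option and the session_id field come from the description above; the SessionState class and the system_prompt and recall_memories keys are hypothetical names used for illustration.

```python
from typing import Any, Dict, Optional


class SessionState:
    """Tracks the session ID returned by the first turn (in memory only)."""

    def __init__(self) -> None:
        self.session_id: Optional[str] = None

    def options_for_turn(self, system_prompt: str) -> Dict[str, Any]:
        """Build per-turn options: full context on the first turn, resume afterwards."""
        if self.session_id is None:
            # First message: inject the system prompt and recall memories.
            return {"system_prompt": system_prompt, "recall_memories": True}
        # Later messages: the resumed session already carries that context.
        return {"resume": self.session_id, "recall_memories": False}

    def record_result(self, session_id: Optional[str]) -> None:
        """Remember the session ID reported by the result message, if any."""
        if session_id:
            self.session_id = session_id
```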

System Prompt

All runtimes load system-prompt.md from the workspace's config directory (/configs/system-prompt.md). For Claude Code (SDK executor) and other CLI runtimes, the prompt is re-read on each message (supports hot-reload without restart). A2A delegation instructions are appended automatically.

For LangGraph runtimes, the system prompt is built from multiple sources (config, skills, plugins, peer capabilities) at startup.

Auth Token Resolution

The CLI executor resolves auth tokens in this order:

Environment variable — CLAUDE_AUTH_TOKEN, OPENAI_API_KEY, etc.
Token file — /configs/.auth-token (relative to config dir)
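The resolution order above might be implemented along these lines. This is a sketch under assumptions: resolve_auth_token and its parameters are hypothetical, and the real CLI executor may treat whitespace or missing files differently.

```python
import os
from pathlib import Path
from typing import Mapping, Optional, Sequence


def resolve_auth_token(
    env_names: Sequence[str],
    config_dir: str,
    token_file: str = ".auth-token",
    env: Optional[Mapping[str, str]] = None,
) -> Optional[str]:
    """Return the first non-empty token: environment variables first, then the token file."""
    source = os.environ if env is None else env
    for name in env_names:
        value = source.get(name, "").strip()
        if value:
            return value
    path = Path(config_dir) / token_file
    if path.is_file():
        value = path.read_text(encoding="utf-8").strip()
        return value or None
    return None
```

A call such as resolve_auth_token(["CLAUDE_AUTH_TOKEN"], "/configs") would then fall back to /configs/.auth-token only when the variable is unset or empty.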

For Claude Code specifically:

Extract your OAuth access token from the macOS keychain: security find-generic-password -s "Claude Code-credentials" -a "<username>" -w
Write it to workspace-configs-templates/claude-code-default/.auth-token
The provisioner copies this file to each new workspace's config dir

Auto-Provisioning Without Templates

Workspaces can be created without specifying a template. The platform automatically:

Creates a config directory (ws-<id>) under workspace-configs-templates/
Generates a minimal config.yaml with the workspace's name, role, runtime, and model
Copies .auth-token from the claude-code-default template (if it exists)
Merges any files previously uploaded via the Files API
Starts the container

This means you can create a workspace with just:

curl -X POST http://localhost:8080/workspaces \
  -H "Content-Type: application/json" \
  -d '{"name": "My Agent", "role": "Does things", "runtime": "claude-code"}'

The workspace then provisions, registers, and comes online automatically.

Dockerfile

The unified workspace/Dockerfile includes both Python and Node.js:

FROM python:3.11-slim

# Node.js for CLI runtimes (claude-code, codex)
# On Debian-based images npm is a separate package from nodejs
RUN apt-get update && apt-get install -y nodejs npm
RUN npm install -g @anthropic-ai/claude-code

# Non-root user (claude --dangerously-skip-permissions refuses root)
RUN useradd -m -s /bin/bash agent

# Python deps for LangGraph runtime
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

COPY *.py ./
USER agent
CMD ["python", "main.py"]

Inter-Agent Communication (A2A Delegation)

CLI-based workspaces can communicate with other workspaces via two mechanisms:

MCP Tools (Claude Code and other MCP-compatible runtimes)

For MCP-compatible runtimes, an A2A MCP server (a2a_mcp_server.py) is automatically injected via --mcp-config. This gives the agent five MCP tools:

Tool	Description
`list_peers`	Discover sibling/parent/child workspaces (name, ID, status, role)
`delegate_task`	Send a task to a peer and get their response via A2A
`delegate_task_async`	Send a task and return immediately with a task_id (for long tasks)
`check_task_status`	Poll an async task's status and get results when done
`get_workspace_info`	Get this workspace's own metadata

The agent uses these tools naturally — no special instructions needed. Access control is enforced by the platform registry.

Example flow: Marketing uses delegate_task(seo_id, "What is your status?") → A2A message to SEO → SEO responds → result returned to Marketing.

Delegation Error Handling

When delegate_task receives an error from a child (auth failure, timeout, offline), the MCP server wraps it as a DELEGATION FAILED message with instructions for the calling agent to: (1) try a different peer, (2) handle the task itself, or (3) inform the user which peer is unavailable and provide its own best answer. Errors are tagged with a [A2A_ERROR] sentinel prefix so they can be reliably distinguished from normal response text. Coordinator prompts and A2A instructions reinforce that agents must never forward raw error messages to the user.
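A caller that wants to branch on the sentinel could check for it as in this sketch. Only the [A2A_ERROR] prefix comes from the description above; split_delegation_result is a hypothetical helper, and the exact layout of the wrapped message is an assumption.

```python
from typing import Tuple

A2A_ERROR_PREFIX = "[A2A_ERROR]"


def split_delegation_result(text: str) -> Tuple[bool, str]:
    """Return (is_error, payload), stripping the sentinel prefix from error results."""
    stripped = text.lstrip()
    if stripped.startswith(A2A_ERROR_PREFIX):
        return True, stripped[len(A2A_ERROR_PREFIX):].strip()
    return False, text
```

Matching on a fixed prefix, not on words such as "failed", avoids misclassifying a normal peer reply that merely discusses errors.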

CLI Commands (Custom runtimes)

For non-MCP runtimes, A2A instructions are injected into the system prompt. The agent uses bash commands:

a2a peers                          # List available peers
a2a delegate <workspace_id> <task>  # Send task to a peer
a2a info                           # Show workspace info

Both approaches use the same backend: platform registry for discovery, A2A protocol for messaging, and access control enforcement (parent↔child, siblings only).
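The access rule (parent↔child and siblings only) can be expressed as a small predicate. This is an illustrative sketch, not the platform registry's code: can_communicate and the parent_of mapping are hypothetical, and treating two workspaces as siblings only when they share a non-null parent is an assumption.

```python
from typing import Dict, Optional


def can_communicate(a: str, b: str, parent_of: Dict[str, Optional[str]]) -> bool:
    """Allow messaging between parent and child, or between siblings with a shared parent."""
    if a == b or a not in parent_of or b not in parent_of:
        return False
    parent_a, parent_b = parent_of[a], parent_of[b]
    if parent_a == b or parent_b == a:
        return True  # direct parent/child relationship
    # Assumption: top-level workspaces (parent None) are not treated as siblings.
    return parent_a is not None and parent_a == parent_b
```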

Memory Tools

CLI runtimes keep the same memory tool surface as the Python runtime: commit_memory / commit_memory_v2 / search_memory / commit_summary / forget_memory are exposed via the workspace's MCP bridge and route through the platform's v2 memory plugin under the workspace's workspace:<id> namespace. See Memory Architecture for the backend.

Task Status Reporting

Any process inside a workspace container (cron jobs, scripts, background tasks) can update the canvas card display:

python3 -m molecule_runtime.molecule_ai_status "Running weekly SEO audit..."  # show on canvas
python3 -m molecule_runtime.molecule_ai_status ""                              # clear when done

From Python:

from molecule_runtime.molecule_ai_status import set_status
set_status("Analyzing competitor data...")

This pushes an immediate heartbeat with current_task to the platform, which broadcasts via WebSocket to the canvas. The task banner appears instantly on the workspace card.

Key Files

File	Role
`main.py`	Runtime selector — discovers adapter, calls setup/create_executor
`claude_sdk_executor.py`	`ClaudeSDKExecutor` for Claude Code runtime (SDK-based, replaces subprocess)
`executor_helpers.py`	Shared helpers: memory recall/commit, delegation results, heartbeat, system prompt, error sanitization
`cli_executor.py`	`CLIAgentExecutor` for Codex, Ollama, custom runtimes (subprocess-based)
`a2a_executor.py`	`LangGraphA2AExecutor`, shared `set_current_task()`, `_extract_history()`
`adapters/base.py`	`BaseAdapter` interface + `AdapterConfig` dataclass
`adapters/__init__.py`	Auto-discovers adapters from subdirectories
`molecule_ai_status.py`	CLI tool + module for updating canvas task display from any process
`a2a_mcp_server.py`	MCP server exposing A2A delegation tools (list_peers, delegate_task)
`a2a_cli.py`	CLI tool for A2A delegation (all runtimes)
`config.py`	`RuntimeConfig` dataclass, `runtime` field in `WorkspaceConfig`

Rate Limit Handling

Both executors (the SDK executor and the CLI executor) include built-in retry logic with exponential backoff:

Empty responses (common rate limit signal) → retry up to 3 times (5s, 10s, 20s)
Rate limit errors (429, "overloaded") → retry with same backoff
Auth errors (OAuth token transient failures) → retry with backoff
Timeouts → kill subprocess (CLI) or close stream (SDK) and report (no retry)
All error messages are sanitized via sanitize_agent_error() — no raw stderr or exception details leak to the user chat

The A2A CLI (a2a_cli.py) also retries delegation calls on rate limits.
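The backoff schedule listed above (up to 3 retries at 5s, 10s, 20s) could be sketched like this. call_with_retry and its parameters are hypothetical; the sleep function is injectable so the behaviour can be exercised without waiting.

```python
import time
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")

BACKOFF_SECONDS = (5, 10, 20)  # delays taken from the schedule described above


def call_with_retry(
    call: Callable[[], T],
    is_retryable: Callable[[T], bool],
    delays: Sequence[float] = BACKOFF_SECONDS,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Invoke call(); while the result looks retryable, wait and try again.

    Makes at most len(delays) retries and returns the last result either way.
    """
    result = call()
    for delay in delays:
        if not is_retryable(result):
            return result
        sleep(delay)
        result = call()
    return result
```

For example, passing is_retryable=lambda text: text == "" treats an empty response as a rate-limit signal.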

For production with many concurrent agents, consider:

Using different auth tokens per workspace (separate subscriptions)
Staggering agent invocations
Using delegate_task_async for long-running tasks

Known Limitations

Tier 1 (sandboxed): Read-only root filesystem is disabled for CLI runtimes because Claude Code needs writable directories (.claude/, .npm/, /tmp). Tier 1 still restricts the /workspace volume.
Rate limits: All workspaces share the same Claude subscription. Retry logic handles transient rate limits, but sustained high volume needs separate tokens.
Auth token lifecycle: OAuth tokens expire and need refreshing. Use claude setup-token for long-lived tokens in production.

Extending with New Runtimes

To add a new adapter:

Create workspace/adapters/<name>/ with:
- adapter.py — class extending BaseAdapter with setup() and create_executor() methods
- requirements.txt — runtime-specific Python dependencies (installed at container startup)
- __init__.py — exports adapter class as Adapter
The create_executor() method returns an AgentExecutor (from a2a.server.agent_execution) whose execute(context, event_queue) method handles A2A requests.
Use set_current_task() from a2a_executor.py for heartbeat/canvas integration.
Select the new adapter in config.yaml with runtime: <name>
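A minimal adapter following the steps above might look like the sketch below. It is self-contained for illustration only: the BaseAdapter class here is a stand-in for the one in adapters/base.py, an asyncio.Queue stands in for the A2A event queue, a plain string stands in for the request context, and UppercaseExecutor is a made-up executor. A real adapter would import BaseAdapter and return a proper AgentExecutor.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Any


class BaseAdapter(ABC):
    """Stand-in for adapters/base.py; the real interface may have more members."""

    @abstractmethod
    def setup(self, config: Any) -> None: ...

    @abstractmethod
    def create_executor(self, config: Any) -> Any: ...


class UppercaseExecutor:
    """Toy executor: replies with the incoming text in upper case."""

    async def execute(self, context: Any, event_queue: "asyncio.Queue[str]") -> None:
        # A real executor would extract the user message from the A2A context
        # and enqueue A2A events; here both are simplified to plain strings.
        await event_queue.put(str(context).upper())


class Adapter(BaseAdapter):
    """Exported as Adapter so that auto-discovery can find it."""

    def setup(self, config: Any) -> None:
        self.config = config  # validate config, check binaries, and so on

    def create_executor(self, config: Any) -> UppercaseExecutor:
        return UppercaseExecutor()
```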
