molecule-ai/molecule-core

hongming 0ff41fb4bf

docs: #1753 sweep awareness namespace mentions across narrative docs

Companion to #1737 (awareness backend removal) and #1742 (v2 plugin
schema isolation). The backend deletion already landed in #1737; the
two API spec lines that contradict the contract are patched in that
PR too. This sweep handles the ~30 narrative mentions across
architecture, runtime, and README copy that the #1737 review
deliberately deferred to keep the backend PR small.

Files touched:

- docs/architecture/memory.md — replaced §4 "Awareness-backed
  persistence" with §4 "Memory v2 plugin"; updated the practical-
  summary bullet to describe v2 plugin search instead of "enable
  awareness namespaces".
- docs/architecture/molecule-technical-doc.md — rewrote the "Four
  Memory Surfaces" table so the v2 plugin is the production row and
  agent_memories is correctly labeled as frozen legacy; removed the
  Awareness MCP Server example (§28 — never shipped); dropped
  awareness_client.py and awareness-memory references from the tool
  + MCP-server tables; updated the env-var tables to drop AWARENESS_*
  and add MEMORY_PLUGIN_URL with the actual production value.
- docs/agent-runtime/workspace-runtime.md — dropped AWARENESS_URL /
  AWARENESS_NAMESPACE from the example env block; rewrote the
  "Awareness And Memory Integration" section as "Memory Integration"
  with the actual v2 plugin contract (commit_memory_v2, namespace
  resolver, plugin-on-tenant-EC2 deployment).
- docs/agent-runtime/cli-runtime.md — rewrote "Workspace Awareness"
  section as "Memory Tools" pointing at the v2 plugin.
- docs/agent-runtime/config-format.md — dropped AWARENESS_URL /
  AWARENESS_NAMESPACE from the optional env list.
- docs/index.md — updated the "Hierarchical Memory" hero card +
  Memory row in the current-capability table to describe the v2
  plugin instead of awareness namespaces.
- README.md and README.zh-CN.md — replaced "awareness namespaces" /
  "awareness namespace" / "workspace awareness namespaces" wording
  throughout with v2-plugin-accurate equivalents (HMA + v2 plugin,
  per-workspace namespaces, pgvector semantic recall).

Deliberately not touched:

- docs/engineering/postmortem-2026-04-23-boot-event-401.md:109 uses
  "awareness" in the generic sense ("alert latency from merge to
  awareness") — unrelated to the feature.
- docs/architecture/molecule-technical-doc.md:631 ("Peer system
  prompts rebuilt with new awareness") — same, generic verb.
- docs/api-reference.md and docs/api-protocol/platform-api.md are
  patched in #1737 (the same fixed lines this would touch). When
  #1737 merges, those two contradictions resolve.

No code change, no migration, no tests.

Refs: closes #1753, follow-up to #1737 / #1735 / #1742.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

2026-05-23 18:11:45 -07:00


Workspace Runtime

The workspace/ directory is Molecule AI's unified runtime image. Every provisioned workspace starts from this image, loads its own config, selects a runtime adapter, registers an Agent Card, exposes A2A, and joins the platform heartbeat/activity loop.

Runtime Matrix In Current `main`

Current main ships six adapters:

langgraph
deepagents
claude-code
crewai
autogen
openclaw

This is the merged runtime surface today. Branch-level experiments such as NemoClaw are separate and should be treated as roadmap/WIP, not merged support.

Adapter-specific behavior is documented in Agent Runtime Adapters.
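As a small illustration of the matrix above, the sketch below rejects any runtime name that is not one of the six merged adapters. The `SUPPORTED_RUNTIMES` tuple and `validate_runtime` helper are hypothetical names for this example, not identifiers from the runtime source, and the real adapter registry may resolve names differently.

```python
# Hypothetical allow-list built from the six adapters listed above.
SUPPORTED_RUNTIMES = (
    "langgraph",
    "deepagents",
    "claude-code",
    "crewai",
    "autogen",
    "openclaw",
)


def validate_runtime(name: str) -> str:
    """Return name if it is one of the merged adapters, otherwise raise."""
    if name not in SUPPORTED_RUNTIMES:
        supported = ", ".join(SUPPORTED_RUNTIMES)
        raise ValueError(
            f"unsupported runtime {name!r}; expected one of: {supported}"
        )
    return name
```

A branch-level experiment such as `nemoclaw` would fail this check, which matches the guidance to treat it as roadmap/WIP.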

What The Runtime Is Responsible For

loading config.yaml
running preflight checks before the workspace goes live
selecting an adapter based on runtime
loading local skills plus plugin-mounted shared rules/skills
constructing an Agent Card
serving A2A over HTTP
registering with the platform and sending heartbeats
reporting activity and task state
proxying durable memory tools through the v2 memory plugin
hot-reloading skills while the workspace is running

Environment Model

Common runtime environment variables:

WORKSPACE_ID=ws-123
WORKSPACE_CONFIG_PATH=/configs
PLATFORM_URL=http://platform:8080
PARENT_ID=
LANGFUSE_HOST=http://langfuse-web:3000
LANGFUSE_PUBLIC_KEY=...
LANGFUSE_SECRET_KEY=...

Important behavior:

WORKSPACE_CONFIG_PATH points at the mounted config directory for that workspace.
Memory MCP tools route through the platform's v2 memory plugin (see Memory Architecture doc); there is no per-workspace memory env var anymore — the plugin sidecar is provisioned at the tenant EC2 boundary.
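A minimal sketch of reading these variables into a typed object follows. `RuntimeEnv` and `load_runtime_env` are hypothetical names, and two behaviors are assumptions rather than documented facts: that `WORKSPACE_ID` is mandatory, and that the fallbacks mirror the example values above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class RuntimeEnv:
    """Hypothetical container for the common runtime variables."""

    workspace_id: str
    config_path: str
    platform_url: str
    parent_id: Optional[str]


def load_runtime_env(env: Mapping[str, str] = os.environ) -> RuntimeEnv:
    # Assumption: a workspace cannot start without an ID.
    workspace_id = env.get("WORKSPACE_ID", "")
    if not workspace_id:
        raise ValueError("WORKSPACE_ID must be set")
    return RuntimeEnv(
        workspace_id=workspace_id,
        # Assumption: fall back to the example values shown above.
        config_path=env.get("WORKSPACE_CONFIG_PATH", "/configs"),
        platform_url=env.get("PLATFORM_URL", "http://platform:8080").rstrip("/"),
        # An empty PARENT_ID, as in the example, is treated as "no parent".
        parent_id=env.get("PARENT_ID") or None,
    )
```

Note that no memory-related variable appears here, consistent with the plugin being provisioned outside the workspace.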

Startup Sequence

At a high level, workspace/main.py does this:

1. Initialize telemetry.
2. Load config.yaml.
3. Run preflight validation.
4. Build the heartbeat loop.
5. Resolve the adapter from config.runtime.
6. Let the adapter run setup() and build an executor.
7. Build the Agent Card from loaded skills and runtime config.
8. Register the workspace with POST /registry/register.
9. Start heartbeats.
10. Start the skill watcher when skills are configured.
11. Serve the A2A app through Uvicorn.
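The ordering matters: a failed preflight should stop the workspace before it registers or serves traffic. The sketch below models only that ordering, with stub callables standing in for the real work. `STARTUP_ORDER` and `run_startup` are hypothetical names, and this is not how `workspace/main.py` is actually structured.

```python
from typing import Callable, Dict, List

# Step names are illustrative labels for the sequence described above.
STARTUP_ORDER = [
    "init_telemetry",
    "load_config",
    "preflight",
    "build_heartbeat",
    "resolve_adapter",
    "adapter_setup",
    "build_agent_card",
    "register",
    "start_heartbeats",
    "start_skill_watcher",
    "serve_a2a",
]


def run_startup(
    handlers: Dict[str, Callable[[], None]], skills_configured: bool
) -> List[str]:
    """Run each step in order and return the names of the steps that ran."""
    done: List[str] = []
    for step in STARTUP_ORDER:
        # The skill watcher only starts when skills are configured.
        if step == "start_skill_watcher" and not skills_configured:
            continue
        # Any exception propagates, so later steps never run after a failure.
        handlers[step]()
        done.append(step)
    return done
```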

Core Runtime Pieces

File	Responsibility
`main.py`	Entry point, adapter bootstrap, Agent Card registration, heartbeat startup, initial prompt execution
`config.py`	Parses `config.yaml` into the runtime config dataclasses
`adapters/`	Adapter registry and adapter implementations
`claude_sdk_executor.py`	`ClaudeSDKExecutor` — Claude Code runtime via `claude-agent-sdk` (replaces subprocess)
`executor_helpers.py`	Shared helpers for all executors: memory, delegation, heartbeat, system prompt, error sanitization
`a2a_executor.py`	Shared LangGraph execution bridge and current-task reporting
`cli_executor.py`	`CLIAgentExecutor` — subprocess executor for Codex, Ollama, custom runtimes
`skills/loader.py`	Parses `SKILL.md`, loads tool modules, returns loaded skill metadata
`skills/watcher.py`	Hot reload path for skill changes
`plugins.py`	Scans mounted plugins for shared rules, prompt fragments, and extra skills
`tools/memory.py`	Agent memory tools (route through the platform's v2 memory plugin via the workspace-server proxy)
`coordinator.py`	Coordinator-only delegation path for team leads

Skills, Plugins, And Hot Reload

The runtime combines three sources of capability:

workspace-local skills from skills/<skill>/SKILL.md
plugin-mounted rules and shared skills from /plugins
built-in tools like delegation, approval, memory, sandbox, and telemetry helpers

Hot reload matters because the runtime is designed to keep a workspace alive while its capability surface evolves:

edit SKILL.md
add/remove skill files
update tool modules
modify config prompt references

The watcher rescans the skill package, rebuilds the agent tool surface, and updates the Agent Card so peers and the canvas reflect the new capabilities.
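One way to detect which skills changed is to hash every file under the skills directory and compare snapshots. The sketch below does that by polling. It is a swapped-in technique for illustration, not the mechanism used by `skills/watcher.py`, and `snapshot_skills` and `changed_skills` are hypothetical names.

```python
import hashlib
from pathlib import Path
from typing import Dict, Set


def snapshot_skills(skills_dir: Path) -> Dict[str, str]:
    """Map each file under skills_dir (relative path) to a SHA-256 digest."""
    snap: Dict[str, str] = {}
    for path in sorted(skills_dir.rglob("*")):
        if path.is_file():
            rel = path.relative_to(skills_dir).as_posix()
            snap[rel] = hashlib.sha256(path.read_bytes()).hexdigest()
    return snap


def changed_skills(old: Dict[str, str], new: Dict[str, str]) -> Set[str]:
    """Return the top-level skill names with added, removed, or edited files."""
    changed: Set[str] = set()
    for rel in old.keys() | new.keys():
        if old.get(rel) != new.get(rel):
            # skills/<skill>/SKILL.md -> "<skill>"
            changed.add(rel.split("/", 1)[0])
    return changed
```

A caller would take a snapshot on an interval and, when `changed_skills` returns a non-empty set, rebuild the tool surface and refresh the Agent Card.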

Memory Integration

The runtime keeps the agent-facing contract stable:

commit_memory(content, scope) — legacy MCP name, routed through the v2 plugin's scope→namespace shim
commit_memory_v2(content, namespace) — direct v2 surface
search_memory(query, namespace?) — v2 plugin search with FTS + semantic scoring when the plugin declares the capability

All writes land in the workspace's workspace:<workspace_id> namespace unless the agent passes an explicit one. Cross-workspace namespaces (team:<root>, org:<root>) follow the platform's namespace ACL (internal/memory/namespace/resolver.go). There is no per-workspace memory env var on the runtime side — the plugin lives on the tenant EC2 (MEMORY_PLUGIN_URL=http://localhost:9100, set by CP user-data / entrypoint-tenant.sh) and the workspace-server proxies all memory calls through it.
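The default-namespace rule can be sketched as below. `resolve_write_namespace` is a hypothetical helper, and it deliberately does not model the ACL check, which the text attributes to the platform's resolver.

```python
from typing import Optional


def default_namespace(workspace_id: str) -> str:
    """Build the per-workspace namespace, e.g. workspace:ws-123."""
    return f"workspace:{workspace_id}"


def resolve_write_namespace(
    workspace_id: str, namespace: Optional[str] = None
) -> str:
    """An explicit namespace wins; otherwise use the workspace's own."""
    if namespace:
        # Whether the caller may write here is decided by the platform ACL,
        # which this sketch does not attempt to reproduce.
        return namespace
    return default_namespace(workspace_id)
```

For example, `resolve_write_namespace("ws-123")` yields `workspace:ws-123`, while passing `team:<root>` returns that value unchanged and leaves authorization to the platform.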

See Memory Architecture for the full backend story.

Coordinator Enforcement

coordinator.py is not a generic “smart agent” mode. It is intentionally strict:

coordinators delegate
coordinators synthesize
coordinators do not quietly do the child work themselves

This matters because Molecule AI wants hierarchy to remain operationally real, not cosmetic.

Remote Agent Registration (External Workspaces)

External workspaces run outside the platform's Docker infrastructure — on your laptop, a cloud VM, an on-prem server, or a CI/CD agent. They register via the platform API and send heartbeats to stay live on the canvas.

How it differs from Docker workspaces

	Docker workspace	External workspace
Provisioning	Platform spins up a container	You provide the machine; platform just tracks it
Liveness	Docker health sweep	Heartbeat TTL (90s offline threshold)
Registration	Automatic at container start	Manual: `POST /workspaces` + `POST /registry/register`
Token	Inherited from container env	Minted at registration, shown once
Secrets	Baked in image or env var	Pulled from platform at boot via `GET /workspaces/:id/secrets`

Registration flow

1. Create the workspace:

curl -X POST http://localhost:8080/workspaces \
  -H "Authorization: Bearer <admin-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-remote-agent",
    "runtime": "external",
    "external": true,
    "url": "https://my-agent.example.com/a2a",
    "parent_id": "ws-pm-123"
  }'

Returns { "id": "ws-xyz", "platform_url": "http://localhost:8080" }.

2. Register the agent with the platform:

curl -X POST http://localhost:8080/registry/register \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer <admin-token>" \
  -d '{
    "workspace_id": "ws-xyz",
    "name": "my-remote-agent",
    "description": "Runs on a cloud VM in us-east-1",
    "skills": ["research", "summarization"],
    "url": "https://my-agent.example.com/a2a"
  }'

The platform returns a 256-bit bearer token — save it, it is shown only once.
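The same registration call can be prepared from Python with the standard library. This sketch only builds the request; `build_register_request` is a hypothetical helper, and the shape of the response (including the name of the field that carries the token) is not documented here, so it is left unparsed.

```python
import json
import urllib.request
from typing import List


def build_register_request(
    platform_url: str,
    token: str,
    workspace_id: str,
    name: str,
    description: str,
    skills: List[str],
    url: str,
) -> urllib.request.Request:
    """Build the POST /registry/register request shown in step 2."""
    body = {
        "workspace_id": workspace_id,
        "name": name,
        "description": description,
        "skills": skills,
        "url": url,
    }
    return urllib.request.Request(
        f"{platform_url.rstrip('/')}/registry/register",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )
```

Sending it is a matter of `urllib.request.urlopen(req, timeout=10)`. Store the returned token somewhere durable right away, since it is shown only once.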

3. Pull secrets at boot:

curl http://localhost:8080/workspaces/ws-xyz/secrets \
  -H "Authorization: Bearer <your-token>"

Returns { "ANTHROPIC_API_KEY": "...", "OPENAI_API_KEY": "..." }. No credentials baked into images or env files.
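Once the response body is in hand, the secrets can be exported into the process environment without touching disk. The sketch below assumes the flat JSON object shown above; `apply_secrets` is a hypothetical helper, and the HTTP fetch itself is omitted.

```python
import json
from typing import List, MutableMapping


def apply_secrets(body: bytes, env: MutableMapping[str, str]) -> List[str]:
    """Load a secrets response into env and return the key names only."""
    secrets = json.loads(body)
    if not isinstance(secrets, dict):
        raise ValueError("expected a JSON object of secret name to value")
    for key, value in secrets.items():
        env[key] = str(value)
    # Return names, not values, so callers can log what was loaded safely.
    return sorted(secrets)
```

At boot, an agent might call `apply_secrets(response_body, os.environ)` before constructing any model client.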

4. Send heartbeats every 30 seconds:

curl -X POST http://localhost:8080/registry/heartbeat \
  -H "Authorization: Bearer <your-token>" \
  -H "Content-Type: application/json" \
  -d '{
    "workspace_id": "ws-xyz",
    "status": "online",
    "task": "analyzing Q1 sales data",
    "error_rate": 0.0
  }'

If the platform misses two consecutive heartbeats and the 90-second heartbeat TTL lapses, the workspace shows offline on the canvas.
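The timing can be made concrete with a short sketch. The 30-second interval and 90-second threshold come from this page; `heartbeat_payload` and `is_offline` are hypothetical helpers, and treating the boundary as inclusive is an assumption about platform behavior.

```python
import json

HEARTBEAT_INTERVAL_S = 30  # send cadence from step 4
OFFLINE_TTL_S = 90  # offline threshold from the comparison table


def heartbeat_payload(
    workspace_id: str,
    task: str = "",
    error_rate: float = 0.0,
    status: str = "online",
) -> bytes:
    """Encode the JSON body for POST /registry/heartbeat."""
    return json.dumps(
        {
            "workspace_id": workspace_id,
            "status": status,
            "task": task,
            "error_rate": error_rate,
        }
    ).encode("utf-8")


def is_offline(
    last_heartbeat_at: float, now: float, ttl_s: float = OFFLINE_TTL_S
) -> bool:
    """True once ttl_s seconds have passed since the last heartbeat.

    Assumption: the boundary is inclusive; the platform may differ.
    """
    return now - last_heartbeat_at >= ttl_s
```

An agent loop would send `heartbeat_payload(...)` and then sleep for `HEARTBEAT_INTERVAL_S`, ideally using a monotonic clock so that wall-clock adjustments do not skew the cadence.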

5. A2A with X-Workspace-ID header:

When sending A2A messages to sibling or parent workspaces, include the header so the platform can verify mutual auth:

curl -X POST http://localhost:8080/workspaces/ws-pm-123/a2a \
  -H "Authorization: Bearer <your-token>" \
  -H "X-Workspace-ID: ws-xyz" \
  -H "Content-Type: application/json" \
  -d '{"type": "status_report", "payload": {...}}'

Behind NAT — Cloudflare Tunnel / ngrok

If the agent machine has no public IP, use an outbound tunnel:

# ngrok
ngrok http 8000 --url https://my-agent.ngrok.io

# Cloudflare Tunnel
cloudflared tunnel run --token <token>

# Register the tunnel URL (not localhost)
curl -X POST http://localhost:8080/registry/update-card \
  -H "Authorization: Bearer <your-token>" \
  -H "Content-Type: application/json" \
  -d '{"workspace_id": "ws-xyz", "url": "https://my-agent.ngrok.io/a2a"}'

The agent initiates the outbound WebSocket to the platform — no inbound ports need to be opened on the firewall.

Revocation and re-registration

To revoke and re-register:

# Delete the workspace
curl -X DELETE http://localhost:8080/workspaces/ws-xyz \
  -H "Authorization: Bearer <admin-token>"

# Create fresh (new workspace_id, new token)

Re-registering with the same workspace_id does not issue a new token; keep using the token saved from the first registration.

Related docs

Full step-by-step: External Agent Registration Guide
Tutorial with CI/CD examples: Register a Remote Agent
API reference: Registry and Heartbeat

A2A And Registration

Each workspace exposes an A2A server, builds an Agent Card, and registers with the platform. The platform is used for:

discovery
liveness
event fanout
proxying browser-initiated A2A calls

But the long-term collaboration model remains direct workspace-to-workspace communication via A2A.

Known Limitations

Playwright / browser system libs are not installed

The base molecule-ai-workspace-runtime image (workspace/Dockerfile) is built on python:3.11-slim with Node.js 22, git, and gh — about 500 MB. It deliberately does not include the system libraries Chromium needs (libnss3, libatk-bridge2.0-0, libxkbcommon0, libcups2, libdrm2, libxcomposite1, libxdamage1, libxrandr2, libgbm1, libpango-1.0-0, libasound2, etc.). Adding them would inflate the image by ~200–250 MB (~40%) for every workspace, even though only frontend / QA workspaces ever launch a browser.

Practical consequences:

npx playwright test (and any other Chromium-driven E2E tooling) will fail at browser launch when run from inside an in-container workspace agent.
The failure shows up as missing-shared-object messages such as error while loading shared libraries: libnss3.so or Host system is missing dependencies to run browsers.
Unit and integration tests (Vitest, Jest, etc.) that don't spawn a real browser are unaffected.
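An agent can check for these libraries up front instead of discovering the gap at browser launch. The sketch below uses `ctypes.util.find_library` from the standard library. The `CHROMIUM_LIBS` tuple is a partial, hand-picked subset of the packages named above, mapped to linker short names as an assumption, and `missing_libs` is a hypothetical helper.

```python
import ctypes.util
from typing import Callable, Iterable, List, Optional

# Partial subset for illustration: libnss3 -> "nss3", libgbm1 -> "gbm", etc.
CHROMIUM_LIBS = ("nss3", "xkbcommon", "cups", "drm", "gbm", "asound")


def missing_libs(
    names: Iterable[str] = CHROMIUM_LIBS,
    find: Callable[[str], Optional[str]] = ctypes.util.find_library,
) -> List[str]:
    """Return the short names of shared libraries the loader cannot locate."""
    return [name for name in names if find(name) is None]
```

If `missing_libs()` returns a non-empty list inside a workspace container, skip the browser-driven suite and defer E2E to CI.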

Recommended workflow:

Run E2E in CI, not in-container. The Gitea Actions self-hosted runner (and the GitHub Actions runner used by mirror repos) has the full Playwright dep set installed and is the supported surface for E2E. Push a branch, let CI run the suite.
Local debugging of a single failing spec is best done on a developer laptop with npx playwright install-deps run once.
In-container iteration on test logic itself is fine — write specs, lint them, type-check them — just don't expect playwright test to actually launch a browser.

If a particular workspace role genuinely needs in-container E2E (a dedicated QA template, for instance), the right place to layer Playwright deps is in a role-specific adapter template image that does FROM molecule-ai-workspace-runtime:<tag> and adds RUN npx playwright install-deps. Open a request against molecule-ai-workspace-runtime if you need this template stamped.

Tracking issue: molecule-ai/molecule-app#7.
