docs(tutorials): add Self-Hosted AI Agents deployment guide

Covers Docker, Fly Machines, and bare metal deployment models with use cases, configuration examples, and a comparison table. Captures keywords from SEO brief #1126: self-hosted AI agents platform, remote AI agent deployment. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 10:26:41 +00:00 · 2026-04-21 10:26:41 +00:00 · 8adf972dd9
commit 8adf972dd9
parent dbf53aa901
1 changed files with 242 additions and 0 deletions
--- a/docs/tutorials/self-hosted-ai-agents.md
+++ b/docs/tutorials/self-hosted-ai-agents.md
@ -0,0 +1,242 @@
+---
+title: "Self-Hosted AI Agents: Molecule AI on Docker, Fly Machines, or Bare Metal"
+date: 2026-04-21
+slug: self-hosted-ai-agents-molecule-ai
+description: "Molecule AI runs anywhere — Docker containers, Fly Machines, or bare metal. This guide covers all three deployment models, when to use each, and how to choose for your infra constraints."
+tags: [self-hosted, deployment, Docker, Fly Machines, tutorial, infrastructure]
+---
+
+# Self-Hosted AI Agents: Molecule AI on Docker, Fly Machines, or Bare Metal
+
+Molecule AI is designed to run wherever your agents need to run. Whether you're deploying on a single VPS, distributing agents across cloud VMs, or running on hardware that can't be containerized, Molecule AI has a path that fits.
+
+This guide covers the three deployment models — Docker containers, Fly Machines, and bare metal — with concrete use cases and configuration for each.
+
+## Choosing a Deployment Model
+
+| Model | Best for | Provisioning | Cold start | Isolation |
+|---|---|---|---|---|
+| **Docker** | Single-host, dev/test, one-box production | Manual (`docker run`) or Docker Compose | ~15–30s | Shared kernel |
+| **Fly Machines** | Multi-region, auto-scaling, per-tenant isolation | Platform API (`POST /workspaces`) | <1s | Firecracker microVM |
+| **Bare metal / remote** | On-prem, laptops, CI/CD, air-gapped | Manual registration | N/A | None (your infra) |
+
+All three models use the same agent runtime and A2A protocol. The differences are in how agents are provisioned, how secrets are delivered, and how liveness is tracked.
+
+## Model 1: Docker Containers
+
+The default deployment. The platform manages container lifecycle — you get workspace provisioning, secret injection, and platform heartbeat handling out of the box.
+
+**How it works:**
+
+```
+POST /workspaces → platform runs `docker run ghcr.io/molecule-ai/workspace-<runtime>`
+```
+
+The platform injects `WORKSPACE_ID`, `PLATFORM_URL`, and workspace secrets as environment variables before the container starts. The agent inside registers itself via `POST /registry/register` on boot, and the platform sends health checks through Docker's health subsystem.
+
+**Configuration:**
+
+```bash
+# Your platform's .env
+CONTAINER_BACKEND=docker        # default
+PLATFORM_URL=https://your-host   # reachable from containers
+WORKSPACE_IMAGE_PREFIX=ghcr.io/molecule-ai/workspace-
+
+# Optional: restrict which runtimes are allowed
+ALLOWED_RUNTIMES=hermes,claude-code,langgraph
+
+# For CI on the same host:
+WORKSPACE_NETWORK=host          # use host network for zero-config networking
+```
+
+**When to choose Docker:**
+- Single-host deployments (VPS, single EC2)
+- Dev/test environments where isolation is less critical
+- Teams that already have Docker infra
+- You want the platform to handle provisioning automatically
+
+## Model 2: Fly Machines
+
+Fly Machines are Firecracker microVMs managed by the Fly.io API. They offer sub-second cold starts, multi-region placement, and hardware-level isolation between workspaces — without the shared kernel risk of Docker.
+
+**How it works:**
+
+```
+POST /workspaces → platform calls Fly API → Fly Machine boots workspace image
+```
+
+The platform talks to Fly Machines API directly, passing workspace config and secrets as environment variables. The same agent runtime runs inside the Machine.
+
+**Configuration:**
+
+```bash
+# Your platform's .env
+CONTAINER_BACKEND=flyio
+FLY_API_TOKEN=<fly-deploy-token>          # flyctl tokens create deploy
+FLY_WORKSPACE_APP=my-molecule-workspaces   # Fly app for workspace Machines
+FLY_REGION=ord                             # default region (or leave for auto)
+```
+
+**Resource tiers** (configured per workspace via `"tier": 2|3|4`):
+
+| Tier | RAM | CPUs | Use case |
+|---|---|---|---|
+| T2 | 512 MB | 1 | Light workers, eval agents |
+| T3 | 2 GB | 2 | General-purpose orchestrators |
+| T4 | 4 GB | 4 | Heavy inference, long-context tasks |
+
+**Setting tier on creation:**
+
+```bash
+curl -X POST https://platform.moleculesai.app/workspaces \
+  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
+  -d '{
+    "name": "eu-worker",
+    "runtime": "hermes",
+    "tier": 3,
+    "metadata": { "region": "ams" }
+  }'
+```
+
+Fly picks the closest region to the `region` metadata field, or defaults to `FLY_REGION`.
+
+**When to choose Fly Machines:**
+- Multi-tenant SaaS where workspace isolation matters (Firecracker = no shared kernel)
+- Sub-second cold starts matter (queue workers, on-demand workers)
+- You want multi-region agent distribution without managing your own fleet
+- You want pay-per-second billing instead of always-on VMs
+
+**See:** [Provision Workspaces on Fly Machines](/docs/tutorials/fly-machines-provisioner) — full walkthrough with `flyctl` commands.
+
+## Model 3: Bare Metal / Remote Agents
+
+For agents that can't be containerized — on-prem hardware, laptops, CI/CD runners — Molecule AI ships a registration API. Your agent registers with the platform, receives a bearer token, and maintains canvas visibility via a heartbeat loop.
+
+This is the most flexible model. The platform doesn't manage the agent's lifecycle — it just provides a coordination layer (fleet visibility, secret management, A2A routing).
+
+**How it works:**
+
+1. Create an external workspace via the API
+2. Register the agent and receive a one-time bearer token
+3. The agent starts a 30-second heartbeat loop
+4. The canvas shows the agent with a **REMOTE** badge
+
+**Step-by-step registration:**
+
+```bash
+ADMIN_TOKEN="your-admin-token"
+PLATFORM_URL="https://platform.moleculesai.app"
+AGENT_URL="https://your-agent.example.com"  # must be HTTPS and reachable
+
+# 1. Create external workspace
+WORKSPACE=$(curl -s -X POST "${PLATFORM_URL}/workspaces" \
+  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"name\": \"CI Agent\",
+    \"runtime\": \"external\",
+    \"external\": true,
+    \"url\": \"${AGENT_URL}\"
+  }")
+WORKSPACE_ID=$(echo $WORKSPACE | jq -r '.id')
+
+# 2. Register and receive bearer token
+REG=$(curl -s -X POST "${PLATFORM_URL}/registry/register" \
+  -H "Authorization: Bearer ${ADMIN_TOKEN}" \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"id\": \"${WORKSPACE_ID}\",
+    \"url\": \"${AGENT_URL}\",
+    \"agent_card\": {\"name\": \"CI Agent\", \"runtime\": \"external\"}
+  }")
+AUTH_TOKEN=$(echo $REG | jq -r '.auth_token')
+
+# 3. Heartbeat every 30s
+curl -s -X POST "${PLATFORM_URL}/registry/heartbeat" \
+  -H "Authorization: Bearer ${AUTH_TOKEN}" \
+  -d "{\"workspace_id\": \"${WORKSPACE_ID}\"}"
+```
+
+**Agent-side heartbeat (Python):**
+
+```python
+import requests, time, threading
+
+AUTH_TOKEN = "<from registration>"
+WORKSPACE_ID = "<from registration>"
+PLATFORM_URL = "https://platform.moleculesai.app"
+
+def heartbeat_loop():
+    while True:
+        requests.post(
+            f"{PLATFORM_URL}/registry/heartbeat",
+            headers={"Authorization": f"Bearer {AUTH_TOKEN}"},
+            json={"workspace_id": WORKSPACE_ID},
+        )
+        time.sleep(30)
+
+threading.Thread(target=heartbeat_loop, daemon=True).start()
+```
+
+**For agents behind NAT or firewall:**
+
+The platform needs to reach `AGENT_URL` for inbound A2A messages. Expose your agent with a tunnel:
+
+```bash
+# Cloudflare Tunnel (recommended for production)
+cloudflared tunnel --url http://localhost:8080
+
+# Or ngrok (quick dev/test)
+ngrok http 8080
+```
+
+Copy the public URL and use it as `AGENT_URL` in the registration call.
+
+**When to choose bare metal / remote:**
+- On-prem hardware that can't be containerized
+- Laptops or workstations where Docker isn't practical
+- CI/CD runners (GitHub Actions, Jenkins) that spin up per job
+- Air-gapped networks
+- Any scenario where the platform shouldn't own the agent's lifecycle
+
+**See:** [Register a Remote Agent on Molecule AI](/docs/tutorials/register-remote-agent) — full tutorial with CI/CD examples and minimal Python agent.
+
+## Comparing the Three Models
+
+| | Docker | Fly Machines | Bare Metal / Remote |
+|---|---|---|---|
+| Provisioning | Platform (`docker run`) | Platform (Fly API) | Manual via API |
+| Secrets | Injected as env vars at boot | Injected as env vars at boot | Pulled on demand via API |
+| Heartbeat | Platform (Docker health) | Platform (health check) | Agent sends every 30s |
+| Canvas badge | None (standard) | None (standard) | Purple REMOTE |
+| Cold start | ~15–30s | <1s | N/A |
+| Isolation | Shared kernel | Hardware (Firecracker) | None (your infra) |
+| Lifecycle managed | ✅ Yes | ✅ Yes | ❌ No (your code) |
+| Works with existing infra | ❌ No | ❌ No | ✅ Yes |
+| Best for | Single-host, dev/test | Multi-region, SaaS | On-prem, CI/CD, laptops |
+
+## Mixing Deployment Models
+
+You can combine models in the same organization. A typical production setup might look like:
+
+- **CI/CD agents** → bare metal / remote (register per pipeline run)
+- **Queue workers** → Fly Machines (auto-scale, sub-second spin-up)
+- **Staging / dev** → Docker on a single VPS
+- **Long-running services** → Fly Machines in the region closest to your users
+
+All of these show up on the same canvas, visible to the same orchestrator, reachable via A2A. The deployment model is an implementation detail — the coordination layer is uniform.
+
+## Which Model Should You Use?
+
+**Start with Docker** if you're evaluating Molecule AI or running on a single host. It's the lowest friction path.
+
+**Move to Fly Machines** when you need multi-region, per-tenant isolation, or sub-second scaling. The platform handles Fly provisioning automatically — just set env vars and `POST /workspaces`.
+
+**Add remote / bare metal** when you have agents that can't live in either container model — on-prem hardware, CI/CD runners, or air-gapped networks. Register them via API and they join the fleet alongside container-provisioned agents.
+
+→ [Register a Remote Agent](/docs/tutorials/register-remote-agent) — bare metal tutorial
+→ [Provision Workspaces on Fly Machines](/docs/tutorials/fly-machines-provisioner) — Fly Machines walkthrough
+→ [Platform API Reference](/docs/api-reference) — full endpoint documentation
+
+---
+*Molecule AI is open source. All three deployment models are documented in `docs/tutorials/` on `main`.*