Compare commits


40 Commits

Author SHA1 Message Date
Hongming Wang
975e084107 docs(tutorials): add AWS EC2 provisioner tutorial — replace deprecated Fly tutorial
Fills the gap noted in PR #121's report: the deprecation banner on
fly-machines-provisioner.md previously pointed straight at the
controlplane source file (ec2.go) because there was no published
tutorial. This PR replaces that link with a real tutorial that walks
through env vars, the POST /cp/orgs and POST /workspaces flows, the
tier-sizing table, lifecycle ops, and the migration rationale.

Banner updated in fly-machines-provisioner.md to point at the new page.
2026-05-04 05:09:58 -07:00
Hongming Wang
fc69c38e32 docs: align runtime lists with workspace-server allowlist; remove stale NemoClaw/NVIDIA claims
Cross-checked every runtime list against the canonical AllRuntimes
allowlist in workspace-server/internal/handlers/admin_workspace_images.go
(claude-code, langgraph, crewai, autogen, deepagents, hermes, gemini-cli,
openclaw). Lists were drifting toward an old 6-runtime snapshot, missing
hermes + gemini-cli; one file referenced a non-existent NemoClaw branch
plus NVIDIA T4 hardware that has no source presence anywhere in the
monorepo (verified: no `feat/nemoclaw-t4-docker` branch, no `nemoclaw`
references outside docs).

Files changed:

content/docs/agent-runtime/workspace-runtime.md
  - Before: "Current `main` ships six adapters" listing langgraph,
    deepagents, claude-code, crewai, autogen, openclaw + a
    "Branch-level experiments such as NemoClaw" sentence.
  - After: "Current `main` ships eight adapters" — added hermes +
    gemini-cli; removed NemoClaw line; pointed readers at AllRuntimes
    as the source of truth and external-workspace path for BYO.

content/docs/architecture/molecule-technical-doc.md (two locations)
  - Before (runtime table ~line 575): 6 rows + "Branch-level WIP:
    NemoClaw (NVIDIA T4 + Docker socket) on feat/nemoclaw-t4-docker".
  - After: 8 rows including Hermes + Gemini CLI; NemoClaw line
    deleted; AllRuntimes pointer added.
  - Before (Branch-Level Work table ~line 981): row
    "feat/nemoclaw-t4-docker | NemoClaw adapter (NVIDIA T4 support) | WIP".
  - After: row removed (the branch does not exist).

content/docs/architecture/overview.md
  - Before: "Supports LangGraph, Claude Code, OpenClaw, DeepAgents,
    CrewAI, AutoGen."
  - After: "Supports Claude Code, LangGraph, CrewAI, AutoGen, DeepAgents,
    Hermes, Gemini CLI, and OpenClaw."

content/docs/glossary.md
  - Before: runtime defined as "one of langgraph, claude-code, openclaw,
    crewai, autogen, deepagents, hermes" (7, missing gemini-cli).
  - After: full 8-entry list ordered to match AllRuntimes.

Out-of-scope sweep results (no changes needed):
- "NeMo Guardrails" — zero references in molecule-docs (the
  "guardrails" word elsewhere refers to generic plugin/role guardrails,
  not the NVIDIA NeMo Guardrails product). Nothing to remove.
- "Nemotron" — only present in the monorepo as a NIM model alias
  inside a model-dispatcher test; no first-class model claim in docs.
- Model lists in docs match workspace/agent.py (Anthropic, OpenAI,
  Gemini, MiniMax, DeepSeek, Moonshot, local Ollama/vLLM); no
  Nemotron claims to remove.
- Observability sinks (OpenTelemetry, Langfuse, Sentry, Prometheus
  references) match implementation; no NeMo Guardrails sink claimed.
- index.mdx, concepts.mdx, architecture.mdx already list the full 8
  runtimes; no edit needed there.

Verified via `npm run build` — all 111 static pages regenerate clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 05:06:09 -07:00
Hongming Wang
4590619b43
Merge branch 'main' into fix/docs-fly-to-aws-railway-migration 2026-05-04 04:58:10 -07:00
Hongming Wang
e55c44818d docs: update Fly Machines references to AWS EC2 + Railway (Apr 2026 migration)
The Apr 2026 SaaS migration moved tenant + workspace compute from Fly
Machines to AWS EC2, and the control plane from Fly to Railway. Update
docs that still claim Fly Machines as the production provisioner; leave
historical / changelog references intact.

Files changed:

- content/docs/architecture.mdx — SaaS Deployment Modes section now
  states each tenant runs on a dedicated AWS EC2 instance provisioned by
  the Railway-hosted control plane; adds an Apr 2026 migration note +
  link to the molecule-controlplane README "Migration history".
- content/docs/tutorials/fly-machines-provisioner.md — preserved as
  historical PR #501 lineage record; adds a DEPRECATED banner at the top
  pointing to the EC2 provisioner (`internal/provisioner/ec2.go`) and
  the controlplane migration history. Tutorial body kept untouched.
- content/docs/self-hosting/admin-token.mdx — drops the "Fly.io
  (recommended for self-hosted)" framing since the SaaS no longer runs
  on Fly. Adds a Railway example alongside the existing Fly.io example,
  keeps both as illustrative options. Rotation steps generalised to "the
  host's secrets store" (Railway, Fly.io, AWS Secrets Manager all work).
- content/docs/tutorials/saas-federation.md — security model table row
  "No Fly/API tokens on tenant" → "No AWS/cloud API tokens on tenant" so
  it matches the EC2 + Neon architecture the rest of the page already
  describes.

Untouched on purpose:

- content/docs/changelog.mdx Phase 33 "Fly Machines provisioner" entry —
  changelogs are historical record.
- content/docs/architecture/wildcard-dns-proxy.md — only references
  Fly.io as a generic SaaS example, not as Molecule's infrastructure.
- content/docs/security/safe-mcp-advisory.mdx — `fly.toml` is one of
  several illustrative deploy-target egress-lock examples for
  self-hosters.
- content/docs/guides/external-workspace-quickstart.md — lists Fly /
  Railway / DigitalOcean as example self-host targets for external
  workspaces (still valid).

Source of truth: molecule-controlplane README "Migration history"
(https://github.com/Molecule-AI/molecule-controlplane#migration-history).

Verified: `pnpm build` (Next.js 16.2.4) — 111/111 static pages green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 04:48:28 -07:00
Hongming Wang
f768d7d04c
Merge pull request #119 from Molecule-AI/fix/docs-secrets-aes-to-kms-envelope
docs: reframe secret encryption as KMS envelopes (with static-key fallback)
2026-05-04 04:46:59 -07:00
Hongming Wang
b625445357 docs: reframe secret encryption as KMS envelopes (with static-key fallback)
The platform's actual crypto model is two-mode envelope encryption
(workspace-server/internal/crypto/envelope.go):

- KMS mode (production): KMS_KEY_ARN selects an AWS KMS CMK; each
  Encrypt() calls GenerateDataKey for a fresh per-secret DEK, seals
  the payload with AES-256-GCM, stores the KMS-encrypted DEK +
  ciphertext together. CMK rotation is a no-op for existing blobs.

- Static mode (dev / self-host): SECRETS_ENCRYPTION_KEY is a single
  long-lived 32-byte AES-256 key. Cannot rotate without a data
  migration.

Both modes coexist during cutover (v2 prefix byte tags KMS blobs).
The platform refuses to start with neither configured rather than
silently storing plaintext.
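
The mode selection and refuse-to-start behavior can be sketched as follows — a Python stand-in for the Go logic in envelope.go, with all names and the prefix byte value illustrative:

```python
import os

V2_PREFIX = b"\x02"  # illustrative: the v2 prefix byte tagging KMS blobs during cutover

def select_mode(env: dict) -> str:
    """KMS mode wins when configured; static mode is the dev/self-host
    fallback; neither configured is a hard refusal, never plaintext."""
    if env.get("KMS_KEY_ARN"):
        return "kms"     # fresh per-secret DEK via GenerateDataKey, AES-256-GCM seal
    if env.get("SECRETS_ENCRYPTION_KEY"):
        return "static"  # single long-lived 32-byte AES-256 key, no free rotation
    raise RuntimeError("refusing to start: no secrets encryption configured")

def is_kms_blob(blob: bytes) -> bool:
    # Both modes coexist during cutover; the prefix byte disambiguates.
    return blob.startswith(V2_PREFIX)
```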

Previous docs framed this as "AES-256-GCM at the application layer"
and named only SECRETS_ENCRYPTION_KEY, which under-described the
production path and made the KMS migration invisible to readers.

Files updated:
- content/docs/architecture.mdx — env table adds KMS_KEY_ARN, clarifies
  SECRETS_ENCRYPTION_KEY as static-mode/self-host
- content/docs/self-hosting.mdx — env table + Secrets Encryption section
  rewritten to cover both modes; cites envelope.go
- content/docs/security/owasp-agentic-top-10.mdx — A02 control
  description now lists envelope encryption with KMS as production path
- content/docs/development/constraints-and-rules.md — Rule 11 reframes
  storage model as envelope encryption (KMS prod, static dev)
- content/docs/architecture/database-schema.md — workspace_secrets
  description updated to mention envelope encryption + v2 prefix +
  source file pointer
- content/docs/architecture/molecule-technical-doc.md — five touchpoints
  (capability bullet, schema table, codebase tree, env table now
  includes KMS_KEY_ARN, recent-features global API keys row)

No infra/runtime/Nemotron claims touched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-04 04:46:02 -07:00
Hongming Wang
aa46faeb78
Merge pull request #118 from Molecule-AI/docs/runtime-pypi-vs-git-mirror
docs(runtime): document PyPI-canonical / git-mirror-lag asymmetry
2026-05-01 21:23:04 -07:00
Hongming Wang
d6282cb127 docs(runtime): document PyPI-canonical / git-mirror-lag asymmetry
Adds a "Runtime Distribution" section to agent-runtime/workspace-runtime.md
explaining that the PyPI wheel is the canonical artifact and the
molecule-ai-workspace-runtime git mirror may lag behind (or be skipped
entirely on transient publish failures).

Originally drafted as a README change to the mirror itself
(molecule-ai-workspace-runtime#62), but mirror-guard correctly blocks
direct PRs there. The docs surface is the right home for "how the
publish pipeline works" — it's discoverable from the agent-runtime
section and lives next to the existing Boot-Smoke Contract section
that gates publish.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 21:20:00 -07:00
Hongming Wang
86cf6c3a5c
Merge pull request #116 from Molecule-AI/docs/dev-channels-tagged-form
docs: add dev-channels CLI flag tagged-form requirement page
2026-05-01 19:56:18 -07:00
Hongming Wang
dc08033b9f
Merge branch 'main' into docs/dev-channels-tagged-form 2026-05-01 19:53:52 -07:00
Hongming Wang
684ec817bd
Merge pull request #117 from Molecule-AI/docs/a2a-sdk-v0-to-v1-migration
docs: add a2a-sdk v0→v1 migration guide
2026-05-01 19:53:37 -07:00
Hongming Wang
48e691c0e0
Merge pull request #115 from Molecule-AI/docs/fix-quickstart-clone-urls
docs: update clone URLs after molecule-core repo split
2026-05-01 19:53:34 -07:00
Hongming Wang
18bb733b36 docs(migration): add a2a-sdk v0→v1 migration guide 2026-05-01 19:51:18 -07:00
Hongming Wang
a0b3ab9876 docs(runtime-mcp): add dev-channels tagged-form requirement page 2026-05-01 19:28:41 -07:00
Hongming Wang
cfa2fc3d6e docs: update clone URLs after molecule-core repo split (quickstart + self-hosting) 2026-05-01 19:24:55 -07:00
Hongming Wang
e0910a1734
Merge pull request #114 from Molecule-AI/docs/fix-staging-dns-architecture
docs(arch): correct staging DNS — per-tenant CNAME, no wildcard
2026-05-01 19:21:42 -07:00
Hongming Wang
c93388441d docs(arch): rewrite staging DNS section (per-tenant CNAME, not wildcard)
The "Architecture" diagram and section 5 ("Cloudflare: staging
subdomain") both implied that `*.staging.moleculesai.app` is a wildcard
DNS record. It is not. The control plane writes a per-tenant CNAME at
provision time (see `internal/provisioner/ec2.go` ->
`Tunnel.CreateTunnelDNS`), and unknown slugs correctly return NXDOMAIN.

This rewrite:
- Replaces `*.staging.moleculesai.app` / `*.moleculesai.app` in the
  ASCII diagram with `<slug>.<env-domain>` plus a "no wildcard" note.
- Renames section 5 to "Cloudflare: per-tenant CNAMEs (no wildcard)"
  and explains that NXDOMAIN on unknown slugs is correct, that
  `getaddrinfo ENOTFOUND` in tests means the slug is unprovisioned
  (not an infra bug), and that the same model applies to both
  staging and production.
2026-05-01 19:17:57 -07:00
Hongming Wang
651fe00998
Merge pull request #113 from Molecule-AI/docs/runtime-mcp-claude-21-install
Claude Code 2.1+ install: modern -e flags + ~/.claude.json shape
2026-05-01 18:30:06 -07:00
Hongming Wang
1e8f9dc295 docs(runtime-mcp): Claude Code 2.1+ install — modern -e flags + ~/.claude.json shape
Older `claude mcp add ... -- env VAR=val molecule-mcp` snippets land as
`command: "env"` with positional args, which is fragile across 2.1.x
patch builds and unfamiliar to anyone reading `~/.claude.json` directly.
The post-2.1 CLI also rejects the form without a `--` between flags and
command (external feedback in #112).

Replace both snippets (basic install + identity-with-skills) with:
1. Modern CLI form using `-e KEY=VAL` and an explicit `--`.
2. Parallel `~/.claude.json` JSON shape under top-level `mcpServers` for
   user scope (or `.mcp.json` in project root for project scope), so
   users on any 2.1.x patch level have an authoritative reference if
   the CLI form misbehaves.

Add a Troubleshooting entry for the two common 2.1+ CLI rejections, and
fix the broken `[Install](#install)` cross-link in the dev-channels
section to point at `[Step 2](#claude-code)`.
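
The `~/.claude.json` shape described above can be illustrated as a small config builder — `mcpServers` is the key named in this commit, while the server name, command, and env-var name inside it are assumptions for illustration:

```python
import json

# Illustrative user-scope fragment: env vars ride in an "env" map,
# matching the modern `-e KEY=VAL` CLI form rather than a positional
# `command: "env"` entry. Names below are placeholders, not canon.
config = {
    "mcpServers": {
        "molecule": {
            "command": "molecule-mcp",
            "args": [],
            "env": {"MOLECULE_API_TOKEN": "<token>"},
        }
    }
}

print(json.dumps(config, indent=2))
```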

Closes Molecule-AI/docs#112.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 18:22:40 -07:00
Hongming Wang
1408cbae94
Merge pull request #111 from Molecule-AI/docs/channel-envelope-attributes
docs(runtime-mcp): document <channel> envelope attributes table
2026-05-01 17:52:24 -07:00
Hongming Wang
9c28fcf22c docs(runtime-mcp): document channel envelope attributes table
Adds a per-attribute reference table for the <channel> tag (push
path) and the equivalent JSON shape (poll path) so the agent — and
operators reading the docs — know what metadata to expect for every
inbound message.

Covers the new peer_name, peer_role, agent_card_url fields landing
with the wheel-side enrichment in molecule-core#2471, including the
registry-lookup graceful-degrade rule (lookup failure → attrs
absent, push still delivers) and the deterministic agent_card_url
construction (present even on registry outage).

Includes a worked example of a peer_agent push so a reader can copy
the wire shape into their own host bridge if they need to validate
what they're seeing.
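
A host bridge validating this wire shape might parse the push envelope like so — the attribute names come from the description above, but the overall tag shape and values are assumptions:

```python
import xml.etree.ElementTree as ET

# Hypothetical peer_agent push envelope for illustration only.
push = (
    '<channel type="peer_agent" peer_name="planner" peer_role="coordinator" '
    'agent_card_url="https://example.invalid/agents/planner/card">'
    'status update from the planner agent'
    '</channel>'
)

el = ET.fromstring(push)
# Graceful degrade: on registry-lookup failure the peer_* attrs are simply
# absent, so treat them as optional; agent_card_url is constructed
# deterministically and stays present even during a registry outage.
peer_name = el.get("peer_name")       # may be None on registry outage
card_url = el.get("agent_card_url")   # expected even then
```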

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:49:02 -07:00
Hongming Wang
a45a1c65c3
Merge pull request #110 from Molecule-AI/docs/dev-channels-tagged-server-form
docs(runtime-mcp): tagged server:molecule form for --dangerously-load-development-channels
2026-05-01 17:12:35 -07:00
Hongming Wang
49f95877f8 docs(runtime-mcp): tagged server:NAME form for --dangerously-load-development-channels
Claude Code 2.1.x changed the flag's signature to require an allowlist
of tagged entries — `server:<name>` for manually-configured MCP
servers, `plugin:<name>@<marketplace>` for plugin channels. The
previous bare-switch form was rejected with `argument missing`, and
an untagged value returns `entries must be tagged`.

Doc updates:
- Replace bare-flag references with tagged form throughout (env-var
  table, push-path narrative, capability matrix)
- Add a worked example showing
  `claude --dangerously-load-development-channels server:molecule` and the
  `Listening for channel messages from: server:molecule` confirmation
  header so operators have a visible signal that push is live
- Note the multi-server form (space-separated tagged entries)
- Three new troubleshooting entries:
  * `argument missing` → forgot the value
  * `entries must be tagged` → forgot the tag
  * `Control request timeout: initialize` → embedded the flag in an
    SDK options bundle as `{flag: None}` instead of `{flag:
    "server:molecule"}` — covers the workspace-side regression we
    caught live on dd40faf8 on 2026-05-01
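
The allowlist validation the flag now performs can be sketched as a small checker — this is our illustration of the contract, not the CLI's code; only the two error strings are quoted from the behavior above:

```python
def validate_dev_channels(entries: list[str]) -> None:
    """Accepts space-separated tagged entries: server:<name> for manual
    MCP servers, plugin:<name>@<marketplace> for plugin channels."""
    if not entries:
        raise ValueError("argument missing")  # bare flag, no value
    for entry in entries:
        tag, _, rest = entry.partition(":")
        if tag not in ("server", "plugin") or not rest:
            raise ValueError("entries must be tagged")  # untagged value

validate_dev_channels(["server:molecule"])                    # single server
validate_dev_channels(["server:molecule", "plugin:foo@bar"])  # multi-server form
```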

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 17:10:39 -07:00
Hongming Wang
18866f6d7e
Merge pull request #98 from Molecule-AI/docs/claude-code-channel-plugin
docs: Claude Code Channel Plugin guide + cross-references
2026-05-01 16:31:47 -07:00
Hongming Wang
c6494de5ad docs: wrap <ws-id> placeholders in backticks to fix MDX build
Bare <ws-id> in headings emitted a `raw` HAST node that the renderer
couldn't handle ("Cannot handle unknown node `raw`"). Wrapping in
backticks renders as inline code instead of being parsed as JSX/HTML.

Verified locally with `npm run build` (106/106 static pages green).
2026-05-01 16:31:05 -07:00
Hongming Wang
c5421c61f5 docs: fix MDX build — escape <1 + drop Callout JSX
Two errors caught by the docs build CI on PR #98:

1. external-agents.mdx line 15 had `<1 min.` — the literal `<` made the
   MDX parser try to read it as a JSX tag opener. Replaced with prose
   "under a minute" — equivalent meaning, no escape gymnastics.
2. claude-code-channel-plugin.md used `<Callout type="info">` JSX, but
   the rest of /content/docs/guides/ is plain .md (no JSX). The .md
   loader can't resolve the Callout component → "Cannot handle unknown
   node `raw`". Replaced with a `> **Note:**` blockquote — same visual
   hierarchy, plain markdown.

Verified locally that the offending characters are gone; build CI on
push will re-run and should now pass.
2026-05-01 16:31:05 -07:00
Hongming Wang
2ccd4b99d0 docs: add Claude Code Channel Plugin guide + cross-references
Adds /docs/guides/claude-code-channel-plugin.md — the canonical guide
for connecting a Claude Code session as a Molecule external workspace
via the new Molecule-AI/molecule-mcp-claude-channel plugin. Polling-
based; no tunnel required.

Cross-references added:
- content/docs/external-agents.mdx — new "Pick the right path" table
  at top, distinguishing Claude Code (channel plugin) from generic
  HTTP-speaking agents (this page) at first glance.
- content/docs/guides/external-workspace-quickstart.md — short
  redirect callout near the top so laptop-Claude-Code users are
  routed to the channel plugin guide instead of the tunnel-required
  quickstart.

Per molecule-core's #2060 content-routing policy, public-facing docs
live in this repo. The plugin source + README live separately at
github.com/Molecule-AI/molecule-mcp-claude-channel; this guide
duplicates the operator-facing setup steps and adds Molecule-specific
context (how to get workspace_id + token from canvas, how /activity
shows up, troubleshooting against /activity endpoint shape).
2026-05-01 16:30:49 -07:00
Hongming Wang
5f0871707e
Merge pull request #99 from Molecule-AI/feat/marketplace-creator-docs
docs(marketplace): tier overview + creator listing guide
2026-05-01 16:24:48 -07:00
Hongming Wang
b80891b312
Merge branch 'main' into feat/marketplace-creator-docs 2026-05-01 16:24:11 -07:00
Hongming Wang
149c315dfa
Merge pull request #109 from Molecule-AI/feat/universal-push-poll-contract
docs(mcp): rewrite inbound-delivery section for dual push+poll contract
2026-05-01 15:38:17 -07:00
Hongming Wang
b26d7ee9b2 docs(mcp): rewrite inbound-delivery section for dual push+poll contract
Mirrors molecule-core feat/universal-push-via-instructions: documents
the universal poll path (instructions field → every MCP client) plus
the optional push path (notifications/claude/channel for Claude Code
with the dev-channels flag or a future allowlist entry).

Honesty pass on the prior text: previous version claimed push works
"automatically" and "there is no config flag to toggle." That was
true for the wire shape but false for live UX — standard `claude`
launches without --dangerously-load-development-channels silently
drop the notification, and non-Claude clients ignore the Claude-
namespaced method entirely. New text spells out exactly which clients
get push, which get poll, and the per-client capability matrix.
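
That capability matrix reduces to a small table — client labels beyond Claude Code are illustrative:

```python
# Which delivery paths each client class actually gets.
DELIVERY = {
    "claude-code + dev-channels flag": {"push": True,  "poll": True},
    "claude-code (default launch)":    {"push": False, "poll": True},  # notification silently dropped
    "any other MCP client":            {"push": False, "poll": True},  # ignores Claude-namespaced method
}

# The poll path is universal: every client gets it via the instructions field.
assert all(caps["poll"] for caps in DELIVERY.values())
```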

Adds MOLECULE_MCP_POLL_TIMEOUT_SECS to the env-var table — the new
operator knob landed in the wheel.

Successor to #44/#49 (which closed without the flag caveat).
2026-05-01 15:33:07 -07:00
Hongming Wang
1dd9cfaaf3
Merge pull request #108 from Molecule-AI/docs/smoke-gate-caveat-correction
docs(workspace-runtime): correct smoke-gate caveat factual errors
2026-05-01 00:01:09 -07:00
Hongming Wang
28600d7956 docs(workspace-runtime): correct smoke-gate caveat factual errors
Two errors in the merged caveat (#107):

1. Claimed the stub RequestContext "carries an empty user message"
   — actually carries "smoke test" text (smoke_mode.py:76 calls
   `new_text_message("smoke test")`, with the explicit comment
   that it's "enough that extract_message_text(context) returns
   non-empty input"). Adapter authors gating smoke-mode behavior
   on extract_message_text(ctx) == "" would have written a check
   that never fires.

2. Described only the timeout-pass path. The harness also returns
   0 on ANY non-import exception (smoke_mode.py:135-143) — the
   bare `except Exception` block treats RuntimeError, auth errors,
   validation errors etc. as "downstream of the import gate" and
   exits clean. Spelling out all three pass cases (clean return,
   timeout, non-import exception) is the honest description.
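
The three pass cases reduce to a small dispatch — an illustrative sketch, not the real smoke_mode.py; only an import failure fails the gate:

```python
def run_smoke(adapter_main) -> int:
    """Return the harness exit code for one adapter invocation."""
    try:
        adapter_main()       # pass case 1: clean return
    except TimeoutError:
        return 0             # pass case 2: timeout under the harness window
    except ImportError:
        return 1             # the one failure the gate exists to catch
    except Exception:
        return 0             # pass case 3: any non-import exception exits clean
    return 0

def auth_error():
    raise RuntimeError("auth errors are 'downstream of the import gate'")
```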

Caught while re-reading smoke_mode.py to verify claims for a
review pass — found I had asserted both behaviors from memory
without checking, exactly the failure mode my e2e-test memory
just got a worked-example update about.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 00:00:02 -07:00
Hongming Wang
36ab08129c
Merge pull request #107 from Molecule-AI/docs/smoke-gate-timeout-pass-caveat
docs(workspace-runtime): clarify what the boot-smoke gate does NOT prove
2026-04-30 23:56:05 -07:00
Hongming Wang
ec78c7637b docs(workspace-runtime): clarify what the boot-smoke gate does NOT prove
A green gate means imports are healthy enough that
executor.execute() reaches its body. It does NOT prove that
execute() produces the right output: adapters with real I/O
inside execute() (subprocess to a gateway, httpx call upstream)
time out under the 5s harness window, and the gate treats a clean
timeout as success.

Surfaced while running publish-image across all 8 templates: the
openclaw smoke "passed" with timing-out behavior in execute()
because OpenClawA2AExecutor proxies to a subprocess that doesn't
exist in the smoke env. Reading the green check, future operators
might over-trust it as a runtime-correctness signal — it isn't.

Add a "What the gate does NOT prove" subsection so readers don't
mistake the import-regression coverage for an integration test.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 23:54:57 -07:00
Hongming Wang
aa710d9310
Merge pull request #105 from Molecule-AI/design/align-docs-to-landing
feat(docs): align doc.moleculesai.app chrome with landing's warm-paper design
2026-04-30 22:53:36 -07:00
Hongming Wang
050cd70060
Merge pull request #106 from Molecule-AI/docs/smoke-mode-adapter-contract
Document MOLECULE_SMOKE_MODE adapter contract
2026-04-30 22:51:42 -07:00
Hongming Wang
1ccd92e0c8 docs(workspace-runtime): document MOLECULE_SMOKE_MODE adapter contract
The publish-image boot-smoke gate (molecule-core#2275) invokes the
runtime with MOLECULE_SMOKE_MODE=1 + stub creds to catch lazy
imports inside executor.execute(). Adapters whose setup() does
real I/O (subprocess spawn, network calls, uid-sensitive writes)
need to opt out of that I/O when MOLECULE_SMOKE_MODE=1, otherwise
the gate fails before reaching the runtime's smoke short-circuit.

This documents:
- the contract (one-line opt-out for Python adapter.setup() and
  shell entrypoints that wrap molecule-runtime)
- which boot stages the gate exercises
- the stub env the harness sets so adapters can reason about what
  they can rely on under smoke mode

Surfaced when running publish-image across all 8 workspace
templates: openclaw and hermes hit the contract gap because both
spawn real gateway subprocesses in setup; six others passed
without any contract awareness because their setup is light.
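
The one-line opt-out for a Python adapter setup() looks roughly like this — class and method names are illustrative, not the real openclaw/hermes adapters:

```python
import os

class GatewayAdapter:
    """Sketch of an adapter whose setup() does real I/O."""

    def setup(self) -> None:
        if os.environ.get("MOLECULE_SMOKE_MODE") == "1":
            return  # opt out of real I/O: no subprocess spawn, no network calls
        self._spawn_gateway_subprocess()

    def _spawn_gateway_subprocess(self) -> None:
        # Stand-in for the real I/O that would hang or fail under the gate.
        raise RuntimeError("would spawn a real gateway subprocess here")

os.environ["MOLECULE_SMOKE_MODE"] = "1"
GatewayAdapter().setup()  # no-op under smoke mode
```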

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 22:48:53 -07:00
Hongming Wang
403725b970 feat(docs): align doc.moleculesai.app chrome with landing's warm-paper design
Brings the docs site in visual parity with moleculesai.app so docs,
marketing, and the canvas read as one product. Five focused changes
inside the existing fumadocs shell — no MDX or content touched, no
runtime/build dep changes:

- global.css: override fumadocs @theme tokens with the warm-paper
  palette (#fafaf7 bg, #15181c ink, #3b5bdb governance blue,
  #efece4 muted, #e6e2d8 border). Dark mode keeps fumadocs' neutral
  defaults so dark-pref readers still get a readable docs site.

- layout.tsx: swap Inter → Geist (sans) + JetBrains Mono (code),
  matching the landing's font stack. Wired through @theme so
  Tailwind's font-sans / font-mono utilities pick them up.

- layout.config.tsx: brand the topbar — inline Molecule logo SVG +
  "Molecule AI · DOCS" lockup, plus three external links to the rest
  of the surface (Platform → app, Marketplace → market, Landing →
  www) and the org GitHub. Mirrors the landing's collapsed nav.

- (home)/page.tsx: replace the stock fumadocs landing with a
  hero-style page matching the landing — statusbar strip, "Phase 35
  Marketplace public beta" eyebrow, the same shimmering h1 copy,
  three quick-start lane cards (Build a workspace / Run an
  organisation / Publish to the Marketplace) pointing into the docs
  tree.

Build is clean (106 static pages still generate). Existing /docs/*
pages inherit the new tokens via fumadocs' DocsLayout, so the entire
site shifts to the warm-paper aesthetic without touching MDX.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 22:42:42 -07:00
Hongming Wang
03a222d9b5 docs(marketplace): add tier overview + creator listing guide
Two additive pages tied to the new Marketplace surfaces on the
landing page and molecule-app workspace UI:

- content/docs/marketplace.mdx — L1 plugins / L2 agents / L3 bundles
  tier model, trust tiers (Verified / Partner / Community), install
  flow, and workspace.yaml pin examples.
- content/docs/marketplace/creators.mdx — three-step (Build · List ·
  Earn) builder workflow: SDK refs, Creator Portal submission, pricing
  options, policy/safety terms, and maintenance.

Wired into meta.json under a new ---Marketplace--- section between
Troubleshooting and Security.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 18:35:21 -07:00
28 changed files with 1888 additions and 106 deletions

View File

@@ -1,29 +1,100 @@
 import Link from 'next/link';
 
+// Three quick-start lanes — keeps the home page from being a wall of text
+// and lets builders, operators, and integrators each find their entry
+// point in one click.
+const lanes = [
+  {
+    kicker: '01',
+    title: 'Build a workspace',
+    body: 'Pick a runtime template (Claude Code, LangGraph, CrewAI, Hermes, …), wire your tools, and ship.',
+    href: '/docs/workspace',
+    cta: 'Workspace guide →',
+  },
+  {
+    kicker: '02',
+    title: 'Run an organisation',
+    body: 'Topology, A2A, three-tier memory, governance — the platform layer that ties multi-agent teams together.',
+    href: '/docs/platform',
+    cta: 'Platform reference →',
+  },
+  {
+    kicker: '03',
+    title: 'Publish to the Marketplace',
+    body: 'Plugins, agents, and org bundles ship as signed manifests. Authors keep 80%, paid via Stripe Connect.',
+    href: '/docs/marketplace',
+    cta: 'Author guide →',
+  },
+];
+
 export default function HomePage() {
   return (
-    <main className="flex flex-1 flex-col items-center justify-center px-6 py-24 text-center">
-      <h1 className="mb-4 text-5xl font-bold tracking-tight sm:text-6xl">
-        Molecule AI
-      </h1>
-      <p className="mb-8 max-w-2xl text-lg text-fd-muted-foreground">
-        Build and run multi-agent organisations. Templates, plugins, channels,
-        and the runtime that ties them together documented end to end.
-      </p>
-      <div className="flex flex-wrap items-center justify-center gap-3">
-        <Link
-          href="/docs"
-          className="rounded-md bg-fd-primary px-5 py-2.5 text-sm font-medium text-fd-primary-foreground transition-colors hover:opacity-90"
-        >
-          Read the docs
-        </Link>
-        <Link
-          href="https://github.com/Molecule-AI/molecule-monorepo"
-          className="rounded-md border border-fd-border px-5 py-2.5 text-sm font-medium transition-colors hover:bg-fd-muted"
-        >
-          View on GitHub
-        </Link>
-      </div>
+    <main className="flex flex-1 flex-col">
+      {/* Statusbar — mirrors the landing's "All systems · status.* · phase" strip */}
+      <div className="border-b border-fd-border bg-fd-muted px-6 py-1.5 text-[11px] font-mono text-fd-muted-foreground flex flex-wrap justify-between gap-4">
+        <span>
+          <span className="inline-block size-1.5 rounded-full bg-[#2f7a4d] align-middle mr-1.5" />
+          All systems · status.moleculesai.app
+        </span>
+        <span>Phase 33 shipped · Phase 35 Marketplace public beta</span>
+      </div>
+
+      {/* Hero */}
+      <section className="px-6 py-20 sm:py-28 max-w-6xl mx-auto w-full">
+        <div className="text-[11px] font-mono uppercase tracking-[0.08em] text-fd-muted-foreground mb-4 flex items-center gap-2">
+          <span className="inline-block size-1.5 rounded-full bg-[#c0532b]" />
+          Documentation
+        </div>
+        <h1 className="text-5xl sm:text-6xl font-semibold tracking-tight leading-[1.05] mb-5 max-w-3xl">
+          The operating system for{' '}
+          <span className="text-[#3b5bdb]">AI agent organizations.</span>
+        </h1>
+        <p className="text-lg text-fd-muted-foreground max-w-2xl leading-relaxed mb-8">
+          Build and run multi-agent organisations the way you'd staff a company.
+          Templates, plugins, channels, runtimes, governance — documented end
+          to end.
+        </p>
+        <div className="flex flex-wrap items-center gap-3">
+          <Link
+            href="/docs"
+            className="rounded-md bg-fd-primary px-5 py-2.5 text-sm font-medium text-fd-primary-foreground transition hover:opacity-90"
+          >
+            Read the docs
+          </Link>
+          <Link
+            href="https://github.com/Molecule-AI"
+            target="_blank"
+            rel="noopener noreferrer"
+            className="rounded-md border border-fd-border px-5 py-2.5 text-sm font-medium transition hover:bg-fd-muted"
+          >
+            View on GitHub
+          </Link>
+        </div>
+      </section>
+
+      {/* Three lanes */}
+      <section className="px-6 pb-24 max-w-6xl mx-auto w-full">
+        <div className="grid grid-cols-1 md:grid-cols-3 gap-4">
+          {lanes.map((lane) => (
+            <Link
+              key={lane.kicker}
+              href={lane.href}
+              className="group rounded-lg border border-fd-border bg-fd-card p-6 transition hover:border-fd-foreground hover:-translate-y-0.5"
+            >
+              <div className="text-[11px] font-mono text-[#3b5bdb] mb-3 tracking-[0.08em]">
+                {lane.kicker}
+              </div>
+              <h3 className="text-base font-semibold mb-2">{lane.title}</h3>
+              <p className="text-sm text-fd-muted-foreground leading-relaxed mb-4">
+                {lane.body}
+              </p>
+              <div className="text-xs font-mono text-fd-foreground group-hover:text-[#3b5bdb] transition">
+                {lane.cta}
+              </div>
+            </Link>
+          ))}
+        </div>
+      </section>
     </main>
   );
 }

View File

@@ -1,3 +1,32 @@
 @import 'tailwindcss';
 @import 'fumadocs-ui/css/neutral.css';
 @import 'fumadocs-ui/css/preset.css';
+
+/* Warm-paper light theme aligned with the landing page (moleculesai.app).
+   Tokens map fumadocs' @theme variables onto our brand palette so docs,
+   marketing, and the canvas read as one product. */
+@theme {
+  --font-sans: var(--font-geist), ui-sans-serif, system-ui, sans-serif;
+  --font-mono: var(--font-mono), ui-monospace, SFMono-Regular, monospace;
+  --color-fd-background: #fafaf7;
+  --color-fd-foreground: #15181c;
+  --color-fd-muted: #f3f1ec;
+  --color-fd-muted-foreground: #5a5e66;
+  --color-fd-popover: #ffffff;
+  --color-fd-popover-foreground: #15181c;
+  --color-fd-card: #ffffff;
+  --color-fd-card-foreground: #15181c;
+  --color-fd-border: #e6e2d8;
+  --color-fd-primary: #3b5bdb;
+  --color-fd-primary-foreground: #ffffff;
+  --color-fd-secondary: #efece4;
+  --color-fd-secondary-foreground: #15181c;
+  --color-fd-accent: #efece4;
+  --color-fd-accent-foreground: #15181c;
+  --color-fd-ring: #3b5bdb;
+  --color-fd-overlay: hsla(0, 0%, 0%, 0.18);
+}
+
+/* Dark mode keeps fumadocs' neutral defaults — readers expect docs sites
+   to honor their system preference, and our landing only ships light. */

View File

@ -1,7 +1,50 @@
import type { BaseLayoutProps } from 'fumadocs-ui/layouts/shared'; import type { BaseLayoutProps } from 'fumadocs-ui/layouts/shared';
// Molecule logo — the same triangle-of-nodes mark used on moleculesai.app.
// Inlined as a JSX element so fumadocs renders it in the topbar without a
// separate asset request.
const MoleculeLogo = (
<svg
width="22"
height="22"
viewBox="0 0 28 28"
fill="none"
aria-hidden="true"
>
<circle cx="14" cy="6" r="2.5" fill="currentColor" />
<circle cx="6" cy="20" r="2.5" fill="currentColor" />
<circle cx="22" cy="20" r="2.5" fill="currentColor" />
<circle
cx="14"
cy="14"
r="1.6"
fill="none"
stroke="currentColor"
strokeWidth="1.2"
/>
<line x1="14" y1="8.5" x2="14" y2="12.6" stroke="currentColor" strokeWidth="1.2" />
<line x1="8" y1="18.5" x2="12.7" y2="14.8" stroke="currentColor" strokeWidth="1.2" />
<line x1="20" y1="18.5" x2="15.3" y2="14.8" stroke="currentColor" strokeWidth="1.2" />
</svg>
);
export const baseOptions: BaseLayoutProps = {
nav: {
title: (
<span className="flex items-center gap-2 font-semibold tracking-tight">
{MoleculeLogo}
<span>Molecule AI</span>
<span className="text-xs uppercase tracking-[0.08em] text-fd-muted-foreground font-mono">
Docs
</span>
</span>
),
url: 'https://doc.moleculesai.app',
},
links: [
{ text: 'Platform', url: 'https://app.moleculesai.app', external: true },
{ text: 'Marketplace', url: 'https://market.moleculesai.app', external: true },
{ text: 'Landing', url: 'https://www.moleculesai.app', external: true },
],
githubUrl: 'https://github.com/Molecule-AI',
};

View File

@@ -1,10 +1,16 @@
import './global.css';
import { RootProvider } from 'fumadocs-ui/provider/next';
import { Geist, JetBrains_Mono } from 'next/font/google';
import type { ReactNode } from 'react';
const geist = Geist({
subsets: ['latin'],
variable: '--font-geist',
});
const jetbrains = JetBrains_Mono({
subsets: ['latin'],
variable: '--font-mono',
});
export const metadata = {
@@ -19,8 +25,12 @@ export const metadata = {
export default function Layout({ children }: { children: ReactNode }) {
return (
<html
lang="en"
className={`${geist.variable} ${jetbrains.variable}`}
suppressHydrationWarning
>
<body className="flex flex-col min-h-screen font-sans">
<RootProvider>{children}</RootProvider>
</body>
</html>

View File

@@ -9,16 +9,18 @@ The `workspace/` directory is Molecule AI's unified runtime image. Every provisi
## Runtime Matrix In Current `main`
Current `main` ships eight adapters:
- `claude-code`
- `langgraph`
- `crewai`
- `autogen`
- `deepagents`
- `hermes`
- `gemini-cli`
- `openclaw`
This is the merged runtime surface today. The canonical allowlist lives in `workspace-server/internal/handlers/admin_workspace_images.go` (`AllRuntimes`). Anything not on this list — including BYO runtimes — registers via the [external workspace](../external-agents.mdx) path, not as a built-in adapter.
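The allowlist check can be sketched in a few lines. This is a hypothetical Python mirror for illustration only — the canonical list is the Go `AllRuntimes` slice in `workspace-server/internal/handlers/admin_workspace_images.go`, and the function name here is an assumption, not platform API:

```python
# Hypothetical mirror of the AllRuntimes allowlist -- the canonical source
# is workspace-server/internal/handlers/admin_workspace_images.go.
ALL_RUNTIMES = {
    "claude-code", "langgraph", "crewai", "autogen",
    "deepagents", "hermes", "gemini-cli", "openclaw",
}

def is_builtin_runtime(runtime: str) -> bool:
    """True when the runtime ships as a built-in adapter on current main."""
    return runtime in ALL_RUNTIMES

# Anything outside the set (e.g. a BYO runtime) takes the
# external-workspace registration path instead.
```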
Adapter-specific behavior is documented in [Agent Runtime Adapters](./cli-runtime.md).
@@ -73,6 +75,94 @@ At a high level, `workspace/main.py` does this:
10. Start the skill watcher when skills are configured.
11. Serve the A2A app through Uvicorn.
## Boot-Smoke Contract (`MOLECULE_SMOKE_MODE`)
The image-publish CI pipeline runs each template's image with `MOLECULE_SMOKE_MODE=1` to exercise lazy imports inside `executor.execute()` against stub credentials and no network. The runtime detects the env var, invokes `executor.execute()` once with a stubbed `RequestContext` and a short timeout, then exits — registration, heartbeats, and the A2A server are skipped.
This catches lazy imports that pure `python3 -c "import adapter"` smokes miss: imports nested inside `if`-branches, deferred until first call, or behind `importlib.import_module()`.
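A minimal sketch of the regression class in question — a deferred import that a top-level `import adapter` cannot reach, but an actual `execute()` call does. The module and env-var names below are illustrative, not taken from any real adapter:

```python
# adapter.py (illustrative) -- a lazy import that a plain
# `python3 -c "import adapter"` smoke passes over silently.
import importlib
import os

class Executor:
    async def execute(self, ctx, event_queue):
        if os.environ.get("USE_FAST_PATH") == "1":
            # Imported only on first call: if this dependency breaks after a
            # wheel bump, the static-import smoke still passes, but the
            # MOLECULE_SMOKE_MODE run reaches this line and fails.
            fastpath = importlib.import_module("fastpath_client")
            return fastpath.run(ctx)
        return None
```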
### What adapter authors need to do
**Most adapters need to do nothing.** If `setup()` only writes files, parses config, or instantiates Python objects, the smoke gate just works.
**Adapters whose `setup()` does real I/O must opt out of that I/O under smoke mode.** This applies to:
- spawning subprocesses that require valid credentials (e.g. a gateway daemon)
- making real network calls
- writing to filesystem locations that need a specific uid/gid the smoke harness can't guarantee
The contract:
```python
async def setup(self, config: AdapterConfig) -> None:
if os.environ.get("MOLECULE_SMOKE_MODE") == "1":
return # skip real I/O; runtime's smoke short-circuit handles the rest
# ... real setup ...
```
For shell entrypoints that wrap `molecule-runtime`:
```bash
if [ "${MOLECULE_SMOKE_MODE:-0}" = "1" ]; then
exec molecule-runtime
fi
```
### What gets exercised under smoke mode
- All `/app/*.py` modules import cleanly (covered by a separate static-import smoke step)
- `adapter.setup()` runs (with the opt-out above for I/O-heavy adapters)
- `adapter.create_executor()` runs
- `executor.execute()` is invoked once against a stub `RequestContext`/`EventQueue` with `MOLECULE_SMOKE_TIMEOUT_SECS` (default 5s); a clean timeout exits 0, an import error exits non-zero
### What the gate does NOT prove
A green gate means **"imports are healthy enough that `executor.execute()` reaches its body"** — that's the regression class the gate exists to catch (lazy `from x import y` inside an `if`-branch, or `importlib.import_module()` on a path that breaks after a wheel bump).
It does **not** prove that `execute()` produces the right output for real input. The harness reports PASS in three distinct cases:
1. **Clean return** — execute() ran to completion within the timeout.
2. **Timeout** — execute() was still running when the timer fired (typical for adapters that do real I/O inside execute(): subprocess to a gateway, httpx call to an upstream LLM).
3. **Any non-import exception** — execute() raised `RuntimeError`, auth errors, validation errors, etc. The harness only fails on `ImportError`/`ModuleNotFoundError`.
The stub `RequestContext` carries a non-empty `"smoke test"` text message (so adapters relying on `extract_message_text(ctx)` returning input still work), and the harness never drains the `EventQueue` — what `execute()` writes back is ignored.
If you need correctness coverage, write a separate integration test that runs the workspace against real or mocked infrastructure — the smoke gate is a strict subset.
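The three PASS cases and the single FAIL case can be modeled as a small classifier. This is a sketch of the decision rule described above, not the harness's actual code — all names are assumptions:

```python
# Hypothetical classifier mirroring the gate's rules: clean return,
# timeout, and any non-import exception all PASS; only
# ImportError / ModuleNotFoundError FAIL.
import asyncio

async def classify(execute_coro, timeout_secs: float) -> str:
    try:
        await asyncio.wait_for(execute_coro, timeout=timeout_secs)
        return "PASS (clean return)"
    except asyncio.TimeoutError:
        return "PASS (still running -- execute was doing real work)"
    except (ImportError, ModuleNotFoundError):
        return "FAIL (broken import reached at call time)"
    except Exception:
        return "PASS (non-import exception: auth, validation, ...)"

async def raises_runtime_error():
    raise RuntimeError("auth failed")

print(asyncio.run(classify(raises_runtime_error(), 5.0)))
# PASS (non-import exception: auth, validation, ...)
```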
### Stub env the smoke harness sets
| Var | Value |
|---|---|
| `MOLECULE_SMOKE_MODE` | `1` |
| `MOLECULE_SMOKE_TIMEOUT_SECS` | `10` (CI default) |
| `WORKSPACE_ID` | `fake-smoke` |
| `PYTHONPATH` | `/app` (mirrors the platform provisioner) |
| `CLAUDE_CODE_OAUTH_TOKEN`, `ANTHROPIC_API_KEY`, `GEMINI_API_KEY`, `OPENAI_API_KEY` | `sk-fake-smoke-*` |
A `config.yaml` from the template repo's root is mounted at `/configs/config.yaml`.
## Runtime Distribution: PyPI Is Canonical, The Git Mirror May Lag
The runtime ships as **two surfaces**, and only one of them is wire-truth.
| Surface | Repo / location | Role |
|---|---|---|
| **PyPI wheel** | `pip install molecule-ai-workspace-runtime==X.Y.Z` | **Canonical artifact.** Workspace template images, the controlplane runtime smoke harness, and self-hosters all consume this. |
| **Git mirror** | [`Molecule-AI/molecule-ai-workspace-runtime`](https://github.com/Molecule-AI/molecule-ai-workspace-runtime) | **Human-readable copy.** Exists for browsing + giving `mirror-guard` a concrete branch to enforce its "no direct PRs" policy against. |
Both are produced by the [`publish-runtime.yml`](https://github.com/Molecule-AI/molecule-monorepo/blob/main/.github/workflows/publish-runtime.yml) workflow on every push to `molecule-monorepo/workspace/`, but **the wheel publish and the mirror push are separate steps**. The mirror push can lag the wheel by hours, or be skipped entirely on transient failures while the wheel still ships.
If you're chasing "is module X in the published runtime yet?", trust the wheel listing, not the mirror's `git log`:
```bash
pip download molecule-ai-workspace-runtime==X.Y.Z --no-deps
unzip -l molecule_ai_workspace_runtime-X.Y.Z-*.whl | grep your_module
```
To find out what version the controlplane is actually deploying, check the workspace template image's `requirements.txt` pin (it's a `>=`, so the resolved version is whatever PyPI hands back at image-build time — not whatever's in the mirror).
**Do not edit the git mirror directly.** `mirror-guard` rejects all PRs to `molecule-ai-workspace-runtime`. Edit `molecule-monorepo/workspace/` and let `publish-runtime.yml` regenerate both surfaces.
## Core Runtime Pieces
| File | Responsibility |

View File

@@ -77,7 +77,8 @@ The Platform is the central control plane responsible for:
| `REDIS_URL` | (required) | Redis connection string |
| `PORT` | `8080` | Server listen port |
| `PLATFORM_URL` | `http://host.docker.internal:PORT` | URL passed to agent containers |
| `SECRETS_ENCRYPTION_KEY` | (optional) | AES-256 key, 32 bytes — static-mode envelope encryption (dev/self-host). Set `KMS_KEY_ARN` instead for production AWS KMS envelopes. |
| `KMS_KEY_ARN` | (optional) | AWS KMS CMK ARN — production envelope encryption with per-secret data keys |
| `CORS_ORIGINS` | `http://localhost:3000,http://localhost:3001` | Allowed CORS origins |
| `RATE_LIMIT` | `600` | Requests per minute |
| `MOLECULE_ENV` | (optional) | Set `production` to hide test endpoints |
@@ -295,7 +296,9 @@ docker compose up
### SaaS
Hosted at `moleculesai.app` with per-tenant isolation. Each tenant gets a dedicated AWS EC2 instance running the tenant image, provisioned by the control plane (`api.moleculesai.app`, hosted on Railway). The `MOLECULE_ORG_ID` env var gates API access -- every non-allowlisted request must carry a matching `X-Molecule-Org-Id` header or gets a 404. When unset, the guard is a passthrough so self-hosted and dev environments are unaffected.
> **Migration note (Apr 2026):** SaaS infrastructure was migrated from Fly Machines to AWS EC2 (workspaces) + Railway (control plane). See the [`molecule-controlplane` README "Migration history"](https://github.com/Molecule-AI/molecule-controlplane#migration-history) for the canonical record.
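The guard's decision table is small enough to model directly. A minimal sketch of the behavior described above (the real guard is middleware in the tenant image; `org_guard` and its return convention are assumptions for illustration):

```python
# Sketch of the MOLECULE_ORG_ID gate: returns the HTTP status the guard
# would produce -- 200 for pass-through/match, 404 for reject.
from typing import Optional

def org_guard(configured_org_id: Optional[str],
              header_org_id: Optional[str]) -> int:
    if not configured_org_id:
        return 200  # MOLECULE_ORG_ID unset: passthrough (dev / self-host)
    if header_org_id == configured_org_id:
        return 200  # matching X-Molecule-Org-Id header
    return 404      # missing or mismatched header

assert org_guard(None, None) == 200          # self-host unaffected
assert org_guard("org-123", "org-123") == 200
assert org_guard("org-123", None) == 404
```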
### Tenant Image

View File

@@ -80,7 +80,7 @@ CREATE TABLE workspace_secrets (
);
```
Stores API keys, credentials, and other secrets needed by workspace agents. Values are sealed with envelope encryption — AWS KMS (per-secret data keys via `GenerateDataKey`, AES-256-GCM payload) when `KMS_KEY_ARN` is configured (production), or static-key AES-256-GCM under `SECRETS_ENCRYPTION_KEY` (dev / self-host). The static key, when used, is read from the platform environment and is never stored in the database. Both modes coexist during a KMS cutover, distinguished by a v2 prefix byte on KMS blobs. Implementation: `workspace-server/internal/crypto/envelope.go`.
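The mode dispatch on a stored blob can be sketched as a one-byte check. The prefix value and layout below are assumptions for illustration — the authoritative logic lives in `workspace-server/internal/crypto/envelope.go`:

```python
# Illustrative dispatch: KMS envelopes carry a version-prefix byte,
# legacy static-key blobs do not. 0x02 is an assumed marker, not the
# actual value used by envelope.go.
KMS_V2_PREFIX = 0x02

def blob_mode(encrypted_value: bytes) -> str:
    """Decide which decrypt path a workspace_secrets blob takes."""
    if encrypted_value and encrypted_value[0] == KMS_V2_PREFIX:
        return "kms-envelope"  # unwrap DEK via KMS, then AES-256-GCM
    return "static-key"        # legacy AES-256-GCM under SECRETS_ENCRYPTION_KEY
```

Because the decision is per-blob, old static-key secrets keep decrypting during a cutover while new writes go through KMS.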
The provisioner reads secrets from this table, decrypts them, and injects them as environment variables when spinning up workspace containers. Secrets are never included in bundles (see [Constraints — Rule 5](../development/constraints-and-rules.md)).

View File

@@ -149,7 +149,7 @@ Six runtime adapters ship production-ready on `main`: LangGraph, DeepAgents, Cla
- Event broadcasting (Redis pub/sub → WebSocket fanout)
- Docker provisioner with T1–T4 tier enforcement
- Activity logging with configurable retention (default 7 days)
- Secrets management (KMS-envelope encryption in prod; AES-256-GCM static-key mode for dev/self-host)
- File, terminal, bundle, template, traces APIs
- Langfuse integration
- Prometheus metrics endpoint
@@ -186,7 +186,7 @@ Six runtime adapters ship production-ready on `main`: LangGraph, DeepAgents, Cla
|-------|---------|-------------|
| `workspaces` | Current state registry | `id`, `name`, `role`, `tier` (1-4), `status`, `parent_id`, `agent_card` (JSONB), `url`, `forwarded_to`, `last_heartbeat_at`, `last_error_rate`, `active_tasks`, `uptime_seconds`, `current_task`, `runtime` |
| `agents` | Agent assignment history | `workspace_id`, `model`, `status`, `removed_at`, `removal_reason` |
| `workspace_secrets` | Encrypted credentials | `workspace_id`, `key`, `encrypted_value` (BYTEA — KMS envelope in prod, AES-256-GCM static-key blob in dev/self-host) |
| `agent_memories` | HMA-scoped memory | `workspace_id`, `content`, `scope` (LOCAL/TEAM/GLOBAL) |
| `structure_events` | **Immutable** event log (APPEND-ONLY, never UPDATE/DELETE) | `event_type`, `workspace_id`, `agent_id`, `target_id`, `payload` (JSONB) |
| `activity_logs` | Operational activity with retention | `workspace_id`, `activity_type`, `source_id`, `target_id`, `method`, `request_body`, `response_body`, `duration_ms`, `status`, `error_detail` |
@@ -573,14 +573,16 @@ compliance:
| Adapter | Core Strength | Image Tag |
|---------|--------------|-----------|
| **Claude Code** | Native coding workflows, CLI continuity, OAuth auth | `workspace-template:claude-code` |
| **LangGraph** | Graph-based state machine, tool use, streaming | `workspace-template:langgraph` |
| **CrewAI** | Role-based crews, structured task orchestration | `workspace-template:crewai` |
| **AutoGen** | Multi-agent conversations, explicit strategies | `workspace-template:autogen` |
| **DeepAgents** | Deep planning, multi-step task decomposition | `workspace-template:deepagents` |
| **Hermes** | Multi-provider dispatch (Anthropic/Gemini native + OpenAI-compatible shim) | `workspace-template:hermes` |
| **Gemini CLI** | Google Gemini CLI workspace | `workspace-template:gemini-cli` |
| **OpenClaw** | CLI-native runtime, own session model | `workspace-template:openclaw` |
The canonical allowlist lives in `workspace-server/internal/handlers/admin_workspace_images.go` (`AllRuntimes`). Anything outside this list registers via the external-workspace path.
Each adapter implements `setup()` + `create_executor()`. The base adapter provides shared infrastructure: system prompt assembly, skill loading, tool registration, coordinator detection, plugin injection.
@@ -822,7 +824,7 @@ workspace-server/
│ ├── events/ # 3 files — event broadcasting + Postgres persistence
│ ├── router/ # 2 files — route definitions + middleware
│ ├── db/ # 6 files — Postgres + Redis drivers, migrations
│ └── crypto/ # 2 files — envelope encryption (KMS or AES-256-GCM static key)
└── migrations/ # 11 SQL migration files
```
@@ -905,7 +907,8 @@ Postgres + Redis + Langfuse only (for local development without containerized wo
| `REDIS_URL` | `redis://localhost:6379` | Redis connection |
| `PORT` | `8080` | Platform listen port |
| `PLATFORM_URL` | `http://host.docker.internal:8080` | Injected to workspace containers |
| `SECRETS_ENCRYPTION_KEY` | Optional | AES-256 key (32 bytes) for static-mode secret encryption — used when `KMS_KEY_ARN` is unset (dev/self-host) or to decrypt legacy blobs during a KMS cutover |
| `KMS_KEY_ARN` | Optional | AWS KMS CMK ARN — when set, secrets use KMS envelope encryption (per-secret data keys via `GenerateDataKey`); production deployments use this path |
| `CONFIGS_DIR` | `/configs` | Workspace config template directory |
| `PLUGINS_DIR` | `/plugins` | Shared plugin directory |
| `ACTIVITY_RETENTION_DAYS` | `7` | Activity log retention |
@@ -949,7 +952,7 @@ Postgres + Redis + Langfuse only (for local development without containerized wo
|---------|-------------|
| **A2A streaming response** | Real-time task result delivery via SSE (`message/sendSubscribe`) |
| **Onboarding wizard** | 4-step guided first-run experience in Canvas |
| **Global API keys** | Platform-wide secrets with per-workspace override; KMS envelope encryption in prod (AES-256-GCM static-key mode in dev/self-host) |
| **Coordinator enforcement** | Team leads cannot do work, only route and aggregate |
| **Cascade pause/resume** | Pausing a parent cascades to all children; paused children can't be individually resumed |
| **Graceful A2A errors** | `[A2A_ERROR]` sentinel + retry with exponential backoff + fallback |
@@ -978,7 +981,6 @@ Tools call `resp.json()` without catching JSON decode errors. Should wrap in try
| Branch | Feature | Status |
|--------|---------|--------|
| Backlog | Firecracker backend (faster cold starts) | Planned |
| Backlog | E2B backend (cloud-hosted code sandbox) | Planned |
| Backlog | pgvector semantic memory search | Planned |

View File

@@ -20,7 +20,7 @@ Canvas (Next.js :3000) ←WebSocket→ Platform (Go :8080) ←HTTP→ Postgres +
- **Workspace Server** (`workspace-server/`): Go/Gin control plane — workspace CRUD, registry, discovery, WebSocket hub, liveness monitoring.
- **Canvas** (`canvas/`): Next.js 15 + React Flow (@xyflow/react v12) + Zustand + Tailwind — visual workspace graph.
- **Workspace Runtime** (`workspace/`): Shared runtime published as [`molecule-ai-workspace-runtime`](https://pypi.org/project/molecule-ai-workspace-runtime/) on PyPI. Supports Claude Code, LangGraph, CrewAI, AutoGen, DeepAgents, Hermes, Gemini CLI, and OpenClaw. Each adapter lives in its own standalone template repo (e.g. `molecule-ai-workspace-template-claude-code`). See `docs/workspace-runtime-package.md` for the full picture.
- **molecli** (`workspace-server/cmd/cli/`): Go TUI dashboard (Bubbletea + Lipgloss) — real-time workspace monitoring, event log, health overview, delete/filter operations.
## Key Architectural Patterns

View File

@@ -27,7 +27,8 @@ CP (Railway): staging service production service
staging.api.moleculesai.app api.moleculesai.app
Tenant EC2s: staging EC2 instances production EC2 instances
<slug>.staging.moleculesai.app <slug>.moleculesai.app
(per-tenant CNAME, no wildcard) (per-tenant CNAME, no wildcard)
App (Vercel): staging.app.moleculesai.app app.moleculesai.app
(Vercel preview) (Vercel production)
@@ -38,8 +39,10 @@ DB (Neon): staging branch main branch
Docker images: platform-tenant:staging platform-tenant:latest
(GHCR) (GHCR)
Cloudflare: per-tenant CNAMEs under per-tenant CNAMEs under
staging.moleculesai.app moleculesai.app
(one CNAME + one tunnel (one CNAME + one tunnel
per provisioned tenant) per provisioned tenant)
```
## Deploy flow
@@ -115,15 +118,35 @@ platform-tenant:sha-xxxxx — immutable, pinned to specific commit
# pushes :latest only on manual promote
```
### 5. Cloudflare: per-tenant CNAMEs (no wildcard)
There is **no `*.staging.moleculesai.app` wildcard record** and there is no
`*.moleculesai.app` wildcard either. The control plane writes a per-tenant
CNAME at provision time, pointing `<slug>.<env-domain>` at that tenant's
Cloudflare tunnel (`<tunnel-id>.cfargotunnel.com`).
Verified in `molecule-controlplane/internal/provisioner/ec2.go` — the
provisioner calls `Tunnel.CreateTunnelDNS(ctx, slug, domain, tunnelID)`
during workspace provision, then records a `cf_dns` row in
`tenant_resources` with `type=CNAME` for symmetric create/delete audit.
Implications for staging:
- Staging tenants get `<slug>.staging.moleculesai.app` only **after** they
are provisioned through the staging control plane. The CNAME is
written as part of `Provision()`.
- Production tenants get `<slug>.moleculesai.app` the same way, against
the production CP.
- Pre-provision, an unknown slug returns **NXDOMAIN**. This is correct
behavior, not a regression — there is no wildcard to catch the lookup.
- Tests that hit a staging slug they have not provisioned themselves
will fail with `getaddrinfo ENOTFOUND` (Node) or `Name or service not
known` (curl). The fix is to provision your own slug against the
staging CP first; do not file this as an infrastructure bug.
The same model applies to both environments — the only difference is
the parent zone (`staging.moleculesai.app` vs `moleculesai.app`) and the
CP that writes the records.
### 6. EC2: staging tag

View File

@@ -59,7 +59,7 @@ Direct A2A calls between workspaces are unauthenticated in MVP. Access control i
## 11. Secrets in Postgres, Encrypted
Workspace secrets (API keys, credentials) are stored in Postgres under envelope encryption. Production deployments use AWS KMS (`KMS_KEY_ARN`): each secret gets a fresh data key via `GenerateDataKey`, the payload is sealed with AES-256-GCM, and the KMS-encrypted DEK is stored alongside the ciphertext — rotating the CMK is a no-op for existing blobs. Dev and self-host deployments fall back to static-key AES-256-GCM under `SECRETS_ENCRYPTION_KEY`. Secrets are never included in bundles, never logged, never exposed via API responses.
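The envelope layout — a KMS-sealed DEK stored next to the ciphertext — is why CMK rotation is a no-op: decryption asks KMS to unseal the stored DEK rather than deriving anything from the current key version. A minimal model with a mock KMS and a toy cipher (XOR stands in for AES-256-GCM purely for shape; every name here is an assumption, not the `envelope.go` API):

```python
# Minimal model of envelope encryption: per-secret data key, sealed DEK
# stored alongside the ciphertext. Toy XOR "cipher" -- NOT real crypto.
from dataclasses import dataclass
import os

@dataclass
class EnvelopeBlob:
    encrypted_dek: bytes  # DEK sealed by the KMS CMK (GenerateDataKey output)
    ciphertext: bytes     # payload sealed with the plaintext DEK

class MockKMS:
    """Stands in for AWS KMS: GenerateDataKey returns (plaintext, sealed)."""
    def __init__(self):
        self._store = {}
    def generate_data_key(self):
        plaintext = os.urandom(32)
        sealed = os.urandom(16)          # opaque handle to the sealed DEK
        self._store[sealed] = plaintext  # KMS can unseal it later regardless
        return plaintext, sealed         # of CMK rotation
    def decrypt(self, sealed):
        return self._store[sealed]

def seal(kms, secret: bytes) -> EnvelopeBlob:
    dek, sealed_dek = kms.generate_data_key()   # fresh DEK per secret
    ct = bytes(b ^ k for b, k in zip(secret, dek))
    return EnvelopeBlob(sealed_dek, ct)

def open_blob(kms, blob: EnvelopeBlob) -> bytes:
    dek = kms.decrypt(blob.encrypted_dek)
    return bytes(b ^ k for b, k in zip(blob.ciphertext, dek))

kms = MockKMS()
blob = seal(kms, b"sk-live-example")
assert open_blob(kms, blob) == b"sk-live-example"
```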
## 12. Last-Write-Wins for MVP

View File

@@ -21,6 +21,17 @@ register and heartbeat by hand. Use it when your agent can't run an MCP
stdio server.
</Callout>
## Pick the right path
| Your agent runs as | Best path | Why |
|---|---|---|
| **An MCP-aware runtime** (Claude Code, Hermes, OpenCode, Cursor, Cline) | [Bring Your Own Runtime (MCP)](/docs/runtime-mcp) | Universal `molecule-mcp` wheel — no HTTP server, no tunnel. |
| **A Claude Code session on your laptop** | [Claude Code Channel Plugin](/docs/guides/claude-code-channel-plugin) | Polling-based; no tunnel/public URL needed. Set up in under a minute. |
| Any HTTP server with a public URL | The flow on this page (or the [Python SDK guide](/docs/guides/external-agent-registration)) | Push-based; lower latency; works for any A2A-compatible HTTP endpoint. |
| A custom A2A server you wrote yourself | The flow on this page | Direct register + heartbeat + handler. |
The rest of this doc covers the third + fourth rows. For Claude Code or other MCP runtimes, follow the linked guides.
## Prerequisites
- A running Molecule AI platform (default `http://localhost:8080`)

View File

@@ -26,7 +26,7 @@ lands in the watch list with a colliding term, add a row here.
| **team** | A named cluster of workspaces under a PM (org template `expand_team`). Used for role grouping in Canvas. | **CrewAI**: a "crew" is a sequence of agents that pass a task through a declared order. Our "team" is an org-chart abstraction, not an execution order. |
| **skill** | A directory with `SKILL.md` that an agent invokes via the `Skill` tool. Skills are documentation + optional scripts that teach an agent a recipe. | **Anthropic Skills API**: nearly identical. **CrewAI tool**: closer to our plugin's MCP tool, not our skill. |
| **channel** | An outbound/inbound social integration (Telegram, Slack, …) per-workspace, wired in `workspace_channels`. | Slack's "channel": the container for messages. We use "channel" for the adapter + credentials, not the conversation itself. |
| **runtime** | The execution engine image tag for a workspace: one of `claude-code`, `langgraph`, `crewai`, `autogen`, `deepagents`, `hermes`, `gemini-cli`, `openclaw`. | **LangGraph runtime**: the Python process running the graph. We use "runtime" for the Docker image + adapter pairing, not the inner process. |
## GitHub Awesome Copilot disambiguation

View File

@@ -0,0 +1,222 @@
---
title: "Claude Code Channel Plugin — Connect a Claude Code Session as an External Workspace"
description: "Bridge Molecule A2A traffic into a running Claude Code session via MCP. Polling-based, no tunnel required. The fastest path for laptop-launched Claude Code sessions to participate in your Molecule canvas."
---
# Claude Code Channel Plugin
Run [Claude Code](https://claude.com/claude-code) on your laptop and have it appear on the Molecule AI canvas as a first-class external workspace. Inbound A2A messages from peer workspaces surface as conversation turns; replies route back through Molecule's A2A endpoints.
> **What this is:** [`Molecule-AI/molecule-mcp-claude-channel`](https://github.com/Molecule-AI/molecule-mcp-claude-channel) — an MCP-based "channel plugin" that turns a Claude Code session into a Molecule workspace.
> **What this is NOT:** the [Python SDK / curl register flow](/docs/guides/external-agent-registration) for arbitrary HTTP-speaking agents. That flow needs a public URL the platform can POST to. This one polls — runs on any laptop behind any NAT.
---
## What you get
```
Molecule peer ──A2A──▶ [your workspace] ──poll──▶ [plugin] ──MCP notification──▶ Claude Code
▲ │
└────── POST /workspaces/:id/a2a ◄── reply_to_workspace ──┘
```
| Property | Value |
|---|---|
| **Inbound latency** | up to `MOLECULE_POLL_INTERVAL_MS` (default 5s) |
| **Outbound latency** | direct POST — sub-second |
| **Tunnel / public URL** | not required |
| **Auth model** | per-workspace bearer token (same as Python SDK) |
| **Multi-workspace** | yes, comma-separated list |
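The inbound half of that flow boils down to a poll tick. Here is an illustrative sketch (not the plugin's actual bun/TypeScript code): `fetch_activity` stands in for `GET /workspaces/:id/activity?since_secs=…` and `deliver` for the MCP notification push.

```python
def poll_once(workspaces, fetch_activity, deliver, seen, window_secs=30):
    """One poll tick: re-fetch the last `window_secs` of activity for every
    watched workspace, skip anything already delivered, hand the rest on."""
    delivered = 0
    for ws_id, token in workspaces.items():
        for event in fetch_activity(ws_id, token, since_secs=window_secs):
            if event["activity_id"] in seen:
                continue  # the overlap window re-fetches old rows; dedup here
            seen.add(event["activity_id"])
            deliver(ws_id, event)
            delivered += 1
    return delivered
```

A loop calling `poll_once` every `MOLECULE_POLL_INTERVAL_MS` is the whole inbound side; outbound replies are a direct POST and never wait for a tick.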
---
## Prerequisites
| You need | Notes |
|---|---|
| A Molecule AI tenant | Self-hosted localhost or your `*.staging.moleculesai.app` SaaS tenant |
| One or more workspace IDs | Created via canvas or `POST /workspaces` (see [External Agent Registration](/docs/guides/external-agent-registration)) |
| The workspace bearer token | Shown once when the workspace is created — save it from the canvas modal |
| Claude Code | `claude` CLI ≥ the version that supports `--channels` |
| `bun` | The plugin runs under bun for fast startup; `bun install` is invoked automatically by `start` |
> **Note:** The platform must be running molecule-core ≥ PR #2300, which shipped the `?since_secs=` query parameter on `GET /workspaces/:id/activity`. Available on all staging-onward and self-hosted main builds after 2026-04-29.
---
## Step 1 — Create the workspace
In your Molecule canvas:
1. Click **+ New workspace**
2. Choose **External** runtime
3. Set tier as needed; click **Create**
4. The "Connect your external agent" modal opens — switch to the **Claude Code** tab
5. Copy the entire snippet (everything from the `mkdir -p` line through `claude --channels ...`)
Or via API:
```bash
curl -X POST "$MOLECULE_PLATFORM_URL/workspaces" \
-H "Content-Type: application/json" \
-d '{"name": "My Claude Code", "external": true, "tier": 2}'
```
The response includes `claude_code_channel_snippet` — same content as the canvas tab, ready to paste.
## Step 2 — Set up the channel config
Run the snippet from Step 1. It does two things:
```bash
mkdir -p ~/.claude/channels/molecule
cat > ~/.claude/channels/molecule/.env <<'EOF'
MOLECULE_PLATFORM_URL=https://your-tenant.staging.moleculesai.app
MOLECULE_WORKSPACE_IDS=ws-uuid-1
MOLECULE_WORKSPACE_TOKENS=<paste auth_token from create response>
EOF
chmod 600 ~/.claude/channels/molecule/.env
```
Replace the token placeholder with the workspace bearer from Step 1.
## Step 3 — Launch Claude Code
```bash
claude --channels plugin:molecule@Molecule-AI/molecule-mcp-claude-channel
```
You should see on stderr (use `--debug` to surface):
```
molecule channel: connected — watching 1 workspace(s) at https://your-tenant.staging.moleculesai.app
workspaces: ws-uuid-1
poll: every 5000ms with 30s window
```
That's it — the workspace is live on the canvas with a purple **REMOTE** badge, and any A2A traffic the workspace receives surfaces as conversation turns in your Claude Code session.
---
## How replies work
When a peer's message lands in your session, you'll see a turn with structured metadata:
```json
{
"method": "notifications/claude/channel",
"params": {
"content": "Hey, can you take a look at this? <issue body>",
"meta": {
"source": "molecule",
"workspace_id": "ws-uuid-1",
"peer_id": "ws-uuid-pm-coordinator",
"method": "user_message",
"activity_id": "act-...",
"ts": "2026-04-29T..."
}
}
}
```
Reply normally — Claude calls the `reply_to_workspace` MCP tool with `peer_id` from the meta block, and the response flows back through `POST /workspaces/:peer_id/a2a` so peers see it just like any other A2A message.
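The routing fields a reply needs can be read straight off that shape. A hypothetical helper, mirroring the example notification above:

```python
def reply_target(notification):
    """Extract what reply_to_workspace needs from an inbound channel turn:
    peer_id (who to answer) and workspace_id (which identity to answer as)."""
    meta = notification["params"]["meta"]
    return {"peer_id": meta["peer_id"], "workspace_id": meta["workspace_id"]}
```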
---
## Multi-workspace setup
Watch multiple workspaces from a single Claude Code session by comma-separating the lists. The two lists must be the same length, with each token in the same position as its workspace ID:
```bash
MOLECULE_WORKSPACE_IDS=ws-pm,ws-researcher,ws-engineer
MOLECULE_WORKSPACE_TOKENS=tok-pm,tok-researcher,tok-engineer
```
When Claude replies in a multi-workspace session, the `reply_to_workspace` tool requires an explicit `workspace_id` (which of the watched workspaces to reply as). With a single workspace it's inferred.
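The positional pairing of the two env vars can be sketched as follows (an illustrative check, not the plugin's code — `pair_workspaces` is a hypothetical name):

```python
def pair_workspaces(ids_csv, tokens_csv):
    """Zip MOLECULE_WORKSPACE_IDS with MOLECULE_WORKSPACE_TOKENS.
    Order is significant: token N authenticates workspace N."""
    ids = [s.strip() for s in ids_csv.split(",") if s.strip()]
    tokens = [s.strip() for s in tokens_csv.split(",") if s.strip()]
    if len(ids) != len(tokens):
        raise ValueError(
            f"{len(ids)} workspace id(s) but {len(tokens)} token(s) -- "
            "the two lists must have the same length and order"
        )
    return dict(zip(ids, tokens))
```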
---
## Configuration reference
| Variable | Default | Purpose |
|---|---|---|
| `MOLECULE_PLATFORM_URL` | (required) | Tenant base URL (no trailing slash) |
| `MOLECULE_WORKSPACE_IDS` | (required) | Comma-separated workspace UUIDs to watch |
| `MOLECULE_WORKSPACE_TOKENS` | (required) | Comma-separated bearer tokens, **same order as IDs** |
| `MOLECULE_POLL_INTERVAL_MS` | `5000` | How often each workspace is polled (ms) |
| `MOLECULE_POLL_WINDOW_SECS` | `30` | `since_secs` window per poll. Wider than interval to recover from missed ticks |
| `MOLECULE_STATE_DIR` | `~/.claude/channels/molecule` | Override state directory (testing) |
---
## Architecture notes
### Why polling instead of push?
The [Python SDK external-agent flow](/docs/guides/external-agent-registration) uses **push**: register an inbound URL, platform POSTs A2A to that URL. Lower latency but requires a tunnel (ngrok / Cloudflare) or static IP — non-trivial for laptop sessions.
This plugin uses **polling** as the default because it works through every NAT/firewall with zero infra. Cost: up to `MOLECULE_POLL_INTERVAL_MS` of inbound latency. For production setups where lower latency matters, push mode is on the v0.2 roadmap.
### Why the 30s window over a 5s interval?
A single missed tick (transient network blip, GC pause, laptop sleep) shouldn't lose messages. The plugin re-fetches the last 30 seconds on every poll and dedups by `activity_id`, so 25 seconds of overlap is the recovery margin. Increase `MOLECULE_POLL_WINDOW_SECS` for noisier networks.
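The recovery property is easy to see with a toy window filter (illustrative only — timestamps simplified to seconds):

```python
def events_in_window(events, now, window_secs):
    """Which activity rows a poll at time `now` re-fetches with ?since_secs=window."""
    return [e for e in events if now - e["ts"] <= window_secs]

# An event lands at t=12s. The t=15s tick is missed (laptop sleep).
# The t=20s tick still recovers it, because 20 - 12 = 8s fits in the 30s
# window; with a window equal to the 5s interval it would have been lost.
```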
### Singleton lock
Only one channel server runs per host — multiple instances would race the dedup state and double-deliver. The plugin maintains a PID file at `~/.claude/channels/molecule/bot.pid` and on startup kills any stale predecessor. This mirrors the [`@claude-plugins-official/telegram`](https://github.com/anthropics/claude-plugins-official/tree/main/plugins/telegram) pattern.
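The kill-stale-predecessor dance looks roughly like this (a sketch under the assumption of a simple plain-text PID file; the real plugin's locking may differ in detail):

```python
import os
import pathlib
import signal

def acquire_singleton(pid_file):
    """Kill a stale predecessor (if its PID is still alive) and record our own,
    so only one channel server races the dedup state per host."""
    path = pathlib.Path(pid_file)
    if path.exists():
        old_pid = int(path.read_text().strip() or 0)
        if old_pid and old_pid != os.getpid():
            try:
                os.kill(old_pid, signal.SIGTERM)  # ask the stale instance to exit
            except ProcessLookupError:
                pass  # already gone; nothing to do
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(str(os.getpid()))
    return os.getpid()
```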
---
## Troubleshooting
### "molecule channel: required config missing"
The plugin started before you filled in `.env`. Re-run the snippet from Step 2, then re-launch Claude Code.
### "molecule channel: poll `<ws-id>` returned 401"
Bearer token mismatch. Two common causes:
- The token in `MOLECULE_WORKSPACE_TOKENS` doesn't match the workspace whose ID is in the corresponding position of `MOLECULE_WORKSPACE_IDS`. Verify same-order pairing.
- The workspace was rotated and the token was revoked. Generate a new token from the canvas Settings tab (or `POST /admin/workspaces/:id/tokens`).
### "molecule channel: poll `<ws-id>` returned 404"
Either the workspace doesn't exist or the `MOLECULE_PLATFORM_URL` is wrong. Confirm:
```bash
curl -fsS "$MOLECULE_PLATFORM_URL/workspaces/$WS_ID" \
-H "Authorization: Bearer $WS_TOKEN" | jq '.workspace.id'
```
### A2A messages aren't surfacing
Check that the watched workspace is actually receiving them — the plugin only pulls `activity_logs` rows whose `activity_type = a2a_receive`. If peers aren't sending to this workspace, there's nothing to surface. Verify with:
```bash
curl -fsS "$MOLECULE_PLATFORM_URL/workspaces/$WS_ID/activity?type=a2a_receive&limit=10" \
-H "Authorization: Bearer $WS_TOKEN" | jq
```
If that returns events but Claude doesn't see them, file an issue at [`Molecule-AI/molecule-mcp-claude-channel`](https://github.com/Molecule-AI/molecule-mcp-claude-channel/issues) with the workspace_id + sample event.
---
## Limitations (v0.1)
- **Polling-only inbound.** No push mode yet; latency floor is `MOLECULE_POLL_INTERVAL_MS`.
- **No pairing flow.** Tokens are configured manually via `.env`; no canvas-side approval handshake.
- **No file-attachment download.** URLs surface in the meta block; the host fetches on-demand.
- **No outbound channel-init.** The plugin only sends replies (in response to inbound A2A); starting a fresh A2A conversation initiated FROM the Claude Code side requires a future `start_workspace_chat` tool.
Track the v0.2 roadmap on the [plugin repo's README](https://github.com/Molecule-AI/molecule-mcp-claude-channel#limitations-v01).
---
## See also
- [External Agent Registration](/docs/guides/external-agent-registration) — full A2A wire-shape reference + Python SDK + curl flow
- [External Workspace Quickstart](/docs/guides/external-workspace-quickstart) — 5-min guide for any HTTP-speaking agent
- [Remote Workspaces FAQ](/docs/guides/remote-workspaces-faq) — production hardening notes
- [`Molecule-AI/molecule-mcp-claude-channel`](https://github.com/Molecule-AI/molecule-mcp-claude-channel) — plugin source code, issues, v0.2 roadmap

View File

@ -9,6 +9,8 @@ Run an agent on your laptop, a home server, a cloud VM, or any machine with inte
> **Looking for the operator-focused reference?** See [External Agent Registration](/docs/guides/external-agent-registration) for full capability + auth details, or [Remote Workspaces FAQ](/docs/guides/remote-workspaces-faq) for hardening + production notes. This doc is the fast path.
> **Running Claude Code on your laptop?** Skip this guide — use the [Claude Code Channel Plugin](/docs/guides/claude-code-channel-plugin) instead. It's polling-based and needs no tunnel, so your laptop session shows up on the canvas in under a minute.
---
## What is an "external workspace"?

View File

@ -0,0 +1,166 @@
---
title: Marketplace
description: A tiered library of plugins, agents, and bundles you can mount into any Molecule workspace.
---
## Overview
The Molecule **Marketplace** is the distribution surface for reusable agent
infrastructure. It surfaces three tiers of artifacts — from a single MCP
plugin to a full team topology — and the same governance, memory, and audit
substrate runs underneath each one.
You browse and install via the Marketplace UI at
[`https://moleculesai.app`](https://moleculesai.app), or pin entries from
your `workspace.yaml` for reproducible deployments.
---
## Three Tiers
| Tier | Name | Granularity | Mount as |
|------|------|-------------|----------|
| **L1** | Plugins | A single MCP server / tool pack | Tool capability on an agent or workspace |
| **L2** | Agents | A prebuilt single-agent skill (prompts + tools + policy) | Workspace member |
| **L3** | Bundles | A full team topology (root + children with their own scopes) | Workspace |
The tier model is intentionally additive — an L3 Bundle is composed of L2
Agents, which in turn use L1 Plugins. Forking a Bundle gives you the lineage
to swap any constituent piece without rewiring the operating model.
### L1 — Plugins
Plugins are MCP servers or agentskills.io packs. Examples:
- `postgres` — read/write Postgres with role-scoped credentials
- `slack` — post and search Slack with workspace-scoped tokens
- `linear` — create / triage / comment on Linear issues
- `gh-actions` — query and dispatch GitHub Actions runs
- `sentry` — read incident timeline, ack alerts
Plugins follow the [two-axis source/shape model](/docs/plugins) and install
from either a curated `local://` source or a pinned `github://owner/repo#tag`.
### L2 — Agents
Agents are single-purpose skills mounted as a workspace member. They ship with:
- A **system prompt** baked in
- A **tool manifest** specifying which L1 plugins they require
- A **policy** declaring scope reads/writes and approval requirements
Examples:
- `code-reviewer` — five-axis review, posts inline comments via `gh-actions`
- `oncall-triager` — reads Sentry, drafts a runbook step, requests approval before paging
- `churn-analyst` — periodic Postgres + Stripe rollup, posts a weekly Slack summary
Mount an agent via the workspace UI or `workspace.yaml`:
```yaml
members:
- kind: agent
source: marketplace://l2/code-reviewer
version: ^1.2.0
scopes:
- read: pull_requests
- write: pull_request_comments
```
### L3 — Bundles
Bundles are complete team topologies. A bundle ships:
- A **root agent** that coordinates the team
- One or more **child agents**, each with its own scope, memory, and tool
list
- A **policy graph** declaring which scopes the root can write through and
which approvals route to humans
Examples:
- `growth-team` — root strategist + content-writer + analytics-rollup +
experiment-designer
- `platform-ops` — root SRE + on-call triager + change-reviewer +
incident-scribe
- `revenue-pod` — root commercial lead + churn-analyst + cs-summarizer +
expansion-prospector
Mount a bundle as a workspace:
```yaml
workspace:
bundle: marketplace://l3/platform-ops
bundle_version: ^0.4.0
overrides:
members:
change-reviewer:
scopes:
- read: ["github:Molecule-AI/*", "linear:eng"]
```
Forking is encouraged — the bundle author publishes the operating model;
your team tunes it for your processes without rebuilding the substrate.
---
## Trust Tiers
Every Marketplace entry carries a **trust tier** that signals review depth
and supply-chain provenance:
| Trust | Vetting | Provenance |
|-------|---------|------------|
| **Verified** | Reviewed by Molecule for safety, prompt-injection resistance, and policy correctness | Published from a Molecule-controlled identity |
| **Partner** | Reviewed by a Marketplace partner; carries the partner's identity badge | Published from a verified partner account |
| **Community** | Self-published; static analysis + sandbox runtime; no human review | Pinned to a specific commit SHA |
The trust tier is shown on every listing card and gated by enterprise
policy: organizations on the Enterprise plan can restrict installs to
Verified-only via `policy.marketplace.min_trust = verified`.
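The tiers are ordinal, so the gate reduces to a floor comparison. An illustrative sketch (the real enforcement lives in platform policy, not in installer code):

```python
TRUST_ORDER = {"community": 0, "partner": 1, "verified": 2}

def install_allowed(entry_trust, min_trust="community"):
    """True if the entry's trust tier meets or exceeds the org's configured floor."""
    return TRUST_ORDER[entry_trust] >= TRUST_ORDER[min_trust]
```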
---
## Installing from the Marketplace
Browse listings at [`https://moleculesai.app`](https://moleculesai.app).
Each card shows tier (L1/L2/L3), trust badge, runtime compatibility, and
required scopes. The "Install" flow:
1. Picks a workspace (or creates a new one) to mount into.
2. Surfaces required scopes for review and approval.
3. Pins to a specific version (semver range, exact tag, or commit SHA).
4. Writes the entry into your `workspace.yaml` and triggers a workspace
redeploy.
You can also install non-interactively:
```bash
curl -X POST https://app.moleculesai.app/cp/orgs/$ORG/marketplace/install \
-H "Authorization: Bearer $CP_ADMIN_API_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"tier": "l2",
"slug": "code-reviewer",
"version": "^1.2.0",
"workspace_id": "ws_abc123"
}'
```
---
## Listing on the Marketplace
If you have built reusable agent infrastructure — a plugin, agent, or
bundle — you can list it on the Marketplace and reach every Molecule
organization. See [Listing on the Marketplace](/docs/marketplace/creators)
for the full builder workflow.
---
## See also
- [Plugins](/docs/plugins) — L1 source/shape model and install mechanics
- [External Agents](/docs/external-agents) — bringing a non-Molecule agent runtime
- [Workspace Configuration](/docs/workspace-config) — `workspace.yaml` reference
- [Listing on the Marketplace](/docs/marketplace/creators) — builder workflow

View File

@ -0,0 +1,164 @@
---
title: Listing on the Marketplace
description: How builders ship plugins, agents, and bundles to every Molecule organization.
---
## Overview
The Marketplace is open to external builders. If you have authored reusable
agent infrastructure — an MCP plugin, a single-agent skill, or a full team
bundle — you can list it and reach every Molecule organization. We handle
distribution, billing, and policy; you keep the IP and the upgrade cadence.
This page walks through the three-step workflow: **Build · List · Earn**.
---
## 1. Build
You author your artifact against the open Molecule SDK. The same primitives
we use internally are available to you:
- **Workspace** — the durable boundary for memory, members, and policy
- **A2A** — agent-to-agent messaging, used to talk to runtimes you don't
own (LangGraph, CrewAI, etc.)
- **Memory scopes** — hierarchical, governance-aware persistence
- **Audit** — every action is captured at the orchestration layer
Pick the tier that matches your artifact's granularity:
### L1 — Plugins
A plugin is an MCP server (or an agentskills.io pack). The
[two-axis source/shape model](/docs/plugins) describes how the workspace
runtime loads it. Authoring requirements:
- A `plugin.yaml` manifest declaring tools, required scopes, and runtime
compatibility.
- A README documenting the tool surface and side effects.
- For MCP plugins: an HTTP or stdio MCP server pinned to a tagged commit.
A reference plugin lives at
[`Molecule-AI/molecule-ai-plugin-template`](https://github.com/Molecule-AI/molecule-ai-plugin-template).
### L2 — Agents
An agent is a single workspace member with a baked-in prompt + tools +
policy. Authoring requirements:
- An `agent.yaml` manifest declaring system prompt, required L1 plugins,
scope reads/writes, and approval triggers.
- A `prompts/` directory with the system prompt and any reusable templates.
- A `tests/` directory exercising the prompt against canned scenarios.
Reference: [`Molecule-AI/molecule-ai-agent-template`](https://github.com/Molecule-AI/molecule-ai-agent-template).
### L3 — Bundles
A bundle ships a complete team topology — a root agent plus children, each
with its own scope and memory. Authoring requirements:
- A `bundle.yaml` declaring members, their scopes, and the policy graph
(which scopes the root can write through, which approvals route to
humans).
- A `members/` directory containing member-specific overrides if any
member is a fork of an L2 agent.
- A `topology.svg` diagram (auto-rendered from `bundle.yaml`, but you can
override).
Reference: [`Molecule-AI/molecule-ai-bundle-template`](https://github.com/Molecule-AI/molecule-ai-bundle-template).
---
## 2. List
Submit through the **Creator Portal** at
[`https://moleculesai.app/creators`](https://moleculesai.app/creators).
The submission flow:
1. **Connect** — link the GitHub repository hosting your artifact. We pull
from tagged releases; we never re-tag or modify your code.
2. **Manifest check** — we validate `plugin.yaml` / `agent.yaml` /
`bundle.yaml` against the schema for your tier and surface any gaps.
3. **Static analysis** — credential-shape scan, prompt-injection-pattern
scan, and dependency vulnerability check on every tagged release.
4. **Sandbox boot** — your artifact is mounted into a throwaway workspace
to verify it boots, declares its scopes correctly, and surfaces a
reasonable error path.
5. **Trust tier** — every artifact starts at **Community**. Apply for
**Partner** or **Verified** review once you have a couple of releases
under your belt.
Pricing is configured at submission:
- **Free** — no charge to install.
- **Per-seat** — a flat monthly amount per workspace member that mounts
the artifact.
- **Per-use** — metered against a unit you define (token calls, runs,
alerts handled).
- **Hybrid** — base seat fee plus metered overages.
You can change pricing on subsequent releases; existing installs are
grandfathered to the version they pinned.
---
## 3. Earn
Once your listing is live, you receive:
- **Distribution** — every Molecule organization sees your listing in the
Marketplace UI, gated only by their policy (`min_trust`, region, etc.).
- **Billing** — Molecule handles the charge to the installing
organization, deducts the platform fee (15% as of writing; check the
current rate in the Creator Portal), and pays out monthly.
- **Audit visibility** — you see install counts, version distribution,
and aggregated usage metrics in the Creator Portal. You do **not** see
per-organization data.
- **Upgrade cadence** — semver: bump tags, organizations on a `^range`
pin pull updates on their next workspace redeploy. Major bumps require
re-approval of any new scopes.
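How a `^range` pin resolves can be sketched in a few lines (a simplification for illustration — it ignores the semver `^0.x` special case and pre-release tags, which a real resolver handles):

```python
def caret_match(pin, version):
    """Does `version` satisfy a `^major.minor.patch` pin? Caret semantics here:
    same major, and the version is at or above the pinned floor."""
    floor = tuple(int(x) for x in pin.lstrip("^").split("."))
    v = tuple(int(x) for x in version.split("."))
    return v[0] == floor[0] and v >= floor
```

Under this rule an org pinned to `^1.2.0` pulls `1.3.5` on its next redeploy but never `2.0.0` — the major bump (and its new-scope re-approval) is an explicit step.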
---
## Policy & Safety
By listing, you agree to:
- **No exfiltration** — your code does not transmit organization data
outside the scopes it declares.
- **Pinned releases** — every version is pinned to an immutable commit;
retagging is not permitted.
- **Disclose model usage** — if your agent calls an LLM API, declare the
provider and model so enterprise plans can route through their own
keys.
- **Respect approval triggers** — if your `agent.yaml` declares a scope
that requires human approval (e.g. `write: pull_request_merge`), you
must call the approval API before acting.
Listings that violate these terms are de-listed; refunds for affected
installs are paid from your account.
---
## Maintenance
Once a listing is live, you can:
- Push new tagged releases — they enter the static-analysis + sandbox
flow automatically.
- Mark older versions as **deprecated** to nudge installs to upgrade.
- File **security advisories** that surface to every organization on a
vulnerable pinned version.
- Yank a release in the rare case of a critical bug; organizations
pinned to the yanked tag are notified and offered the next safe version.
---
## See also
- [Marketplace](/docs/marketplace) — tier model and installation overview
- [Plugins](/docs/plugins) — L1 plugin source/shape mechanics
- [Workspace Configuration](/docs/workspace-config) — pinning marketplace
entries in `workspace.yaml`
- [Security &raquo; OWASP Agentic Top 10](/docs/security/owasp-agentic-top-10) — supply-chain considerations relevant to bundle authors

View File

@ -12,6 +12,7 @@
"channels",
"schedules",
"runtime-mcp",
"runtime-mcp/dev-channels-flag",
"external-agents",
"tokens",
"api-reference",
@ -20,6 +21,9 @@
"self-hosting/admin-token",
"observability",
"troubleshooting",
"---Marketplace---",
"marketplace",
"marketplace/creators",
"---Security---",
"security/index",
"security/safe-mcp-advisory",
@ -28,6 +32,8 @@
"google-adk",
"hermes",
"---Integrations---",
"opencode",
"---Migration---",
"migration/a2a-sdk-v0-to-v1"
]
}

View File

@ -0,0 +1,214 @@
---
title: "a2a-sdk v0 → v1 migration"
description: "Cheat sheet for migrating workspace runtime code (and forks) from a2a-sdk 0.3.x to 1.x — renamed/removed symbols, common error shapes, before/after diffs."
---
import { Callout } from 'fumadocs-ui/components/callout';
The `a2a-sdk` Python package released v1.0 in late April 2026. The
Molecule workspace runtime migrated under tracking ID **KI-009** and
shipped in `molecule-ai-workspace-runtime` **v0.1.11** (commit
`d5cf872`, PR #39). The platform now runs exclusively on v1.
If you're consuming the platform's published wheel, bumping
`molecule-ai-workspace-runtime>=0.1.11` handles the migration for
you. If you maintain a fork of the runtime, an external agent talking
A2A directly, or your own adapter that imports from `a2a.*`, this page
is your checklist.
## Why migrate
- **Upstream**: `a2a-sdk` 1.0 reorganised the import surface, flattened
`Part`, removed deprecated capability flags, and replaced the
`A2AStarletteApplication` wrapper with explicit Starlette route
factories.
- **Platform**: as of 2026-04-24 the platform sends/receives via v1
shapes natively. The SDK ships a v0_3 compat layer (enabled in the
runtime via `enable_v0_3_compat=True` on `create_jsonrpc_routes`) so
in-flight 0.x callers don't break, but new code should target v1.
- **Forks/external runtimes**: v0 code throws on `import a2a.utils`
and `from a2a.server.apps import A2AStarletteApplication` once you
install v1, so the migration is a hard cutover at install time, not
a soft deprecation.
## Cheat sheet — renamed and removed symbols
The four breaking changes that hit the Molecule runtime during KI-009.
All four are confirmed against
the monorepo's `workspace/` source tree.
### 1. `new_agent_text_message` renamed to `new_text_message`
- **v0 location**: `a2a.utils.new_agent_text_message`
- **v1 location**: `a2a.helpers.new_text_message`
Both the module path and the symbol name changed.
### 2. `Part` API flattened — `TextPart` removed
- **v0**: `Part(root=TextPart(text="..."))` — `Part` wrapped a `root`
union of `TextPart` / `FilePart` / `DataPart`.
- **v1**: `Part(text="...")` — `Part` accepts the text payload
directly. `TextPart` no longer exists as a public symbol.
`FilePart` / `DataPart` are similarly flattened (`Part(file=...)`,
`Part(data=...)`); the Molecule runtime only emits text parts so the
file/data shapes weren't exercised in KI-009 and aren't covered by
this guide.
### 3. `A2AStarletteApplication` removed — use route factories
- **v0**: `from a2a.server.apps import A2AStarletteApplication` then
`A2AStarletteApplication(agent_card, request_handler).build()`.
- **v1**: `from a2a.server.routes import create_agent_card_routes,
create_jsonrpc_routes` then build a Starlette app from the returned
route lists.
The factories also let you mount the JSON-RPC endpoint at any path
(the runtime mounts at `/` because the platform POSTs to root, see
`workspace/main.py:279`).
### 4. `state_transition_history` capability flag removed
- **v0**: `AgentCapabilities(streaming=..., push_notifications=...,
state_transition_history=True)` was a per-agent opt-in.
- **v1**: the field is gone from `AgentCapabilities`. Per the SDK's own
`a2a/compat/v0_3/conversions.py`: *"No longer supported in v1.0"*.
The capability is now universal — `Task.history` is always available
and `tasks/get` accepts `historyLength` via `apply_history_length()`.
If you pass `state_transition_history=...` as a kwarg to
`AgentCapabilities` under v1, Pydantic will reject it. Drop the kwarg.
See [`workspace/main.py:215`](https://github.com/Molecule-AI/molecule-monorepo/blob/main/workspace/main.py#L215)
for the explanatory comment that prevents future accidental re-adds.
## Common error shapes
When v0 code runs against the v1 SDK, the failure modes look like this:
| Error | Cause |
|---|---|
| `ModuleNotFoundError: No module named 'a2a.utils'` | v0 import path; module renamed to `a2a.helpers`. |
| `ImportError: cannot import name 'A2AStarletteApplication' from 'a2a.server.apps'` | The whole `a2a.server.apps` module is gone in v1. Switch to `a2a.server.routes` factories. |
| `ImportError: cannot import name 'TextPart' from 'a2a.types'` | Flattened `Part` API; use `Part(text=...)`. |
| `ValueError: Protocol message AgentCapabilities has no "state_transition_history" field` | Removed capability flag passed as kwarg; drop it. |
| `ValueError: Protocol message Part has no "root" field` | v0 `Part(root=TextPart(...))` shape against v1 schema; flatten to `Part(text=...)`. |
The protobuf-style `ValueError` messages always follow the pattern
`Protocol message <Type> has no "<field>" field` — that's the
fingerprint of "v0 shape against v1 schema." Treat it as a v0→v1 hint
even if the field name isn't on the cheat sheet above.
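That fingerprint is regular enough to match mechanically. A hypothetical helper for triaging logs:

```python
import re

V1_HINT = re.compile(r'Protocol message (\w+) has no "(\w+)" field')

def v0_shape_hint(err_msg):
    """Return (type, field) when the message matches the v0-shape-against-
    v1-schema fingerprint, else None."""
    m = V1_HINT.search(err_msg)
    return m.groups() if m else None
```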
## Migration checklist
1. **Bump the dep** — `a2a-sdk[http-server]>=0.3.25` is the floor; remove
any `<1.0` upper bound. The Molecule wheel uses
`a2a-sdk[http-server]>=0.3.25` with no upper bound (see
[`molecule-ai-workspace-runtime/pyproject.toml`](https://github.com/Molecule-AI/molecule-ai-workspace-runtime/blob/main/pyproject.toml)).
2. **Fix imports** — sweep the four renamed/removed symbols above. A
safe grep is `grep -rn "from a2a\\|import a2a"` across your tree.
3. **Fix removed-field reads/writes** — search for
`state_transition_history` usage and delete the kwarg/field access.
4. **Flatten `Part` constructors** — search for `Part(root=` and
convert to `Part(text=...)` / `Part(file=...)` / `Part(data=...)`.
5. **Replace the app factory** — search for `A2AStarletteApplication`
and rewrite the bootstrap using `create_agent_card_routes` +
`create_jsonrpc_routes`. Pass `enable_v0_3_compat=True` to
`create_jsonrpc_routes` if your peers may still be on v0.
6. **Re-run tests** — fixture-level mocks of `a2a.helpers` /
`a2a.utils` need to mock both names so tests still pass during the
rename rollout (see
[`workspace/tests/conftest.py:105-111`](https://github.com/Molecule-AI/molecule-monorepo/blob/main/workspace/tests/conftest.py#L105-L111)
for the dual-name pattern).
## Before / after diffs
### `new_agent_text_message` → `new_text_message`
```diff
-from a2a.utils import new_agent_text_message
+from a2a.helpers import new_text_message
async def execute(self, context, event_queue):
- await event_queue.enqueue_event(new_agent_text_message("hello"))
+ await event_queue.enqueue_event(new_text_message("hello"))
```
### Flat `Part` API
```diff
-from a2a.types import Part, TextPart
+from a2a.types import Part
-msg_parts = [Part(root=TextPart(text=final_text))]
+msg_parts = [Part(text=final_text)]
```
### `AgentCapabilities` — drop `state_transition_history`
```diff
capabilities=AgentCapabilities(
streaming=config.a2a.streaming,
push_notifications=config.a2a.push_notifications,
- state_transition_history=True,
),
```
### `A2AStarletteApplication` → route factories
```diff
-from a2a.server.apps import A2AStarletteApplication
+from a2a.server.routes import create_agent_card_routes, create_jsonrpc_routes
-app = A2AStarletteApplication(
- agent_card=agent_card,
- http_handler=request_handler,
-).build()
+routes = []
+routes.extend(create_agent_card_routes(agent_card))
+routes.extend(create_jsonrpc_routes(
+ request_handler=request_handler,
+ rpc_url="/",
+ enable_v0_3_compat=True,
+))
+app = Starlette(routes=routes)
```
The `enable_v0_3_compat=True` flag on `create_jsonrpc_routes` is what
keeps in-flight v0 callers (peers that haven't migrated yet) from
breaking — it accepts the old method names and translates them. The
Molecule runtime ships with this flag on (see
[`workspace/main.py:279`](https://github.com/Molecule-AI/molecule-monorepo/blob/main/workspace/main.py#L279));
strip it once your entire fleet is on v1.
## For downstream consumers
- **Using the published wheel** (`pip install
molecule-ai-workspace-runtime>=0.1.11`): the migration is in the
wheel — no code changes needed in your adapter or workspace template
beyond bumping the pin.
- **Running a fork of the runtime**: cherry-pick or rebase against
commit `d5cf872` ("feat: migrate a2a-sdk 1.x (KI-009) (#39)") in
`molecule-ai-workspace-runtime`. The diff is the canonical reference
for what KI-009 actually changed.
- **Standalone external agent** (talking A2A without the wheel): apply
the [Migration checklist](#migration-checklist) directly to your
source. The four cheat-sheet items are the entire surface that
changed for the typical agent role; only `Part` flattening and the
`state_transition_history` removal affect on-the-wire shapes — the
other two are import-only.
<Callout type="info">
The wheel keeps `enable_v0_3_compat=True` on `create_jsonrpc_routes`,
so a v0 peer can still hit a v1 wheel and vice versa during the
migration window. You don't need to coordinate a fleet-wide cutover —
migrate at your own pace.
</Callout>
## See also
- [`molecule-ai-workspace-runtime` v0.1.11 release](https://github.com/Molecule-AI/molecule-ai-workspace-runtime/releases/tag/v0.1.11) — first wheel containing KI-009
- [PR #39 — feat: migrate a2a-sdk 1.x (KI-009)](https://github.com/Molecule-AI/molecule-ai-workspace-runtime/pull/39)
- [PR #48 — feat(a2a): dual-compat for a2a-sdk 0.3.x and 1.x](https://github.com/Molecule-AI/molecule-ai-workspace-runtime/pull/48) — runtime-side compat shim that keeps v0 peers working against the v1 wheel
- [Bring Your Own Runtime (MCP)](/docs/runtime-mcp) — universal wheel install path
- [External Agents](/docs/external-agents) — manual A2A path for non-MCP runtimes


@ -11,8 +11,8 @@ Get a Molecule AI workspace running in under five minutes.
## 1. Install Molecule AI
```bash
git clone https://github.com/Molecule-AI/molecule-monorepo.git
cd molecule-monorepo
docker compose up -d
```
@ -78,4 +78,4 @@ Or type `/ask what's our deployment status?` in your connected Discord channel.
- [Review the REST API reference](/docs/guides/org-api-keys)
- [Browse all guides](/docs/guides)
Explore the [GitHub repo](https://github.com/Molecule-AI/molecule-monorepo) for self-hosting options, or visit [moleculesai.app](https://moleculesai.app) for the hosted platform.


@ -52,14 +52,54 @@ set.
### Claude Code
Two equivalent paths — pick whichever your version supports.
**CLI (Claude Code 2.1+):** pass each env var with `-e`, scope with
`-s user` so the server is available in every project, and put the
command after `--`:
```bash
claude mcp add molecule -s user \
  -e WORKSPACE_ID=<your-workspace-uuid> \
  -e PLATFORM_URL=https://<your-tenant>.moleculesai.app \
  -e MOLECULE_WORKSPACE_TOKEN=<your-token> \
  -- molecule-mcp
```
<Callout type="info">
Older docs used a `-- env VAR=val ... molecule-mcp` shell trick (with
`env` as the command). It still works but produces a less idiomatic
`~/.claude.json` entry and trips up the post-2.1 flag parser if you
forget the `--`. Prefer the `-e` form above.
</Callout>
**Direct edit of `~/.claude.json`:** add the entry under the **top-level
`mcpServers` key** (this is the user-scope location — available in
every project). If you'd rather scope it to a single project, use a
`.mcp.json` file in that project's root with the same `mcpServers`
shape.
```json
{
"mcpServers": {
"molecule": {
"type": "stdio",
"command": "molecule-mcp",
"args": [],
"env": {
"WORKSPACE_ID": "<your-workspace-uuid>",
"PLATFORM_URL": "https://<your-tenant>.moleculesai.app",
"MOLECULE_WORKSPACE_TOKEN": "<your-token>"
}
}
}
}
```
If `molecule-mcp` isn't on the PATH that Claude Code sees (common on
macOS — see [Troubleshooting](#command-not-found-molecule-mcp-from-inside-the-runtime)),
replace `"command": "molecule-mcp"` with the absolute path from `which molecule-mcp`.
Reconnect with `/mcp` (or restart the Claude Code session) and the tools
appear in the next turn.
@ -104,31 +144,53 @@ Cline) and restart the client.
## Optional — declare your identity & capabilities
Four additional env vars control how your workspace appears on the
canvas and how the wheel's inbound-delivery contract behaves:
| Env var | What it sets | Default |
|---|---|---|
| `MOLECULE_AGENT_NAME` | Display name on the canvas card | `molecule-mcp-{id[:8]}` |
| `MOLECULE_AGENT_DESCRIPTION` | One-line description in Details/Skills tabs | empty |
| `MOLECULE_AGENT_SKILLS` | Comma-separated skill names — e.g. `research,code-review,memory-curation` | `[]` |
| `MOLECULE_MCP_POLL_TIMEOUT_SECS` | How long the agent blocks on `wait_for_message` per turn (the universal poll path). `0` disables polling for push-only mode (Claude Code launched with `--dangerously-load-development-channels server:molecule`). Above 60 clamps to 60. | `2` |
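A minimal sketch of the clamp rule in that last row: `0` disables polling for push-only mode, values above 60 clamp to 60, and unset falls back to the 2-second default. The function name and shape are illustrative, not the wheel's actual code:

```python
def effective_poll_timeout(raw=None):
    """Resolve MOLECULE_MCP_POLL_TIMEOUT_SECS per the documented rules."""
    secs = int(raw) if raw not in (None, "") else 2  # unset: 2s default
    if secs == 0:
        return 0          # push-only mode: polling disabled
    return min(secs, 60)  # values above 60 clamp to 60
```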
Skills are surfaced two places:
1. **Canvas Skills tab** — each skill renders as a chip with the name
2. **Peer agents calling `list_peers`** — they see `{name, skills: [...]}` for each peer, so other agents can route delegations to the right specialist instead of guessing from name alone
Example with all three set (Claude Code 2.1+ CLI form):
```bash
claude mcp add molecule -s user \
  -e WORKSPACE_ID=<uuid> \
  -e PLATFORM_URL=https://<tenant>.moleculesai.app \
  -e MOLECULE_WORKSPACE_TOKEN=<token> \
  -e MOLECULE_AGENT_NAME='Research Assistant' \
  -e MOLECULE_AGENT_DESCRIPTION='Reads, summarises, cites.' \
  -e MOLECULE_AGENT_SKILLS=research,summarisation,citations \
  -- molecule-mcp
```
Or as the equivalent `~/.claude.json` entry:
```json
{
"mcpServers": {
"molecule": {
"type": "stdio",
"command": "molecule-mcp",
"env": {
"WORKSPACE_ID": "<uuid>",
"PLATFORM_URL": "https://<tenant>.moleculesai.app",
"MOLECULE_WORKSPACE_TOKEN": "<token>",
"MOLECULE_AGENT_NAME": "Research Assistant",
"MOLECULE_AGENT_DESCRIPTION": "Reads, summarises, cites.",
"MOLECULE_AGENT_SKILLS": "research,summarisation,citations"
}
}
}
}
```
A peer agent's `list_peers()` call would then surface this workspace
@ -158,7 +220,7 @@ status. If the workspace is still offline after ~30s, check
| `delegate_task` | Send a task to a peer and wait for the reply |
| `delegate_task_async` | Fire-and-forget delegation; result lands in inbox |
| `check_task_status` | Poll an async delegation |
| `wait_for_message` | Block until the next inbound A2A message arrives — the universal inbound-delivery primitive (see [Inbound delivery](#inbound-delivery-universal-poll-optional-push)) |
| `inbox_peek` / `inbox_pop` | Inspect / acknowledge queued inbound messages |
| `send_message_to_user` | Push a chat bubble to the user's canvas |
| `commit_memory` / `recall_memory` | Persistent KV (local / team / global scope) |
@ -168,29 +230,130 @@ External runtimes can't accept inbound HTTP, so the wheel polls
through `wait_for_message` + `inbox_peek` / `inbox_pop`. Use those
instead of waiting for an HTTP webhook — there isn't one.
### Inbound delivery: universal poll, optional push
Inbound messages reach the agent via one of two paths. The wheel
exposes both; which one fires depends on the host's capabilities.
Both paths converge on the same `inbox_pop` ack so dedup is automatic.
**Poll path (universal default — works on every spec-compliant MCP
client).** The wheel's `initialize` handshake includes an `instructions`
field telling the agent: *"At the start of every turn, before producing
your final response, call `wait_for_message(timeout_secs=N)` to check
for inbound messages."* Every MCP client surfaces `instructions` to
the agent's system prompt automatically, so Claude Code, Cursor, Cline,
OpenCode, hermes-agent, and codex all receive the polling contract
without any per-client wiring. The 2-second default is tuned for the
"peer A2A landed seconds before my turn started" common case; tune
via the `MOLECULE_MCP_POLL_TIMEOUT_SECS` env var
(see "Optional — declare your identity & capabilities" above).
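As a rough sketch (not the wheel's code; `call_tool` stands in for your MCP client's tool-invocation API, and the message shape is an assumption), the per-turn contract amounts to:

```python
def start_of_turn_check(call_tool, timeout_secs=2):
    """Poll for inbound messages once per turn, acking what we handle."""
    msg = call_tool("wait_for_message", {"timeout_secs": timeout_secs})
    if not msg:
        return None  # nothing landed; proceed with the turn as normal
    # Handle the message, then ack by inbox row ID so the next
    # push or poll cycle doesn't re-deliver it.
    call_tool("inbox_pop", {"activity_id": msg["activity_id"]})
    return msg
```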
**Push path (Claude Code with channel push enabled — strictly
better when available).** On top of the poll path, the wheel emits a
JSON-RPC notification (`notifications/claude/channel`) on every new
inbound message and declares the matching `experimental.claude/channel`
capability in `initialize`. Claude Code with channel push enabled
turns the notification into an inline `<channel source="molecule"
...>` synthetic user turn — zero agent-side polling cost, zero
per-turn stall.
**Today (research preview), Claude Code's channel push requires
either the `--dangerously-load-development-channels` launch flag OR
an entry on Claude Code's approved channel-server allowlist.** The
wheel ships the wire shape correctly, but a standard `claude` launch
without the flag silently drops the notification — which is why the
poll path has to be the floor.
See [Dev-channels flag — tagged-form requirement](/docs/runtime-mcp/dev-channels-flag)
for the exact form the flag must take, the failure mode when it's
wrong, and when operators need to set it manually vs. when the
hosted SaaS / workspace template handles it for them.
Since Claude Code 2.1.x the flag takes a tagged allowlist, not a bare
switch. Pass each MCP server you want to push from as `server:<name>`
(matching the name you registered the server under in Claude Code's
config — `molecule` if you followed [Step 2](#claude-code) above):
```bash
claude --dangerously-load-development-channels server:molecule
```
Multiple entries are space-separated:
`server:molecule server:telegram`. A bare
`--dangerously-load-development-channels` (no value) is rejected with
`argument missing`; an untagged value (`molecule`) is rejected with
`entries must be tagged`. Easy way to confirm push is live: the
session header prints `Listening for channel messages from:
server:molecule`, and inbound canvas messages render inline as
`← molecule: <text>` instead of arriving via `inbox_peek`.
Set `MOLECULE_MCP_POLL_TIMEOUT_SECS=0` to disable polling entirely
when you're running Claude Code with the dev-channels flag and don't
want the per-turn stall. The instructions adapt automatically: with
polling disabled, the agent is told push is the only delivery path.
#### `<channel>` envelope attributes
Every inbound message — push or poll — carries the same metadata
shape. On the push path, attributes render inline as XML-style attrs
on the `<channel>` tag; on the poll path, the same fields appear in
the JSON returned by `inbox_peek` / `wait_for_message`. Either way,
the agent sees a consistent view.
| Attribute | When present | Description |
|---|---|---|
| `source` | always | Always `molecule` — distinguishes our channel from other registered servers (`telegram`, etc.). |
| `kind` | always | `canvas_user` (a human in the canvas chat) or `peer_agent` (another workspace's agent). Drives reply routing. |
| `peer_id` | always | Empty for `canvas_user`; the sender's workspace UUID for `peer_agent`. Use as `workspace_id` when calling `delegate_task` to reply. |
| `peer_name` | `peer_agent` only | The peer's display name (e.g. `ops-agent`) resolved from the platform registry. Absent on registry-lookup failure — the push still delivers. |
| `peer_role` | `peer_agent` only | The peer's declared role (e.g. `sre`, `coordinator`). Same registry source as `peer_name`; same graceful-degrade rule. |
| `agent_card_url` | `peer_agent` only | URL of the platform's discover endpoint for this peer. Fetch it if you need the peer's full capability list (skills, runtime, etc.). |
| `activity_id` | always | The inbox row ID. **Pass it to `inbox_pop` after handling** so the message isn't re-delivered on the next push or poll cycle. |
| `ts` | always | ISO-8601 timestamp of when the message landed in the platform's activity log. |
`peer_name` and `peer_role` are added by the wheel via a TTL'd
registry lookup keyed on `peer_id`. Cache TTL is 5 minutes — long
enough that a busy multi-peer chat doesn't hit the registry on every
push, short enough that role/name renames propagate within a single
agent session. Lookup failure is silent: the attributes are simply
absent and the push delivers anyway, so a registry stall can never
block inbound messages.
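That lookup behaviour can be sketched as a small TTL cache (names and structure here are illustrative, not the wheel's implementation; note that a failed lookup degrades to empty attributes instead of raising):

```python
import time

class PeerRegistryCache:
    """TTL'd peer_id -> {name, role} lookups with silent failure."""
    def __init__(self, lookup_fn, ttl_secs=300):  # 5-minute TTL
        self._lookup = lookup_fn
        self._ttl = ttl_secs
        self._entries = {}  # peer_id -> (expires_at, data)

    def get(self, peer_id, now=None):
        now = time.monotonic() if now is None else now
        hit = self._entries.get(peer_id)
        if hit and hit[0] > now:
            return hit[1]  # fresh cache entry: no registry round-trip
        try:
            data = self._lookup(peer_id)
        except Exception:
            # Registry stall: attrs simply absent, delivery unblocked.
            return {}
        self._entries[peer_id] = (now + self._ttl, data)
        return data
```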
`agent_card_url` is constructed deterministically from `peer_id`, so
it's present even if the registry is down. The agent can hit it
later to enumerate the sender's capabilities once the registry is
back up.
Worked push example for a `peer_agent` arrival:
```
<channel source="molecule" kind="peer_agent"
peer_id="11111111-2222-3333-4444-555555555555"
peer_name="ops-agent" peer_role="sre"
agent_card_url="https://platform.example.com/registry/discover/11111111-2222-3333-4444-555555555555"
activity_id="act-742" ts="2026-05-01T12:34:56Z">
Can you check the deploy status for the canary?
</channel>
```
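For contrast, the same arrival on the poll path comes back as JSON from `inbox_peek` / `wait_for_message`. The key layout below is illustrative (mapped from the attribute table above), not a wire-format guarantee:

```json
{
  "kind": "peer_agent",
  "peer_id": "11111111-2222-3333-4444-555555555555",
  "peer_name": "ops-agent",
  "peer_role": "sre",
  "agent_card_url": "https://platform.example.com/registry/discover/11111111-2222-3333-4444-555555555555",
  "activity_id": "act-742",
  "ts": "2026-05-01T12:34:56Z",
  "text": "Can you check the deploy status for the canary?"
}
```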
| Client | Push path | Poll path |
|---|---|---|
| Claude Code with `--dangerously-load-development-channels server:molecule` | ✅ inline `← molecule:` tag | ✅ also works |
| Claude Code (standard launch) | ❌ silently dropped | ✅ via instructions |
| Cursor / Cline / OpenCode / codex | ❌ method ignored | ✅ via instructions |
| hermes-agent | ❌ method ignored | ✅ naturally polls every cycle |
### MCP spec compliance
The wheel speaks MCP protocol version **2024-11-05** over stdio
JSON-RPC. It declares the standard `tools` capability plus the
`experimental.claude/channel` capability for the optional push path
(see [Inbound delivery](#inbound-delivery-universal-poll-optional-push)).
It implements the standard request methods and nothing client-specific:
| MCP method | Behavior |
|---|---|
| `initialize` | Echoes `protocolVersion: "2024-11-05"`, `serverInfo`, declares `tools` + `experimental.claude/channel` capabilities, returns the dual-path delivery `instructions` |
| `notifications/initialized` | No-op (no response — per spec) |
| `tools/list` | Returns all exposed tools in one response (no pagination cursor — surface is small) |
| `tools/call` | Dispatches by name, returns `content: [{ type: "text", text: ... }]` |
@ -198,8 +361,10 @@ standard request methods and nothing client-specific:
The push-UX notification (`notifications/claude/channel`) is the only
non-standard method emitted, and it's a one-way notification — clients
that don't handle it discard it per JSON-RPC semantics. The poll path
(via the standard `instructions` field) carries delivery for those
clients, so no part of the wheel's tool surface depends on a client
recognizing the notification.
This means **any spec-compliant MCP client** can drive the wheel:
Claude Code, Cursor, Cline, OpenCode, hermes-agent, or anything else
@ -305,6 +470,21 @@ A quick way to confirm: `ps aux | grep molecule-mcp` and check the
PID hasn't changed across `/mcp` reconnects. If the same PID stays
alive, the runtime is still using the old config.
### `claude mcp add` rejects the install command on Claude Code 2.1+
Two common shapes from older docs trip the 2.1+ parser:
- `claude mcp add molecule -s user -- env VAR=val molecule-mcp` — works
but lands as `command: "env"` with positional args, which surprises
some MCP clients on older 2.1.x patch builds.
- `claude mcp add molecule -e VAR=val molecule-mcp` (missing `--`) — the
CLI parses `molecule-mcp` as a flag value, not a command, and either
errors or silently registers nothing.
Use the `-e` form **with** `--` (see [Step 2](#claude-code)), or skip the
CLI entirely and write the JSON shape into `~/.claude.json` directly.
The on-disk shape is the source of truth and not version-sensitive.
### `command not found: molecule-mcp` from inside the runtime
The runtime's `PATH` may differ from your interactive shell — common
@ -319,6 +499,44 @@ which molecule-mcp
Then point `command` at that absolute path in `claude mcp add` /
`.cursor/mcp.json` / `mcp_servers.yaml`.
### `error: option '--dangerously-load-development-channels <servers...>' argument missing`
You're on Claude Code 2.1.x or later. The flag changed from a bare
switch to an allowlist that takes tagged entries. See
[Inbound delivery](#inbound-delivery-universal-poll-optional-push) for
the right form — short answer:
```bash
claude --dangerously-load-development-channels server:molecule
```
### `--dangerously-load-development-channels entries must be tagged: molecule`
The flag value needs the `server:` (or `plugin:`) prefix. Pass
`server:molecule` (the registered MCP server name), not bare
`molecule`.
### `Control request timeout: initialize` from the workspace agent
This is the symptom of forwarding the dev-channels flag to a nested
`claude` CLI through the `claude-agent-sdk` with the wrong shape. If
you embed the wheel inside an SDK-driven agent (e.g. the claude-code
workspace template's `claude_sdk_executor.py`), pass the tagged value
through `extra_args`:
```python
ClaudeAgentOptions(
    ...,
    extra_args={"dangerously-load-development-channels": "server:molecule"},
)
```
The SDK forwards `extra_args` keys as `--<key> <value>` to the spawned
CLI. Passing `None` renders as a bare switch and the post-2.1.x CLI
rejects it with `argument missing`, which surfaces upstream as
`Control request timeout: initialize` (the SDK never gets a response
to its initialize control message).
## When to use this vs. the manual A2A path
| Scenario | Use |


@ -0,0 +1,176 @@
---
title: "Dev-channels flag — tagged-form requirement"
description: "Why Claude Code 2.1.x+ requires `--dangerously-load-development-channels server:molecule` (not the bare flag) to enable inline channel push from the molecule-mcp wheel."
---
import { Callout } from 'fumadocs-ui/components/callout';
The `molecule-mcp` wheel emits a JSON-RPC `notifications/claude/channel`
notification on every inbound A2A message so Claude Code can render it
as an inline `<channel>` synthetic user turn — zero polling, zero
per-turn stall. During the channels research preview, Claude Code only
processes that notification when the host is launched with the
`--dangerously-load-development-channels` flag *and the flag carries a
matching tagged allowlist entry*.
This page covers the form that flag must take, what breaks when it's
wrong, and when an operator has to think about it.
<Callout type="warn">
The bare flag (no value) is rejected by the post-2.1 CLI parser, and
the failure mode propagates upstream as a `Control request timeout:
initialize` from any SDK that spawns the CLI — every A2A turn wedges
100% of the time. See [Failure mode](#failure-mode) below.
</Callout>
## The flag
```
--dangerously-load-development-channels <entries...>
```
Available in Claude Code **2.1.x and later**. It opts the CLI into
processing experimental `notifications/<channel>` JSON-RPC methods
emitted by registered MCP servers and plugin channels. Without it, the
CLI silently drops those notifications during the allowlist check, even
though the wheel ships the wire shape correctly.
## Required form: tagged allowlist entries
Each entry must carry one of two prefixes:
| Form | Use for |
|---|---|
| `server:<MCP-server-name>` | Manually configured MCP servers — the name matches what you registered with `claude mcp add <name> ...` or the key under `mcpServers` in `~/.claude.json`. |
| `plugin:<plugin-name>@<owner>/<repo>` | Plugin channels installed from a Claude Code plugin marketplace. |
Multiple entries are space-separated:
```bash
claude --dangerously-load-development-channels server:molecule server:telegram
```
Untagged values (`molecule` instead of `server:molecule`) are rejected
with `--dangerously-load-development-channels entries must be tagged`.
## Failure mode
A bare flag (`--dangerously-load-development-channels` with no value)
walks through three layers of damage before surfacing:
1. **CLI**: rejects the invocation with
`error: option '--dangerously-load-development-channels <servers...>' argument missing`.
2. **SDK**: `claude-agent-sdk` (used by `claude_sdk_executor.py` in the
Claude Code workspace template) renders the kwarg as a bare switch when
the value is `None`. The CLI then never responds to the SDK's first
`initialize` control message.
3. **Workspace agent**: the SDK times out with
`Control request timeout: initialize`. Every A2A turn wedges — 100%
reproducible. Caught live on workspace `dd40faf8` on 2026-05-01.
Two small fixes prevent this: pass a tagged value (don't let `None`
render as a bare switch), and verify the CLI accepts your specific
entries before going broad.
## For Molecule operators
Pass `server:molecule` to enable the inbox bridge → MCP
`notifications/claude/channel` push for the `molecule-mcp` wheel.
```bash
claude --dangerously-load-development-channels server:molecule
```
The `molecule` here matches the name you registered the wheel under in
[Step 2 of the runtime-mcp guide](/docs/runtime-mcp#claude-code) (the
key under `mcpServers`, or the first positional arg to `claude mcp add`).
If you registered the wheel as `mol` or `molecule-prod`, use that name
in the tag.
When push is live, the session header prints:
```
Listening for channel messages from: server:molecule
```
…and inbound canvas/peer-agent messages render inline as
`<channel source="molecule" ...>` synthetic user turns instead of
arriving via `inbox_peek`.
### Embedding in an SDK-driven agent
If you spawn `claude` through `claude-agent-sdk` (e.g. the Claude Code
workspace template's `claude_sdk_executor.py`), forward the tagged value
through `extra_args`:
```python
from claude_agent_sdk import ClaudeAgentOptions
ClaudeAgentOptions(
    model=self.model,
    permission_mode="bypassPermissions",
    cwd=self._resolve_cwd(),
    mcp_servers=mcp_servers,
    system_prompt=self._build_system_prompt(),
    resume=self._session_id,
    extra_args={"dangerously-load-development-channels": "server:molecule"},
)
```
The SDK forwards `extra_args` keys as `--<key> <value>` to the spawned
CLI. Passing `None` as the value renders as a bare switch and trips the
[Failure mode](#failure-mode) chain above.
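That forwarding rule (each key becomes `--<key> <value>`, and a `None` value collapses to a bare switch) can be sketched as follows; this is an illustration of the documented behaviour, not the SDK's actual code:

```python
def render_extra_args(extra_args):
    """Mimic how extra_args keys become CLI flags on the spawned claude."""
    argv = []
    for key, value in extra_args.items():
        argv.append(f"--{key}")
        if value is not None:
            argv.append(str(value))  # tagged value: flag + argument
        # value is None: bare switch, which the post-2.1.x CLI rejects
        # for --dangerously-load-development-channels ("argument missing")
    return argv
```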
## Verification
Verified live on 2026-05-02: with the tagged value in `extra_args`,
the in-workspace agent received `<channel source="molecule" kind="..."
peer_id="..." activity_id="..." ts="...">` tags inline as synthetic
user turns. No `wait_for_message` poll was needed for delivery. A2A
returned coherent replies on every turn.
## When this matters
Only when both of the following apply:
- You're running Claude Code (any version 2.1.x or later) as the
workspace runtime, AND
- The in-workspace `molecule-mcp` server is configured (it is, by
default, in the `claude-code` workspace template).
**Hosted Molecule SaaS handles this automatically** — the executor
passes `extra_args={"dangerously-load-development-channels": "server:molecule"}`
when spawning the CLI. Operators on hosted SaaS do not need to do
anything.
**Self-hosted operators using the Claude Code workspace template** also
get this for free since the template's executor sets `extra_args`. The
flag only needs operator attention when:
- Forking the Claude Code workspace template and stripping `extra_args`
inadvertently.
- Running `claude` directly outside the template (e.g. interactive
sessions on a developer laptop) and wanting inline `<channel>` push.
- Adding a second tagged source (e.g. `server:telegram` alongside
`server:molecule`) — append, don't replace.
Operators on Cursor, Cline, OpenCode, codex, hermes-agent, or any
non-Claude-Code MCP host are unaffected: those clients ignore the
notification and the wheel's poll path delivers via
`wait_for_message` as the universal fallback.
## Forward note
This requirement is a **research-preview gate**. Once Claude Code
graduates `notifications/<channel>` from research preview to a default
allowlist, the `--dangerously-load-development-channels` flag will no
longer be required for the `molecule` server. Drop the `extra_args`
entry in `claude_sdk_executor.py` (and any operator launch wrappers)
when that happens — the wheel emits the wire shape correctly today
and will continue to do so post-graduation.
## See also
- [Bring Your Own Runtime (MCP) — Inbound delivery](/docs/runtime-mcp#inbound-delivery-universal-poll-optional-push)
- [Bring Your Own Runtime (MCP) — Step 2: Claude Code](/docs/runtime-mcp#claude-code)
- [Troubleshooting — Control request timeout: initialize](/docs/runtime-mcp#control-request-timeout-initialize-from-the-workspace-agent)


@ -59,9 +59,11 @@ documents — through tool calls, logs, or responses.
**Molecule AI controls:**
- **Encrypted secrets at rest:** Workspace secrets are sealed with envelope
encryption before storage — AWS KMS (per-secret data keys via `GenerateDataKey`,
AES-256-GCM payload) when `KMS_KEY_ARN` is set, or static-key AES-256-GCM
under `SECRETS_ENCRYPTION_KEY` for dev / self-host. Plaintext never hits the
database, and the platform refuses to start with neither configured.
- **Secrets scoped per-workspace:** A token scoped to workspace A cannot access
workspace B's secrets.
- **Memory access controls:** The MCP server's memory tools respect workspace


@ -17,8 +17,8 @@ description: Run the full Molecule AI stack on your own infrastructure.
The fastest way to get Molecule AI running locally:
```bash
git clone https://github.com/Molecule-AI/molecule-monorepo.git
cd molecule-monorepo
./scripts/dev-start.sh
# Canvas: http://localhost:3000
# Platform: http://localhost:8080
@ -98,7 +98,8 @@ docker compose up
| `PORT` | `8080` | Platform HTTP port |
| `PLATFORM_URL` | `http://host.docker.internal:PORT` | URL passed to agent containers to reach the platform |
| `CORS_ORIGINS` | `http://localhost:3000,http://localhost:3001` | Comma-separated allowed origins |
| `SECRETS_ENCRYPTION_KEY` | -- | AES-256 key (32 bytes) for static-mode envelope encryption of workspace secrets (dev/self-host path) |
| `KMS_KEY_ARN` | -- | AWS KMS CMK ARN — when set, secrets use KMS envelope encryption (per-secret data keys); production SaaS deployments use this path |
| `WORKSPACE_DIR` | -- | Global fallback host path for `/workspace` bind-mount |
| `MOLECULE_ENV` | -- | Set to `production` to hide E2E helper endpoints |
| `ACTIVITY_RETENTION_DAYS` | `7` | How long activity logs are retained |
@ -154,7 +155,9 @@ This image serves both the API and the canvas frontend from a single container.
### Secrets Encryption
The platform supports two envelope-encryption modes for workspace secrets, picked at boot:
**Static mode (self-host / dev).** Set `SECRETS_ENCRYPTION_KEY` to a 32-byte AES-256 key. Each secret is sealed with AES-256-GCM under that single long-lived key. Without this variable (and without `KMS_KEY_ARN`), the platform refuses to start rather than silently storing plaintext.
```bash
# Generate a key
@ -163,6 +166,12 @@ openssl rand -hex 32
**Warning:** `SECRETS_ENCRYPTION_KEY` cannot be rotated without a data migration. Choose carefully before deploying to production.
**KMS mode (production SaaS).** Set `KMS_KEY_ARN` to an AWS KMS Customer Master Key ARN. Each `Encrypt()` call asks KMS for a fresh per-secret data encryption key (`GenerateDataKey`), seals the secret payload with AES-256-GCM under that DEK, and stores the KMS-encrypted DEK alongside the ciphertext (envelope encryption). Rotating the CMK is a no-op for existing blobs — KMS tracks key versions internally.
The two modes coexist during cutover: a v2 prefix byte tags KMS blobs; older static-mode blobs decrypt with `SECRETS_ENCRYPTION_KEY` until they're next written. Operators migrating to KMS can leave both env vars set during the transition.
Implementation: `workspace-server/internal/crypto/envelope.go`.
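For intuition, here is a minimal Go sketch of the static mode's seal/open cycle, including the version-prefix idea used to tell blob formats apart. This is illustrative only — the function names and the `0x01` tag byte are invented here; the real code lives in `workspace-server/internal/crypto/envelope.go`.

```go
package main

import (
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"
)

// sealStatic encrypts plaintext with AES-256-GCM under a single long-lived
// 32-byte key (static mode). A leading version byte (0x01 here, hypothetical)
// lets the decrypt path tell static blobs apart from KMS-envelope blobs.
func sealStatic(key, plaintext []byte) ([]byte, error) {
	block, err := aes.NewCipher(key) // len(key) must be 32 for AES-256
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, err
	}
	out := append([]byte{0x01}, nonce...)
	return gcm.Seal(out, nonce, plaintext, nil), nil
}

// openStatic rejects blobs that don't carry the static-mode version byte,
// then decrypts nonce-prefixed AES-256-GCM ciphertext.
func openStatic(key, blob []byte) ([]byte, error) {
	if len(blob) < 1 || blob[0] != 0x01 {
		return nil, fmt.Errorf("not a static-mode (v1) blob")
	}
	block, err := aes.NewCipher(key)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	ns := gcm.NonceSize()
	if len(blob) < 1+ns {
		return nil, fmt.Errorf("blob too short")
	}
	return gcm.Open(nil, blob[1:1+ns], blob[1+ns:], nil)
}

func main() {
	key := make([]byte, 32)
	blob, _ := sealStatic(key, []byte("secret"))
	pt, _ := openStatic(key, blob)
	fmt.Println(string(pt)) // prints "secret"
}
```

KMS mode wraps the same GCM seal, except the key is a fresh per-secret DEK from `GenerateDataKey` and the KMS-encrypted DEK is stored alongside the ciphertext.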
### Rate Limiting
The `RATE_LIMIT` variable (default 600 requests/min) applies per client. Adjust based on your expected traffic.

## Setting ADMIN_TOKEN in production
The platform reads `ADMIN_TOKEN` from the process environment, so any
production-grade host with a secrets store works. Pick the path that
matches your deployment target.
> **Note (Apr 2026):** The Molecule AI SaaS itself runs on AWS EC2
> (workspaces) + Railway (control plane). Self-hosters can use any host
> with secret-injection (Railway, Fly.io, AWS, GCP, bare-metal) — the
> examples below are illustrative, not prescriptive. See the
> [`molecule-controlplane` README "Migration history"](https://github.com/Molecule-AI/molecule-controlplane#migration-history)
> for the canonical SaaS infrastructure record.
### Railway
In the Railway dashboard, go to your service → Variables and add
`ADMIN_TOKEN`, then redeploy. Or via CLI:
```bash
railway variables --set ADMIN_TOKEN="your-generated-token"
railway up
```
### Fly.io
```bash
fly secrets set ADMIN_TOKEN="your-generated-token"
```
To rotate without downtime:
1. **Deploy** the new token via your host's secrets store, e.g.
`railway variables --set ADMIN_TOKEN="new-token" && railway up`
or `fly secrets set ADMIN_TOKEN="new-token" && fly deploy`.
2. **Verify** the new token works (see above).
3. **Remove** the old token from the secrets store. Most managed hosts
(Railway, Fly.io, AWS Secrets Manager) do not persist old secret
values after unset.
## Related

---
title: "Provisioning Workspaces on AWS EC2 (production SaaS provisioner)"
description: "How the molecule-controlplane EC2 provisioner turns POST /cp/orgs and POST /workspaces calls into running tenant + workspace EC2 instances — env vars, lifecycle, tier sizing, and the migration off Fly Machines."
---
# Provisioning Workspaces on AWS EC2 (production SaaS provisioner)
As of April 2026, Molecule AI's SaaS control plane provisions both **tenants**
(per-org platform VMs) and **workspaces** (per-agent inference VMs) on
AWS EC2 instances. The provisioner lives at
[`molecule-controlplane/internal/provisioner/ec2.go`](https://github.com/Molecule-AI/molecule-controlplane/blob/main/internal/provisioner/ec2.go)
and is auto-wired by [`cmd/server/main.go`](https://github.com/Molecule-AI/molecule-controlplane/blob/main/cmd/server/main.go)
whenever AWS credentials are present in the control-plane environment. The
platform manages workspace lifecycle, auth, and routing; AWS manages the
underlying EC2, security groups, and network plumbing.
This tutorial documents what env vars the provisioner reads, what AWS
actions it performs on a `POST /workspaces`, and how to operate it. It is
the replacement for the deprecated [Fly Machines provisioner](./fly-machines-provisioner.md)
tutorial.
> **Audience:** operators running a self-hosted Molecule AI control plane
> against their own AWS account, and contributors debugging the
> production CP. End-users of `*.moleculesai.app` do not need any of
> this — provisioning happens transparently when you create an org or
> workspace in the canvas.
## When EC2 is the active provisioner
`cmd/server/main.go` switches on whether `AWS_ACCESS_KEY_ID` is set in the
process environment. If yes, it constructs an `*provisioner.EC2` from the
config below and registers it as the tenant provisioner. There is **no**
`CONTAINER_BACKEND=ec2` switch — the dispatcher key is presence of AWS
credentials. (The legacy `flyio` backend still has dead code in the tree
but is no longer wired in `main.go`.)
A typical Railway-hosted control plane log line on boot:
```
provisioner: EC2 (region=us-east-2, ami=ami-0ea3c35c5c3284d82)
tenant provisioner: EC2 ✓
```
If `AWS_ACCESS_KEY_ID` is unset, you'll see `provisioner: disabled`
instead — useful for local dev where you want orgs CRUD to work without
AWS access.
## Environment variables
The full list of env vars `cmd/server/main.go` passes into
`provisioner.EC2Config`. Anything not listed here is unused by the
provisioner.
### Required for any EC2 provisioning
| Var | Default | Purpose |
|-----|---------|---------|
| `AWS_ACCESS_KEY_ID` | — | Toggle: presence enables EC2 wiring at all |
| `AWS_SECRET_ACCESS_KEY` | — | Standard AWS SDK credential pair |
| `AWS_REGION` | `us-east-1` | Region for tenant + workspace launches |
| `EC2_AMI` | `ami-0ea3c35c5c3284d82` (Ubuntu 22.04 us-east-2) | Default AMI when no `thin_ami_pins` row matches |
| `EC2_VPC_ID` | — | VPC for per-tenant SG creation; falls back to `EC2_SECURITY_GROUP` if unset |
| `EC2_SUBNET_ID` | — | Subnet for `RunInstances` |
| `SECRETS_ENCRYPTION_KEY` | — | KMS-envelope DEK for tenant secret-at-rest; provisioner stays disabled until set |
### Required for production (#44 secure bootstrap)
| Var | Purpose |
|-----|---------|
| `EC2_TENANT_IAM_PROFILE` | Instance profile attached to every tenant EC2 so it can fetch its bootstrap bundle from Secrets Manager at boot. Without this set, `Provision` returns the error `"Secrets Manager + IAM instance profile are required (#113 — plaintext user-data path removed)"`. |
| `PROVISION_SHARED_SECRET` | Shared HMAC-secret stored alongside the tenant bootstrap bundle so workspace-server can authenticate inbound `/cp/...` callbacks |
| `CP_ADMIN_API_TOKEN` | Token the tenant uses to call admin endpoints back on the control plane |
| `CP_BASE_URL` | URL the tenant boot script uses to reach the control plane (typically `https://api.moleculesai.app`) |
### Required for the canvas Terminal tab
| Var | Purpose |
|-----|---------|
| `EIC_ENDPOINT_SG_ID` | Security-group ID of the region's [EC2 Instance Connect endpoint](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-connect-endpoint.html). The provisioner adds a `tcp/22` ingress rule to every per-tenant + per-workspace SG sourced from this SG, so the canvas Terminal can EIC-tunnel into the box for diagnostic ssh. Empty leaves the canvas Terminal broken with `failed to open EIC tunnel`. Discover with `aws ec2 describe-instance-connect-endpoints --region <region>`. |
### Cloudflare integration (per-tenant subdomains)
| Var | Purpose |
|-----|---------|
| `CLOUDFLARE_API_TOKEN` | Enables CF DNS client; provisioner creates the per-tenant `<slug>.<APP_DOMAIN>` CNAME |
| `CLOUDFLARE_ACCOUNT_ID` | Enables CF Tunnel client (preferred over Worker + wildcard DNS) |
| `CLOUDFLARE_ZONE_ID` | DNS zone the tenant CNAMEs are written under |
| `APP_DOMAIN` | Default `moleculesai.app`; tenant FQDN becomes `<slug>.<APP_DOMAIN>` |
### Optional — runtime images, tier image, backups, canary, multi-env
| Var | Purpose |
|-----|---------|
| `MOLECULE_ENV` | `dev` / `staging` / `prod`; stamped on every EC2 tag and scopes the orphan-report's AWS lister so envs don't false-positive each other |
| `EC2_INSTANCE_TYPE` | Default `t3.small` for tenant VMs (workspaces use the per-tier table below) |
| `EC2_SECURITY_GROUP` | Fallback shared SG when `EC2_VPC_ID` is unset; production should leave this empty |
| `EC2_KEY_NAME` | Optional EC2 KeyPair name for emergency console SSH |
| `TENANT_IMAGE` | OCI ref for the tenant platform image (e.g. `ghcr.io/molecule-ai/platform-tenant:staging-<sha>`) |
| `CANARY_TENANT_IMAGE` | Override `TENANT_IMAGE` for orgs flagged `is_canary=true` |
| `CANARY_ROLE_ARN`, `CANARY_REGION`, `CANARY_VPC_ID`, `CANARY_SUBNET_ID` | Second-AWS-account target for canary tenant launches; all four required together |
| `TENANT_BACKUP_S3_PREFIX` | Empty disables nightly `pg_dump`; set `s3://bucket/path` to enable |
| `TENANT_BACKUP_REPORT_URL` | Defaults to `${CP_BASE_URL}/cp/tenants/backup-report` |
| `GHCR_PULL_TOKEN` | GHCR pull token written into the tenant bootstrap bundle (private images only) |
For the always-current set, grep
[`cmd/server/main.go` lines 86–158](https://github.com/Molecule-AI/molecule-controlplane/blob/main/cmd/server/main.go#L86-L158)
for `os.Getenv` calls inside the `provisioner.NewEC2` block.
## What happens on `POST /cp/orgs` (tenant provision)
`OrgsHandler.Create` calls into `(*EC2).Provision(ctx, cfg)`. Roughly:
1. **Cloudflare cleanup** — `cleanupStaleSlugArtifacts` scrubs any
leftover tunnel/DNS rows from a previously-purged org with the same
slug, so the slug is reusable.
2. **Cloudflare Tunnel + DNS** — `CreateTunnel` → `CreateTunnelDNS`
(writes `<slug>.<APP_DOMAIN>` → `<tunnel-id>.cfargotunnel.com`) →
`ConfigureTunnelIngress` (registers the hostname on the tunnel's
remote config so CF's edge knows to forward). DNS or ingress
failures roll back the tunnel and abort the provision — fail-fast
behavior added 2026-04-26 after a six-hour outage in which
unreachable tenants timed out at 600–900 s instead of surfacing the
real CF API problem.
3. **Bootstrap secrets to AWS Secrets Manager** — the provisioner
generates a per-tenant DB password + admin token, packages them with
the GHCR pull token, tunnel token, encryption key, and shared
secret, and `PutSecret`s them at `awsapi.TenantSecretName(orgID)`.
The tenant fetches this bundle at boot via its instance profile —
no plaintext secrets in user-data (see #113).
4. **Per-tenant SG creation** — `createPerTenantSG` calls
`CreateSecurityGroup` with the resolved VPC, the per-org name, and
the ingress rules from `tenantIngressRules(vpcCidr, EICEndpointSGID)`.
The SG ingress always includes the canvas-terminal EIC `tcp/22`
rule sourced from the EIC endpoint's own SG (UserIdGroupPairs, not
`0.0.0.0/0` — only AWS EIC's endpoint can use it).
5. **`RunInstances`** — `awsClient.RunInstance(ctx, awsapi.LaunchConfig{...})`
launches with `InstanceType = TenantInstanceType` (default
`t3.small`), the resolved AMI, IAM instance profile, base64-encoded
user-data, and tags `OrgID` / `OrgSlug` / `Role=tenant` / `TunnelID`
/ `SGID`. Volume size is 30 GB.
6. **Audit row** — every CF, SG, Secrets Manager, and EC2 lifecycle
event is recorded in the `tenant_resources` audit table (#2343)
so the orphan reconciler can diff claims vs live state.
`Provision` returns a `*Result` whose fields (`FlyMachineID`, `FlyRegion`,
`AdminToken`) are still named after Fly. The EC2 provisioner fake-fills
them with EC2 equivalents (`InstanceID`, `AWSRegion`); a column-rename
migration is on the controlplane backlog.
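The source-SG-scoped EIC rule from step 4 can be modeled without the AWS SDK; this sketch (hypothetical type and function names) just shows the shape of the data — port 22 granted to the EIC endpoint's own security group, no CIDR:

```go
package main

import "fmt"

// ingressRule models one SG ingress rule. EIC access is granted by
// referencing the endpoint's security group (sourceSG), never 0.0.0.0/0.
type ingressRule struct {
	proto            string
	fromPort, toPort int
	sourceSG, cidr   string
}

// workspaceIngressRules mirrors the described behavior for workspace SGs:
// only the EIC tcp/22 rule; HTTP rides the Cloudflare Tunnel instead.
func workspaceIngressRules(eicEndpointSG string) []ingressRule {
	return []ingressRule{
		{proto: "tcp", fromPort: 22, toPort: 22, sourceSG: eicEndpointSG},
	}
}

func main() {
	for _, r := range workspaceIngressRules("sg-0abc") {
		fmt.Printf("%s %d-%d from %s\n", r.proto, r.fromPort, r.toPort, r.sourceSG)
	}
}
```

In real AWS terms the `sourceSG` field corresponds to a `UserIdGroupPairs` entry on the `AuthorizeSecurityGroupIngress` call, which is what restricts ingress to the EIC endpoint.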
## What happens on `POST /workspaces` (workspace provision)
`workspace-server`'s `POST /workspaces` reaches the control plane via
`/cp/workspaces/provision`, which calls
`(*EC2).ProvisionWorkspace(ctx, workspaceID, runtime, orgID, tier, platformURL, env)`:
1. **Resolve tier resources** — `workspaceTierResources(tier)` returns
`(instanceType, volumeSize)` per the table below. Hermes runtime
floors `volumeSize` to 50 GB regardless of tier (uv + Python venv +
Node.js gateway pegs disk at 18–25 GB during install).
2. **Resolve AMI** — `resolveWorkspaceAMI` looks up `thin_ami_pins`
for the runtime + region. A pin row means the AMI is pre-baked
(per `packer/scripts/install-base.sh`) and user-data can skip
apt-update + the Python/Node installs (60–140 s saved per
provision, RFC #388). Otherwise it falls back to the static
`WorkspaceAMI`.
3. **Resolve runtime image** — `resolveRuntimeImage` looks up
`runtime_image_pins` and emits the containerized user-data path
(docker pull + run) when present. Independent of the AMI gate
above; the new path also installs Docker if missing on a thin/stock
AMI.
4. **Per-workspace SG creation** — same `createPerTenantSG` call with
`namePrefix="workspace"`. Workspace SGs get
`workspaceIngressRules(EICEndpointSGID)` — currently the EIC
`tcp/22` rule and nothing else (workspaces sit behind the
Cloudflare Tunnel for HTTP).
5. **`RunInstance`** — launches with `wsShort = workspaceID[:12]`
prefixed name, the resolved instance type + volume + AMI +
user-data, and tags `WorkspaceID` / `Runtime` / `Role=workspace`
/ `SGID` / `OrgID`. The `OrgID` tag is what lets
`DeprovisionInstance` cascade-terminate workspace EC2s when their
tenant is deleted (incident 2026-04-23: ~27 orphaned workspace
EC2s pinned staging at the 64 vCPU limit before the tag was
added).
6. **Audit row** — a `tenant_resources` entry (`KindEC2Instance`,
`StateCreated`) with role / runtime / tier / workspace metadata.
The boot script registers the workspace agent with the platform via
`/workspaces/:id/register`, the platform issues an A2A auth token, and
the agent comes up ready for `message/send` calls.
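Steps 2 and 3 above share one lookup shape — pin table first, static fallback second. A minimal sketch, where the map stands in for the `thin_ami_pins` / `runtime_image_pins` tables and the key format is an assumption:

```go
package main

import "fmt"

// resolvePin returns the pinned value for a runtime+region when one exists,
// else the static fallback — the same shape serves both AMI and image pins.
func resolvePin(pins map[string]string, runtime, region, fallback string) string {
	if v, ok := pins[runtime+"/"+region]; ok {
		return v
	}
	return fallback
}

func main() {
	amiPins := map[string]string{"hermes/us-east-2": "ami-prebaked"}
	fmt.Println(resolvePin(amiPins, "hermes", "us-east-2", "ami-stock")) // ami-prebaked
	fmt.Println(resolvePin(amiPins, "crewai", "us-east-2", "ami-stock")) // ami-stock
}
```

The two lookups stay independent on purpose: a runtime can have a pinned container image on a stock AMI (the user-data then installs Docker first), or a pre-baked AMI with no image pin.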
## Tier-based resource sizing
`workspaceTierResources` is the single source of truth. As of writing,
all tiers below T4 are clamped up to T4 (the SaaS floor) and tiers
above T4 are also clamped down to T4 (today's max):
| Tier | Instance type | Volume | Effective use |
|------|---------------|--------|---------------|
| T1 / T2 | clamped to T4 | clamped to T4 | not in production |
| T3 | `t3.medium` | 40 GB | reserved (clamped today) |
| T4 | `t3.large` | 80 GB | all production workspaces |
If you set a tier outside `[3, 4]`, the clamp maps it to T4 — a cheap
mis-provision rather than a fall-through to the unset `t3.small`
default. The clamp was added in a PR #434 follow-up after `tier=5`
silently yielded `t3.small`.
Hermes overrides volume to 50 GB minimum regardless of tier.
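Pulling the sizing rules together, a hedged sketch of what `workspaceTierResources` plausibly looks like — an assumed shape; the authoritative version is in `ec2.go`:

```go
package main

import "fmt"

// tierResources returns (instanceType, volumeGB) for a workspace tier,
// clamping anything outside [3, 4] to T4 and flooring Hermes volumes
// at 50 GB regardless of tier.
func tierResources(tier int, runtime string) (string, int) {
	if tier < 3 || tier > 4 {
		tier = 4 // cheap mis-provision instead of an unset default
	}
	instanceType, volume := "t3.medium", 40 // T3 (reserved)
	if tier == 4 {
		instanceType, volume = "t3.large", 80
	}
	if runtime == "hermes" && volume < 50 {
		volume = 50 // uv + venv + Node gateway needs the headroom
	}
	return instanceType, volume
}

func main() {
	it, vol := tierResources(5, "claude-code") // tier 5 clamps to T4
	fmt.Println(it, vol)                       // t3.large 80
}
```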
## Lifecycle — stop, restart, redeploy, teardown
| Operation | Mechanism |
|-----------|-----------|
| **Stop / start a tenant** | `POST /cp/admin/tenants/:slug/{stop,start}` → `(*EC2).Stop` / `Start` via the EC2 API (no termination) |
| **Redeploy a tenant** (in-place new image) | `POST /cp/admin/tenants/:slug/redeploy` → SSM Run Command pulls the latest `TENANT_IMAGE` and recreates the platform container; never reboots EC2 |
| **Refresh workspace template images** | `POST /cp/admin/tenants/:slug/workspaces/redeploy` (single-tenant) or `POST /cp/admin/tenants/workspaces/redeploy-fleet` (canary-batched fleet); HTTP-only, no SSM |
| **Delete a workspace** | platform `DELETE /workspaces/:id` → CP `DeprovisionInstance(workspaceInstanceID, ...)` terminates the EC2 + cleans DNS + SG |
| **Delete a tenant (Art. 17 cascade)** | `DELETE /cp/orgs/:slug` → cascade-terminates all workspace EC2s tagged with this `OrgID`, then terminates the tenant EC2, then deletes the SG, Secrets Manager bundle, CF tunnel + CNAME |
| **Orphan recovery** | `tenant_resources` audit table + 30-min reconciler that diffs claims vs live AWS state and exposes orphan counts via `/cp/admin/stats` |
`DeprovisionInstance` polls termination under its own deadline so a
stuck shutdown surfaces as a deprovision failure (and the caller's
retry replays the cascade) instead of becoming a silent leak (#263).
## Why EC2 (vs Fly Machines)
The control plane has migrated infrastructure twice in April 2026 — both
documented in the
[molecule-controlplane README "Migration history"](https://github.com/Molecule-AI/molecule-controlplane#migration-history):
- **Apr 2026 — CP host:** Fly (`molecule-cp.fly.dev`) → Railway
(`api.moleculesai.app`).
- **Apr 2026 — tenant + workspace compute:** Fly Machines → AWS EC2
with SSM Run Command for redeploy.
The drivers were production needs Fly couldn't easily meet:
- **Region + data-residency control.** EU customers required
EU-resident tenant data; AWS regional pinning per tenant is
straightforward, Fly's region routing is per-app and harder to
guarantee per-tenant.
- **AWS-native auth chain for the canvas Terminal.** EC2 Instance
Connect lets the platform open SSH tunnels to a tenant box via
short-lived (60 s) IAM-signed public keys — no shared SSH keys,
no inbound `0.0.0.0/0` rules. The same path powers the Files API
EIC writes (see [SaaS file writes via EC2 Instance Connect](./saas-file-writes-eic.md)).
- **Secrets Manager + IAM instance profiles** for tenant bootstrap
secrets (#113 removed the plaintext user-data path).
- **Cloudflare Tunnels** instead of public IPs — no inbound exposure
on tenant EC2s; CF edge is the only ingress.
- **`tenant_resources` audit table + reconciler** for cascade-cleanup
guarantees that Fly's flat machine list couldn't enforce.
Old `internal/flyapi/` and `internal/provisioner/fly.go` files remain
in the controlplane tree as legacy code awaiting cleanup; they are not
wired in `cmd/server/main.go`.
## Operating notes
- **Schema names still say "fly".** The `org_instances` columns
`fly_app` / `fly_machine_id` / `fly_region` are fake-filled with EC2
equivalents; a rename migration is on the controlplane backlog
(`PLAN.md`).
- **`SECRETS_ENCRYPTION_KEY` gates the whole provisioner.** The crypto
envelope is required even when only AWS creds are present; without
it, `tenant provisioner: DISABLED` is logged and `POST /cp/orgs`
accepts the row but never spins a tenant.
- **Per-tenant SG creation needs `EC2_VPC_ID`.** If you only set
`EC2_SECURITY_GROUP` (the legacy shared-SG fallback), every tenant
shares one SG — caught the bug in PR #434 review. Production must
set `EC2_VPC_ID`.
- **`EIC_ENDPOINT_SG_ID` is silently load-bearing.** If unset, the
canvas Terminal hangs with `failed to open EIC tunnel` and the
Files API EIC write path returns 500 — the EC2 boots fine, the
symptom only shows when an operator opens the canvas Terminal tab.
## References
- [`molecule-controlplane/internal/provisioner/ec2.go`](https://github.com/Molecule-AI/molecule-controlplane/blob/main/internal/provisioner/ec2.go) — provisioner source
- [`molecule-controlplane/cmd/server/main.go`](https://github.com/Molecule-AI/molecule-controlplane/blob/main/cmd/server/main.go) — env-var wiring
- [`molecule-controlplane` README "Migration history"](https://github.com/Molecule-AI/molecule-controlplane#migration-history) — canonical record
- [AWS EC2 Instance Connect endpoints](https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-instance-connect-endpoint.html)
- [AWS Secrets Manager](https://docs.aws.amazon.com/secretsmanager/latest/userguide/intro.html)
- [SaaS file writes via EC2 Instance Connect](./saas-file-writes-eic.md) — EIC is also the Files API write channel
- [Fly Machines provisioner (DEPRECATED)](./fly-machines-provisioner.md) — previous backend, retained for migration history

---
title: "Provisioning Workspaces on Fly Machines (CONTAINER_BACKEND=flyio) — DEPRECATED"
---
# Provisioning Workspaces on Fly Machines (CONTAINER_BACKEND=flyio)
> **DEPRECATED — historical reference only.** As of April 2026, the SaaS
> control plane and tenant/workspace fleets migrated off Fly Machines to
> **AWS EC2 (workspaces) + Railway (control plane)**. For the current
> production provisioner, read
> [Provisioning Workspaces on AWS EC2](./aws-ec2-provisioner.md) — env
> vars, lifecycle, tier sizing, and the migration rationale. The Fly
> provisioner code (`fly.go`, `internal/flyapi/`) remains in the
> [`molecule-controlplane`](https://github.com/Molecule-AI/molecule-controlplane)
> tree as legacy awaiting cleanup but is no longer the production path.
> See the [`molecule-controlplane` README "Migration history"](https://github.com/Molecule-AI/molecule-controlplane#migration-history)
> for the canonical record. This page is preserved as the original PR
> #501 lineage record; do not follow it for new self-hosted deployments.
Molecule AI can provision agent workspaces on [Fly Machines](https://fly.io/docs/machines/) instead of local Docker containers. When `CONTAINER_BACKEND=flyio` is set, every `POST /workspaces` creates a Fly Machine and boots the workspace agent inside it — with tier-based resource limits, env-var injection, and A2A registration handled automatically. The platform manages the workspace (lifecycle, auth, routing); Fly manages the machine it runs on.
> **Scope note (PR #501):** Workspace images must already be published to GHCR before provisioning. The `delete` and `restart` platform endpoints are not yet fully wired to the Fly provisioner — use `flyctl machine stop/destroy` for teardown until a follow-up PR lands.

|---|---|---|
| Database | Neon branch-per-tenant | Tenant's branch, operator has no direct access |
| Compute | EC2 in tenant's VPC | Control plane provisions, operator manages SG rules |
| Credentials | No AWS/cloud API tokens on tenant | All cloud credentials held by control plane |
| API access | Org-scoped API keys | Tenant manages their own keys; operator has CP-level override |
| Network | Security group: port 443 from platform only | Control plane manages; tenant can't modify |