molecule-core/infra/scripts/setup.sh
Hongming Wang 96cc4b0c42 fix(quickstart): wire up template/plugin registry via manifest.json
The Canvas template palette was empty on a fresh clone because
`workspace-configs-templates/`, `org-templates/`, and `plugins/` are
gitignored and nothing populated them. The registry already exists —
`manifest.json` at repo root lists every curated
`workspace-template-*`, `org-template-*`, and `plugin-*` repo, and
`scripts/clone-manifest.sh` clones them — but the step was absent
from the README and setup.sh, so new users never ran it.

### What this commit does

**1. `setup.sh` runs `clone-manifest.sh` automatically** (once).
After starting the Docker network but before booting infra, iterate
`manifest.json` and clone any workspace_templates / org_templates /
plugins that aren't already populated. Idempotent — subsequent
runs skip dirs that have content. Requires `jq`; when jq is missing
the step prints a clear install hint and skips (doesn't fail).

**2. `clone-manifest.sh` is idempotent.** Before running `git clone`,
check whether the target directory already exists and is non-empty —
skip if so. Lets `setup.sh` rerun safely without forcing the operator
to delete already-cloned template repos.

**3. `ListTemplates` logs the reason it skips a template.** The
handler previously swallowed `resolveYAMLIncludes` errors with
`continue`, so a broken template showed up as an empty palette with
no log trail. Now the include-expansion and yaml.Unmarshal failure
paths both emit a descriptive `log.Printf` — the exact message that
made the stale `org-templates/molecule-dev/` snapshot debuggable:

    ListTemplates: skipping molecule-dev — !include expansion failed:
      !include "core-platform.yaml" at line 25: open .../teams/
      core-platform.yaml: no such file or directory

**4. Remove the in-tree `org-templates/molecule-dev/` snapshot** (170
files). Matches the explicit intent of prior commit
`bfec9e53` — "remove org-templates/molecule-dev/ — standalone repo
is source of truth". A later "full staging snapshot" re-added a
partial copy that had `!include` references to 7 role files that
never existed in the snapshot (`core-platform.yaml`,
`controlplane.yaml`, `app-docs.yaml`, `infra.yaml`, `sdk.yaml`,
`release-manager/workspace.yaml`, `integration-tester/workspace.yaml`).
`clone-manifest.sh` repopulates it fresh from
`Molecule-AI/molecule-ai-org-template-molecule-dev`.

.gitignore exception for `molecule-dev/` is dropped accordingly
— the whole `/org-templates/*` tree is now gitignored, symmetric
with `/plugins/` and `/workspace-configs-templates/`.

**5. Doc updates** (README, README.zh-CN, CONTRIBUTING) mention `jq`
as a prerequisite and describe what setup.sh now does.

### Verification

On a fresh-nuked DB with the updated branch:

1. `bash infra/scripts/setup.sh` — cleanly clones 33/33 manifest
   repos (20 plugins, 8 workspace_templates, 5 org_templates), then
   boots infra. Second run skips all 33 (idempotent).
2. `go run ./cmd/server` — "Applied 41 migrations", :8080 healthy.
3. `curl http://localhost:8080/org/templates` returns 4 templates
   (was `[]`):

       - Free Beats All
       - MeDo Smoke Test
       - Molecule AI Worker Team (Gemini)
       - Reno Stars Agent Team

4. `bash tests/e2e/test_api.sh` — 61/61 pass.
5. `npx vitest run` in canvas — 902/902 pass.
6. `shellcheck infra/scripts/setup.sh` — clean.

### SaaS parity

All changes are local-dev surface. `setup.sh`, `clone-manifest.sh`,
and the local `org-templates/` directory aren't part of the CP
provisioner path — SaaS tenant machines get their templates via
Dockerfile layers or CP-side provisioning, not `clone-manifest.sh`.
The `ListTemplates` log addition is harmless either way (replaces a
silent `continue` with a `log.Printf + continue`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 14:55:34 -07:00

86 lines
3.6 KiB
Bash
Executable File

#!/usr/bin/env bash
set -euo pipefail
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
ROOT_DIR="$(cd "$SCRIPT_DIR/../.." && pwd)"
echo "==> Ensuring shared docker network exists..."
docker network create molecule-monorepo-net 2>/dev/null || true
# Populate the template / plugin registry.
# workspace-configs-templates/, org-templates/, and plugins/ are intentionally
# gitignored — the curated set lives in manifest.json as external repos. Without
# them the Canvas template palette is empty and workspace provisioning falls
# through to a bare default. The script itself is idempotent (skips dirs that
# already have content), so re-running setup.sh is safe.
if [ -f "$ROOT_DIR/manifest.json" ] && [ -f "$ROOT_DIR/scripts/clone-manifest.sh" ]; then
if ! command -v jq >/dev/null 2>&1; then
echo "==> NOTE: jq not installed — skipping template registry populate."
echo " Install with: brew install jq (macOS) / apt install jq (Debian)"
echo " Then rerun: bash scripts/clone-manifest.sh manifest.json \\"
echo " workspace-configs-templates/ org-templates/ plugins/"
else
echo "==> Populating template / plugin registry from manifest.json..."
bash "$ROOT_DIR/scripts/clone-manifest.sh" \
"$ROOT_DIR/manifest.json" \
"$ROOT_DIR/workspace-configs-templates" \
"$ROOT_DIR/org-templates" \
"$ROOT_DIR/plugins"
fi
fi
echo "==> Starting infrastructure..."
docker compose -f "$ROOT_DIR/docker-compose.infra.yml" up -d
echo "==> Waiting for Postgres..."
until docker compose -f "$ROOT_DIR/docker-compose.infra.yml" exec -T postgres pg_isready -U "${POSTGRES_USER:-dev}" 2>/dev/null; do
sleep 1
done
echo " Postgres is ready."
echo "==> Waiting for Redis..."
until docker compose -f "$ROOT_DIR/docker-compose.infra.yml" exec -T redis redis-cli ping 2>/dev/null | grep -q PONG; do
sleep 1
done
echo " Redis is ready."
echo "==> Verifying Redis KEA config..."
KEA=$(docker compose -f "$ROOT_DIR/docker-compose.infra.yml" exec -T redis redis-cli config get notify-keyspace-events | tail -1)
echo " notify-keyspace-events = $KEA"
# Migrations are intentionally not applied here. The platform's own runner
# (workspace-server/internal/db/postgres.go::RunMigrations) tracks applied
# files in `schema_migrations` on every boot. Applying them out-of-band via
# psql leaves that table empty, so the platform re-applies everything and
# fails on non-idempotent ALTER TABLE statements. Let `go run ./cmd/server`
# handle it.
echo "==> Infrastructure ready!"
echo " Postgres: localhost:5432"
echo " Redis: localhost:6379"
echo " Langfuse: localhost:3001"
echo " Temporal: localhost:7233 (gRPC) / localhost:8233 (UI)"
echo ""
echo " Next: cd workspace-server && go run ./cmd/server"
echo " (the platform applies pending migrations on first boot)"
# Source .env if it exists so the ADMIN_TOKEN check below reflects what the
# platform will actually see at startup, not just the current shell env.
if [ -f "$ROOT_DIR/.env" ]; then
set -a
# shellcheck disable=SC1091
. "$ROOT_DIR/.env"
set +a
fi
# Security check — issue #684 (AdminAuth bearer bypass, PR #729).
# Without ADMIN_TOKEN, any valid workspace bearer token can call /admin/* routes.
if [ -z "${ADMIN_TOKEN:-}" ]; then
echo ""
echo " ⚠ WARNING: ADMIN_TOKEN is not set."
echo " Until it is, AdminAuth falls back to accepting any workspace bearer token"
echo " — the #684 vulnerability is NOT closed in this deployment."
echo " Generate one: openssl rand -base64 32"
echo " Then export ADMIN_TOKEN=<value> or add it to your .env before starting the platform."
fi