molecule-core/docs/development/local-development.md
claude-ceo-assistant d9e380c5bc
feat(workspace-server): local-dev provisioner builds from Gitea source when MOLECULE_IMAGE_REGISTRY is unset (#63, Task #194)
OSS contributors who clone molecule-core and `go run ./workspace-server/cmd/server`
now get a working end-to-end provision without authenticating to GHCR or AWS ECR.

Pre-fix: with MOLECULE_IMAGE_REGISTRY unset, the provisioner attempted to pull
ghcr.io/molecule-ai/workspace-template-<runtime>:latest, which has been
returning 403 since the 2026-05-06 GitHub-org suspension.

Post-fix: when MOLECULE_IMAGE_REGISTRY is unset, the provisioner switches to
local-build mode — looks up the workspace-template-<runtime> repo's HEAD sha
on Gitea via a single API call, shallow-clones into ~/.cache/molecule/, and
runs `docker build --platform=linux/amd64`. SHA-pinned cache key skips the
clone+build entirely on subsequent provisions.

Production tenants are unaffected: every prod tenant sets the var to its
private ECR mirror, so the SaaS pull path is byte-for-byte identical.

SSOT for mode detection lives in Resolve() (registry_mode.go) returning a
discriminated RegistrySource{Mode, Prefix} so call sites that branch on
mode get a compile-time push instead of a string-equality footgun.

Coverage:
* registry_mode.go            — new SSOT (Resolve, RegistryMode, IsKnownRuntime)
* registry_mode_test.go       — 8 tests pinning mode-decision contract
* localbuild.go               — clone+build pipeline (570 LOC, fully unit-tested)
* localbuild_test.go          — 22 tests covering happy/sad paths, fail-closed
* provisioner.go              — Start() inserts ensureLocalImageHook in local mode
* docs/adr/ADR-002            — design rationale + alternatives + security review
* docs/development/local-development.md — local-build flow + env overrides

Security:
* Allowlist-only runtime names (knownRuntimes) gate the clone path.
* Repo prefix hardcoded to git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-;
  forks via opt-in MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX.
* MOLECULE_GITEA_TOKEN masked in every log line via maskTokenInURL/maskTokenInString.
* Fail-closed: Gitea unreachable / runtime not mirrored → clear error, never
  silently fall back to GHCR/ECR.
* docker build invocation passes no --build-arg from external input.
* HTTP body cap 64KB on Gitea API responses (defence vs malicious upstream).

Closes #63 / Task #194.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 15:16:51 -07:00


Local Development

Workspace Template Images: Local-Build Mode (Issue #63)

OSS contributors who run molecule-core locally do not need to authenticate to GHCR or AWS ECR. When the MOLECULE_IMAGE_REGISTRY env var is unset, the platform automatically:

  1. Looks up the HEAD sha of https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<runtime> (single API call, no clone).
  2. If a local image tagged molecule-local/workspace-template-<runtime>:<sha12> already exists, reuses it (cache hit).
  3. Otherwise, shallow-clones the repo into ~/.cache/molecule/workspace-template-build/<runtime>/<sha12>/ and runs `docker build --platform=linux/amd64 -t <tag> .` (the trailing dot is the build context).
  4. Hands the SHA-pinned tag to Docker for ContainerCreate.
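
The naming scheme in the steps above can be sketched in Go. This is only an illustration under stated assumptions: the helper names (`shortSHA`, `localImageTag`, `buildDir`) are hypothetical and not taken from localbuild.go, and the sketch assumes `<sha12>` means the first 12 characters of the HEAD commit SHA.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// shortSHA returns the first 12 characters of a commit SHA (assumed to be the
// <sha12> used in the steps above). Shorter inputs are returned unchanged.
func shortSHA(sha string) string {
	if len(sha) > 12 {
		return sha[:12]
	}
	return sha
}

// localImageTag builds the SHA-pinned image tag checked in step 2.
func localImageTag(runtime, sha string) string {
	return fmt.Sprintf("molecule-local/workspace-template-%s:%s", runtime, shortSHA(sha))
}

// buildDir builds the clone directory used in step 3, rooted at cacheRoot
// (for example ~/.cache/molecule/workspace-template-build).
func buildDir(cacheRoot, runtime, sha string) string {
	return filepath.Join(cacheRoot, runtime, shortSHA(sha))
}
```

Because the SHA is part of both the tag and the directory, a new HEAD commit produces a new tag that does not exist locally yet, which is what forces a fresh clone and build.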

First-provision build time: 5-10 min on Apple Silicon (amd64 emulation). Subsequent provisions hit the cache and start in seconds. The cache is invalidated automatically when the template repo's HEAD moves.

Currently mirrored on Gitea: claude-code, hermes, langgraph, autogen. Other runtimes (crewai, deepagents, codex, gemini-cli, openclaw) fail with an actionable "not mirrored to Gitea" error pointing at the missing repo.
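
A minimal Go sketch of the allowlist behaviour described above, assuming the nine runtimes named in this paragraph are the complete set. The type, map, and function names are hypothetical, and the error strings are illustrative rather than the platform's actual messages.

```go
package main

import "fmt"

// RuntimeStatus classifies a runtime name for local-build mode.
type RuntimeStatus int

const (
	RuntimeUnknown     RuntimeStatus = iota // not on the allowlist at all
	RuntimeNotMirrored                      // allowlisted, but no template repo on Gitea
	RuntimeMirrored                         // allowlisted and buildable locally
)

// runtimes mirrors the lists in the paragraph above (assumed complete).
var runtimes = map[string]RuntimeStatus{
	"claude-code": RuntimeMirrored,
	"hermes":      RuntimeMirrored,
	"langgraph":   RuntimeMirrored,
	"autogen":     RuntimeMirrored,
	"crewai":      RuntimeNotMirrored,
	"deepagents":  RuntimeNotMirrored,
	"codex":       RuntimeNotMirrored,
	"gemini-cli":  RuntimeNotMirrored,
	"openclaw":    RuntimeNotMirrored,
}

// runtimeStatus returns RuntimeUnknown (the zero value) for any name that is
// not in the map, so arbitrary input never reaches the clone path.
func runtimeStatus(name string) RuntimeStatus {
	return runtimes[name]
}

// checkRuntime fails closed: only mirrored runtimes return nil.
func checkRuntime(name string) error {
	switch runtimeStatus(name) {
	case RuntimeMirrored:
		return nil
	case RuntimeNotMirrored:
		return fmt.Errorf("runtime %q is not mirrored to Gitea", name)
	default:
		return fmt.Errorf("unknown runtime %q", name)
	}
}
```

Looking names up in a fixed map, rather than interpolating user input into a repo URL, is one simple way to get the allowlist-only behaviour.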

Production tenants are unaffected — every prod tenant sets MOLECULE_IMAGE_REGISTRY to its private ECR mirror via Railway env / EC2 user-data, so the SaaS pull path stays identical.

Environment overrides

| Var | Default | Use case |
| --- | --- | --- |
| MOLECULE_IMAGE_REGISTRY | (unset) | Set to a real registry URL to switch from local-build to SaaS-pull mode. |
| MOLECULE_LOCAL_BUILD_CACHE | ~/.cache/molecule/workspace-template-build | Override cache directory. |
| MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX | https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template- | Point at a fork. |
| MOLECULE_GITEA_TOKEN | (unset) | Required only if your fork has private template repos. |
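
The mode switch driven by MOLECULE_IMAGE_REGISTRY could look roughly like the Go sketch below. The names `Resolve` and `RegistrySource{Mode, Prefix}` come from the commit description; everything else is an assumption made for illustration: the mode constants, the `resolveFrom` helper, treating an empty value as unset, and trimming a trailing slash. The real registry_mode.go may differ.

```go
package main

import (
	"os"
	"strings"
)

// RegistryMode says where workspace template images come from.
type RegistryMode int

const (
	ModeLocalBuild   RegistryMode = iota // build from the Gitea template repo
	ModeRegistryPull                     // pull from the configured registry
)

// RegistrySource pairs the mode with the registry prefix (empty in local-build mode).
type RegistrySource struct {
	Mode   RegistryMode
	Prefix string
}

// resolveFrom decides the mode using a lookup function, so the decision can be
// exercised without mutating the process environment.
func resolveFrom(lookup func(string) (string, bool)) RegistrySource {
	v, ok := lookup("MOLECULE_IMAGE_REGISTRY")
	v = strings.TrimSpace(v)
	if !ok || v == "" { // assumption: empty is treated the same as unset
		return RegistrySource{Mode: ModeLocalBuild}
	}
	return RegistrySource{Mode: ModeRegistryPull, Prefix: strings.TrimRight(v, "/")}
}

// Resolve reads the real process environment.
func Resolve() RegistrySource {
	return resolveFrom(os.LookupEnv)
}
```

Call sites can then switch on the typed `Mode` field instead of comparing strings.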

Verifying a switch from the GHCR-retag stopgap

Pre-fix, OSS contributors worked around the suspended GHCR org by manually retagging a :latest image. After this change, that workaround is redundant: unset MOLECULE_IMAGE_REGISTRY (or leave it unset), boot the platform, and provision a workspace. The logs will show:

Provisioner: local-build mode → using locally-built image molecule-local/workspace-template-claude-code:<sha12> for runtime claude-code
local-build: cloning https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-claude-code → ...
local-build: docker build done in <duration>

If you still see ghcr.io/molecule-ai/... in the boot log, run `env | grep MOLECULE_IMAGE_REGISTRY` to check for a stale shell export from the pre-fix workaround, which would keep SaaS-pull mode active.

Starting the Stack

docker compose up

This starts:

| Service | Port | Description |
| --- | --- | --- |
| Postgres | internal only | Primary database |
| Redis | internal only | Ephemeral state |
| Platform (Go) | :8080 | Control plane API |
| Canvas (Next.js) | :3000 | Visual frontend |
| Langfuse web | :3001 (host) / :3000 (internal) | Observability UI |
| Langfuse worker | | Background processing |
| ClickHouse | | Langfuse dependency |

Each workspace container is provisioned on demand by the platform when a user creates or imports one.

Langfuse uses a dedicated langfuse Postgres database. The compose stack creates it automatically before starting the Langfuse service, so it does not conflict with the platform's molecule schema.

Infrastructure Only

To start just Postgres, Redis, and Langfuse (no application code):

docker compose -f docker-compose.infra.yml up

Optional Profiles

docker compose --profile multi-provider up  # Add LiteLLM proxy (unified LLM API)
docker compose --profile local-models up    # Add Ollama (local LLM models)

Environment Variables

Platform (Go)

DATABASE_URL=postgres://dev:dev@postgres:5432/molecule?sslmode=prefer
REDIS_URL=redis://redis:6379
PORT=8080
SECRETS_ENCRYPTION_KEY=dev-key-change-in-production
WORKSPACE_DIR=/path/to/molecule-monorepo   # Optional global fallback; prefer per-workspace workspace_dir in org.yaml or API
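
To make the parts of DATABASE_URL explicit, here is a small Go sketch that splits it with the standard `net/url` package. The `dbTarget` type and `parseDatabaseURL` function are hypothetical helpers for illustration and not part of the platform. The password is deliberately not surfaced.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// dbTarget holds the non-secret parts of a Postgres connection URL.
type dbTarget struct {
	User, Host, Port, Database, SSLMode string
}

// parseDatabaseURL splits a postgres:// URL into its components.
func parseDatabaseURL(raw string) (dbTarget, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return dbTarget{}, fmt.Errorf("parse DATABASE_URL: %w", err)
	}
	if u.Scheme != "postgres" && u.Scheme != "postgresql" {
		return dbTarget{}, fmt.Errorf("unexpected scheme %q", u.Scheme)
	}
	return dbTarget{
		User:     u.User.Username(), // safe even when no userinfo is present
		Host:     u.Hostname(),
		Port:     u.Port(),
		Database: strings.TrimPrefix(u.Path, "/"),
		SSLMode:  u.Query().Get("sslmode"),
	}, nil
}
```

For the value shown above this yields user dev, host postgres, port 5432, database molecule, and sslmode prefer. The host postgres is presumably the compose service name, so the URL resolves only from inside the compose network.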

Canvas (Next.js)

NEXT_PUBLIC_PLATFORM_URL=http://localhost:8080
NEXT_PUBLIC_WS_URL=ws://localhost:8080/ws

Workspace Runtime

WORKSPACE_ID=           # assigned by platform on provision
WORKSPACE_CONFIG_PATH=  # path to config folder inside container
MODEL_PROVIDER=         # e.g. anthropic:claude-sonnet-4-6
TIER=                   # 1, 2, 3, or 4
PLATFORM_URL=           # http://platform:8080
PARENT_ID=              # set by platform during team expansion (empty for top-level)
ANTHROPIC_API_KEY=      # or OPENAI_API_KEY, etc.
LANGFUSE_HOST=          # http://langfuse-web:3000 (internal container port; host-mapped to :3001)
LANGFUSE_PUBLIC_KEY=
LANGFUSE_SECRET_KEY=
LANGSMITH_TRACING=true  # LangGraph reads this to enable tracing
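
As an illustration of how two of these values might be validated, the Go sketch below parses MODEL_PROVIDER and TIER. It assumes MODEL_PROVIDER has the form `<provider>:<model>` (inferred from the example anthropic:claude-sonnet-4-6) and that TIER must be an integer from 1 to 4. The function names are hypothetical and do not describe the workspace runtime's actual code.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseModelProvider splits "provider:model" at the first colon.
func parseModelProvider(v string) (string, string, error) {
	provider, model, ok := strings.Cut(v, ":")
	if !ok || provider == "" || model == "" {
		return "", "", fmt.Errorf("MODEL_PROVIDER %q: want <provider>:<model>", v)
	}
	return provider, model, nil
}

// parseTier accepts only the integers 1 through 4.
func parseTier(v string) (int, error) {
	n, err := strconv.Atoi(strings.TrimSpace(v))
	if err != nil || n < 1 || n > 4 {
		return 0, fmt.Errorf("TIER %q: want 1, 2, 3, or 4", v)
	}
	return n, nil
}
```

Splitting at the first colon only means a model name that itself contains a colon would stay intact in the second return value.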

Technology Versions

Go              1.25+ (go.mod)
Python          3.11+
Node.js         22+
Next.js         15
React Flow      12   (@xyflow/react)
a2a-sdk         0.3+ (A2A server SDK, install with a2a-sdk[http-server])
langfuse        3.x  (self-hosted Docker)
Postgres        16
Redis           7
Docker Compose  2.x

Running Tests

Unit Tests

cd workspace-server && go test -race ./...               # Go tests with race detection (695 tests)
cd canvas && npm test                            # Vitest tests (357 tests)
cd workspace && python -m pytest -v     # Workspace runtime tests (1140 tests)
cd sdk/python && python -m pytest -v             # SDK tests (121 tests)
cd mcp-server && npm test                        # MCP server tests (97 Jest tests)

Integration Tests

bash tests/e2e/test_api.sh                 # 62 API tests (quick local verify; Phase 30.1 bearer-auth aware; also runs in CI)
bash tests/e2e/test_a2a_e2e.sh             # 22 A2A e2e tests (requires platform + 2 agents)
bash tests/e2e/test_activity_e2e.sh        # 25 activity/task E2E tests (requires platform + 1 agent)
bash tests/e2e/test_comprehensive_e2e.sh   # 67 endpoint/memory/bundle/approval checks

All E2E scripts share tests/e2e/_lib.sh helpers and are shellcheck-clean (enforced in CI). See ./testing-e2e.md for auth prerequisites and what CI runs.

CI Pipeline

GitHub Actions runs automatically on push to main and on PRs (.github/workflows/ci.yml):

  • platform-build: Go build, vet, and go test -race with coverage profiling (25% baseline threshold; setup-go uses module cache)
  • canvas-build: npm build, vitest run (no --passWithNoTests; tests must exist and pass)
  • mcp-server-build: npm build
  • python-lint: pytest --cov=. --cov-report=term-missing (pytest-cov enabled)
  • e2e-api (added 2026-04-13): Postgres + Redis service containers, migrations applied via docker exec, tests/e2e/test_api.sh must pass 62/62
  • shellcheck (added 2026-04-13): lints every tests/e2e/*.sh

Postgres and Redis are not exposed to the host. For direct access, use docker compose exec postgres psql or docker compose exec redis redis-cli (psql may need -U dev -d molecule to match the DATABASE_URL under Environment Variables).

Utility Scripts

| Script | Purpose |
| --- | --- |
| infra/scripts/setup.sh | Initialize the local environment |
| infra/scripts/nuke.sh | Tear down and clean up everything |
| bundle-compile.sh | Compile workspace config folders into .bundle.json files |
| test_api.sh | Run 62 platform API integration tests |
| test_a2a_e2e.sh | Run 22 A2A end-to-end tests |
| test_activity_e2e.sh | Run 25 activity/task E2E tests |
| setup-org.sh | Create default 15-agent org hierarchy (PM + Marketing/Research/Dev teams, all Claude Code) |