feat(workspace-server): local-dev provisioner builds from Gitea source (#70)

Hongming-locked Option C: MOLECULE_IMAGE_REGISTRY presence as mode marker. ADR-002 captures rationale. 30 new tests + 64 existing preserved. Hostile-review weakest 3 filed as #204/#205/#206 follow-ups. Closes #63 (Task #194). Approved by security-auditor.
claude-ceo-assistant 2026-05-07 23:37:56 +00:00
commit d84d88ad70
7 changed files with 1585 additions and 0 deletions


@ -0,0 +1,74 @@
# ADR-002: Local-build mode signalled by `MOLECULE_IMAGE_REGISTRY` presence
* Status: Accepted (2026-05-07)
* Issue: #63 (closes Task #194)
* Decision: Hongming (CTO) + Claude Opus 4.7 (implementation)
## Context
Pre-2026-05-06, every Molecule deployment — both production tenants and OSS contributor laptops — pulled workspace-template-* container images from `ghcr.io/molecule-ai/`. Production tenants additionally set `MOLECULE_IMAGE_REGISTRY` to an AWS ECR mirror via Railway env / EC2 user-data, but the OSS default was the upstream GHCR org.
On 2026-05-06 the `Molecule-AI` GitHub org was suspended (saved memory: `feedback_github_botring_fingerprint`). GHCR now returns **403 Forbidden** for every `molecule-ai/workspace-template-*` manifest. OSS contributors who clone `molecule-core` and run `go run ./workspace-server/cmd/server` cannot provision a workspace — every first provision fails with:
```
docker image "ghcr.io/molecule-ai/workspace-template-claude-code:latest" not found after pull attempt
```
Production tenants are unaffected (their `MOLECULE_IMAGE_REGISTRY` points at ECR, which we still control), but OSS onboarding is broken. Workspace template repos are intentionally separate from `molecule-core` (each runtime is OSS-shape and forkable), and they are mirrored to Gitea (`https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<runtime>`) — but the provisioner has no path that consumes Gitea source directly.
## Decision
When `MOLECULE_IMAGE_REGISTRY` is **unset** (or empty), the provisioner switches to a **local-build mode** that:
1. Looks up the workspace-template repo's HEAD sha on Gitea via a single API call.
2. Checks whether a SHA-pinned local image (`molecule-local/workspace-template-<runtime>:<sha12>`) already exists; if so, reuses it.
3. Otherwise shallow-clones the repo into `~/.cache/molecule/workspace-template-build/<runtime>/<sha12>/` and runs `docker build --platform=linux/amd64 -t <tag> .`.
4. Hands the SHA-pinned tag to Docker for ContainerCreate, bypassing the registry-pull path entirely.
When `MOLECULE_IMAGE_REGISTRY` is **set**, behavior is unchanged: pull the image from that registry. Existing prod tenants and self-hosters who mirror to a private registry are not affected.
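The mode switch and the SHA-pinned tag described above can be sketched as follows. This is a minimal illustration, not the provisioner's code: `buildMode`, `resolveMode`, and `localTag` are hypothetical names, and the sketch assumes that an empty value is treated the same as an unset one, as the decision states.

```go
package main

import (
	"fmt"
	"os"
)

// buildMode is a hypothetical enum used only for this sketch.
type buildMode int

const (
	modeRegistryPull buildMode = iota // MOLECULE_IMAGE_REGISTRY is set
	modeLocalBuild                    // MOLECULE_IMAGE_REGISTRY is unset or empty
)

// resolveMode treats an unset or empty registry value as the local-build marker.
func resolveMode(registry string) buildMode {
	if registry == "" {
		return modeLocalBuild
	}
	return modeRegistryPull
}

// localTag formats molecule-local/workspace-template-<runtime>:<sha12>,
// truncating the commit sha to its first 12 characters.
func localTag(runtime, sha string) string {
	if len(sha) > 12 {
		sha = sha[:12]
	}
	return fmt.Sprintf("molecule-local/workspace-template-%s:%s", runtime, sha)
}

func main() {
	// os.Getenv returns "" for both unset and empty, which matches the rule above.
	if resolveMode(os.Getenv("MOLECULE_IMAGE_REGISTRY")) == modeLocalBuild {
		fmt.Println("local-build mode, image:", localTag("claude-code", "abcdef0123456789abcdef0123456789abcdef01"))
		return
	}
	fmt.Println("registry-pull mode")
}
```

Because `os.Getenv` cannot distinguish unset from empty, a single string comparison is enough to implement the marker in this sketch.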
## Consequences
### Positive
* **Zero-config OSS onboarding**: `git clone molecule-core && go run ./workspace-server/cmd/server` boots end-to-end without any registry credentials.
* **Production tenants protected** — same env var, same semantics in SaaS-mode. Migration is a no-op.
* **No new env var** — extending an existing var's semantics ("where to pull, OR build locally if absent") rather than introducing `MOLECULE_LOCAL_BUILD=1` keeps the surface small.
* **SHA-pinned cache** — repeat builds are O(API-call); only template-repo HEAD changes invalidate.
* **Production-parity image** — amd64 emulation on Apple Silicon honours `feedback_local_must_mimic_production`. The provisioner's existing `defaultImagePlatform()` already forces amd64 for parity; building amd64 locally lets that decision stay consistent.
### Negative
* **Conflates two concerns**: `MOLECULE_IMAGE_REGISTRY` now signals BOTH "where to pull" AND "build locally if absent." A future operator who unsets it expecting a hard error will instead get a slow first-provision. Documented in the runbook.
* **First-provision is slow on Apple Silicon**: 5–10 min via QEMU emulation on the cold path. Mitigated by SHA-cache (subsequent runs are <1s lookup + 0s build).
* **Coverage gap** — only 4 of 9 runtimes are mirrored to Gitea today (`claude-code`, `hermes`, `langgraph`, `autogen`). The other 5 fail with an actionable "not mirrored" error. Mirroring those repos is a separate task.
* **Implicit trust boundary** — operator running `go run` implicitly trusts `molecule-ai/molecule-ai-workspace-template-*` repos on Gitea. This is the same trust they would extend to the GHCR images today; not a new attack surface.
## Alternatives considered
1. **New env var `MOLECULE_LOCAL_BUILD=1`** — explicit, but requires OSS contributors to know it exists. Violates the zero-config goal.
2. **Push pre-built images to a Gitea container registry, mirror tag from upstream** — operationally cleaner but: (a) Gitea's container-registry add-on isn't deployed on the operator host, (b) defeats the OSS-contributor goal of "hack on the source, see your changes," since they'd still pull a stale image.
3. **Embed Dockerfiles in molecule-core itself, drop the standalone template repos** — would work but breaks the OSS-shape principle; templates are intentionally separable, anyone-can-fork artifacts.
4. **Build native arch on Apple Silicon (arm64) and drop the platform pin in local-mode** — fast, but creates `linux/arm64` images that diverge from the amd64-only prod runtime. Local-vs-prod debug behavior would diverge. Rejected per `feedback_local_must_mimic_production`.
## Security review
* **Gitea repo URL allowlist** — runtime name must be in the `knownRuntimes` allowlist (defence-in-depth against a future code path that lets cfg.Runtime carry untrusted input). Repo prefix is hardcoded to `https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-`; forks can override via `MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX` (opt-in, default off).
* **Token handling** — clones are anonymous over HTTPS by default (templates are public). `MOLECULE_GITEA_TOKEN`, if set, is passed via URL userinfo for the clone and as `Authorization: token` for the API call. The token is **masked in every log line** via `maskTokenInURL` / `maskTokenInString` and never appears in the cache dir path.
* **No silent fallback** — if Gitea is unreachable or the runtime isn't mirrored, we return a clear error mentioning the repo URL and the missing runtime. We **never** fall back to GHCR/ECR (that would be a confusing bug for an OSS contributor who happened to have stale ECR creds in their docker config).
* **Build-arg injection**: `docker build` is invoked with NO `--build-arg` from external input. Dockerfile is consumed as-is.
* **Cache poisoning**: the cache key is the Gitea HEAD sha (its first 12 characters pin the image tag). Dockerfile content is not hashed separately today, since any Dockerfile change lands as a new commit. A force-push to the template repo's main branch changes the key on the next run. Cache dir is per-user (`$HOME/.cache`), so cross-user attacks aren't relevant in single-user dev mode.
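The allowlist and log-redaction points above can be illustrated with a short sketch. The names `allowedRuntimes`, `templateRepoURL`, and `redact` are hypothetical, the allowlist contents are assumed from the nine runtime names mentioned in the local-development doc, and `redact` simply drops the userinfo rather than reproducing the provisioner's exact masking format.

```go
package main

import (
	"fmt"
	"net/url"
)

// allowedRuntimes is an illustrative allowlist (assumed contents).
var allowedRuntimes = map[string]bool{
	"claude-code": true, "hermes": true, "langgraph": true, "autogen": true,
	"crewai": true, "deepagents": true, "codex": true, "gemini-cli": true, "openclaw": true,
}

const repoPrefix = "https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-"

// templateRepoURL refuses any runtime outside the allowlist, so untrusted
// input such as "../other-repo" can never be appended to the prefix.
func templateRepoURL(runtime string) (string, error) {
	if !allowedRuntimes[runtime] {
		return "", fmt.Errorf("refusing unknown runtime %q", runtime)
	}
	return repoPrefix + runtime, nil
}

// redact removes userinfo (for example oauth2:<token>@) from a URL before
// it is logged. Unparseable input is returned unchanged.
func redact(raw string) string {
	u, err := url.Parse(raw)
	if err != nil || u.User == nil {
		return raw
	}
	u.User = nil
	return u.String()
}

func main() {
	if _, err := templateRepoURL("../other-repo"); err != nil {
		fmt.Println("rejected:", err)
	}
	repo, _ := templateRepoURL("hermes")
	fmt.Println("clone from:", redact(repo))
}
```

Validating the runtime before composing the URL keeps the check in one place, which is the defence-in-depth property the first bullet describes.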
## Versioning + back-compat
* Existing prod tenants set `MOLECULE_IMAGE_REGISTRY=<ECR url>` → unchanged behavior.
* Existing local installs that set the var → unchanged behavior.
* Existing local installs that don't set it → switch to local-build path. Migration: none required (additive); first provision will take 5–10 min instead of failing.
* No deprecations.
## References
* Issue #63 — feat(workspace-server): local-dev provisioner builds from Gitea source
* Saved memory `feedback_local_must_mimic_production` — local docker must mimic prod, no bypasses
* Saved memory `reference_post_suspension_pipeline` — full post-2026-05-06 stack shape
* Saved memory `feedback_github_botring_fingerprint` — what got the org suspended


@ -1,5 +1,41 @@
# Local Development
## Workspace Template Images: Local-Build Mode (Issue #63)
OSS contributors who run `molecule-core` locally do **not** need to authenticate to GHCR or AWS ECR. When the `MOLECULE_IMAGE_REGISTRY` env var is **unset**, the platform automatically:
1. Looks up the HEAD sha of `https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<runtime>` (single API call, no clone).
2. If a local image tagged `molecule-local/workspace-template-<runtime>:<sha12>` already exists, reuses it (cache hit).
3. Otherwise, shallow-clones the repo into `~/.cache/molecule/workspace-template-build/<runtime>/<sha12>/` and runs `docker build --platform=linux/amd64 -t <tag> .`.
4. Hands the SHA-pinned tag to Docker for `ContainerCreate`.
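Step 1 can be sketched as a small decoder for the branch lookup response. This is an illustration under assumptions: it expects a JSON body with a `commit.id` field, uses `encoding/json` for brevity (the provisioner may parse the body differently), and the `branchResponse` and `headSHA` names are hypothetical.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// branchResponse models only the field this sketch needs from the
// branch lookup response.
type branchResponse struct {
	Commit struct {
		ID string `json:"id"`
	} `json:"commit"`
}

// headSHA extracts commit.id and rejects values too short to yield the
// 12-character tag suffix.
func headSHA(body []byte) (string, error) {
	var br branchResponse
	if err := json.Unmarshal(body, &br); err != nil {
		return "", fmt.Errorf("decode branch response: %w", err)
	}
	if len(br.Commit.ID) < 12 {
		return "", errors.New("branch response has no usable commit.id")
	}
	return br.Commit.ID, nil
}

func main() {
	// Canned body standing in for the API response; no network call is made.
	body := []byte(`{"name":"main","commit":{"id":"abcdef0123456789abcdef0123456789abcdef01"}}`)
	sha, err := headSHA(body)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("sha12:", sha[:12])
}
```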
**First-provision build time:** 5–10 min on Apple Silicon (amd64 emulation). Subsequent provisions hit the cache and start in seconds. Cache is invalidated automatically when the template repo's HEAD moves.
**Currently mirrored on Gitea:** `claude-code`, `hermes`, `langgraph`, `autogen`. Other runtimes (`crewai`, `deepagents`, `codex`, `gemini-cli`, `openclaw`) fail with an actionable "not mirrored to Gitea" error pointing at the missing repo.
**Production tenants are unaffected** — every prod tenant sets `MOLECULE_IMAGE_REGISTRY` to its private ECR mirror via Railway env / EC2 user-data, so the SaaS pull path stays identical.
### Environment overrides
| Var | Default | Use case |
|-----|---------|----------|
| `MOLECULE_IMAGE_REGISTRY` | (unset) | Set to a real registry URL to switch from local-build to SaaS-pull mode. |
| `MOLECULE_LOCAL_BUILD_CACHE` | `~/.cache/molecule/workspace-template-build` | Override cache directory. |
| `MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX` | `https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-` | Point at a fork. |
| `MOLECULE_GITEA_TOKEN` | (unset) | Required only if your fork has private template repos. |
### Verifying a switch from the GHCR-retag stopgap
Pre-fix, OSS contributors worked around the suspended GHCR org by manually retagging a `:latest` image. After this change, that workaround is **redundant**: simply unset `MOLECULE_IMAGE_REGISTRY` (or leave it unset), boot the platform, and provision a workspace. Logs will show:
```
Provisioner: local-build mode → using locally-built image molecule-local/workspace-template-claude-code:<sha12> for runtime claude-code
local-build: cloning https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-claude-code → ...
local-build: docker build done in <duration>
```
If you still see `ghcr.io/molecule-ai/...` in the boot log, double-check `env | grep MOLECULE_IMAGE_REGISTRY` — a stale shell export from the pre-fix workaround could keep SaaS-mode active.
## Starting the Stack
```bash ```bash


@ -0,0 +1,545 @@
package provisioner
import (
"context"
"crypto/sha256"
"encoding/hex"
"errors"
"fmt"
"io"
"log"
"net/http"
"net/url"
"os"
"os/exec"
"path/filepath"
"strings"
"sync"
"time"
)
// Local-build mode: clone the workspace-template-<runtime> repo from Gitea
// and `docker build` it on the host so OSS contributors can run molecule-core
// end-to-end without authenticating to (or being able to reach) GHCR/ECR.
//
// The flow:
//
// 1. ensureLocalImage(runtime) is called by the provisioner before
// ContainerCreate, but only when Resolve().Mode == RegistryModeLocal.
// 2. We compute a cache key from the Gitea repo's HEAD sha (one HTTP
// call to https://git.moleculesai.app/api/v1/repos/.../branches/main).
// 3. If `molecule-local/workspace-template-<runtime>:<sha12>` already
// exists in the local Docker image store, we return immediately.
// 4. Otherwise: shallow git-clone the repo into the cache dir, then
// `docker build --platform=linux/amd64 -t <tag>` on it. We
// also tag `:latest` so `docker images` shows a friendly entry.
//
// Why amd64 emulation: the provisioner's defaultImagePlatform() forces
// linux/amd64 on Apple Silicon for parity with the (amd64-only) prod
// images. Building native arm64 in local-mode would diverge — see the
// design rationale in Issue #63 and the saved memory
// `feedback_local_must_mimic_production`.
//
// Auth: clone is anonymous (templates are public). If MOLECULE_GITEA_TOKEN
// is set, we use it via the URL's userinfo — the token is masked in
// every log line by maskTokenInURL().
//
// Failure mode: fail-closed. If Gitea is unreachable we surface a clear
// error message including the repo URL; we NEVER fall back to GHCR/ECR
// silently (would be a confusing bug for an OSS contributor who
// happens to have stale ECR creds in their docker config).
// gitTemplateRepoPrefix is the prefix all workspace-template repos live
// under on Gitea. Hardcoded so an attacker who controlled cfg.Runtime
// (defence-in-depth — today the field is platform-validated upstream)
// can only ever reach a repo under molecule-ai/.
//
// Operators who want to point local-build at a fork can override the
// full prefix via MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX (e.g.
// `https://git.example.com/myorg/molecule-ai-workspace-template-`).
// Default-off; opt-in only.
const gitTemplateRepoPrefix = "https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-"
// localBuildLockMap serializes concurrent ensureLocalImage calls per
// runtime so two workspace creates that hit the cold path together don't
// race on `docker build` (Docker's daemon would serialize anyway, but
// the duplicate clone + log spam are confusing). Lock granularity is
// per-runtime, so different runtimes still build in parallel.
var (
localBuildLockMap = make(map[string]*sync.Mutex)
localBuildLockMapMu sync.Mutex
)
func runtimeBuildLock(runtime string) *sync.Mutex {
localBuildLockMapMu.Lock()
defer localBuildLockMapMu.Unlock()
if m, ok := localBuildLockMap[runtime]; ok {
return m
}
m := &sync.Mutex{}
localBuildLockMap[runtime] = m
return m
}
// LocalBuildOptions controls the local-build path. Exposed so tests can
// inject fakes without standing up a real git+docker chain. Production
// uses zero-value defaults via newDefaultLocalBuildOptions().
type LocalBuildOptions struct {
// CacheDir is the host filesystem location where cloned template
// repos are kept between builds. Empty = use $XDG_CACHE_HOME or
// $HOME/.cache. Override via env var MOLECULE_LOCAL_BUILD_CACHE.
CacheDir string
// RepoPrefix is the URL prefix all template repos hang off. Empty
// = use gitTemplateRepoPrefix. Override via env var
// MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX.
RepoPrefix string
// Token, if non-empty, is sent via URL userinfo to Gitea. Default
// empty (templates are public). Override via env var
// MOLECULE_GITEA_TOKEN.
Token string
// Platform is the `docker build --platform` value. Empty = host default;
// today we always pass linux/amd64 because the provisioner only
// runs amd64 images. Exposed so tests can override.
Platform string
// HTTPClient is used for the Gitea-API HEAD-sha lookup. Nil = a new
// http.Client with a 30s timeout.
HTTPClient *http.Client
// remoteHeadSha + dockerBuild + gitClone are seams for tests; if
// nil, the production implementations are used.
remoteHeadSha func(ctx context.Context, opts *LocalBuildOptions, runtime string) (string, error)
gitClone func(ctx context.Context, opts *LocalBuildOptions, runtime, dest string) error
dockerBuild func(ctx context.Context, opts *LocalBuildOptions, contextDir, tag string) error
dockerHasTag func(ctx context.Context, tag string) (bool, error)
dockerTag func(ctx context.Context, src, dst string) error
}
func newDefaultLocalBuildOptions() *LocalBuildOptions {
o := &LocalBuildOptions{
CacheDir: os.Getenv("MOLECULE_LOCAL_BUILD_CACHE"),
RepoPrefix: os.Getenv("MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX"),
Token: os.Getenv("MOLECULE_GITEA_TOKEN"),
Platform: "linux/amd64",
}
if o.CacheDir == "" {
if xdg := os.Getenv("XDG_CACHE_HOME"); xdg != "" {
o.CacheDir = filepath.Join(xdg, "molecule", "workspace-template-build")
} else if home, err := os.UserHomeDir(); err == nil {
o.CacheDir = filepath.Join(home, ".cache", "molecule", "workspace-template-build")
} else {
// Last-resort fallback: /tmp. Loses the cache between reboots
// but at least lets the path produce builds.
o.CacheDir = filepath.Join(os.TempDir(), "molecule", "workspace-template-build")
}
}
if o.RepoPrefix == "" {
o.RepoPrefix = gitTemplateRepoPrefix
}
o.HTTPClient = &http.Client{Timeout: 30 * time.Second}
return o
}
// LocalImageTag formats the SHA-pinned tag for a runtime. Exported for
// tests + the provisioner's image-resolution branch.
func LocalImageTag(runtime, sha string) string {
short := sha
if len(short) > 12 {
short = short[:12]
}
return fmt.Sprintf("%s/workspace-template-%s:%s", localImagePrefix, runtime, short)
}
// LocalImageLatestTag returns the floating `:latest` form. Used as a
// human-readable alias and as the value RuntimeImage() returns in
// local-mode.
func LocalImageLatestTag(runtime string) string {
return fmt.Sprintf("%s/workspace-template-%s:latest", localImagePrefix, runtime)
}
// EnsureLocalImage is the entry point the provisioner calls before
// ContainerCreate when Resolve().Mode == RegistryModeLocal. Returns the
// image tag (SHA-pinned form) the caller should hand to Docker, or an
// error if the build/clone fails.
//
// Concurrency: per-runtime lock; parallel calls for the same runtime
// share the build, parallel calls for different runtimes proceed.
//
// Idempotent: a cached SHA-pinned tag short-circuits the clone and the
// build. On the cache-hit path the Gitea HEAD lookup is the only network
// call; the only Docker calls are the image inspect and the best-effort
// `:latest` retag.
func EnsureLocalImage(ctx context.Context, runtime string) (string, error) {
return ensureLocalImageWithOpts(ctx, runtime, newDefaultLocalBuildOptions())
}
// ensureLocalImageHook is the seam Start() calls into. Production code
// uses EnsureLocalImage; tests substitute a fake to exercise the
// provisioner-Start integration without standing up a real
// git+docker chain. Single-process scoped — never reassigned in
// production code.
var ensureLocalImageHook = EnsureLocalImage
func ensureLocalImageWithOpts(ctx context.Context, runtime string, opts *LocalBuildOptions) (string, error) {
if !IsKnownRuntime(runtime) {
return "", fmt.Errorf("local-build: refusing to build unknown runtime %q (must be one of %v)", runtime, knownRuntimes)
}
lock := runtimeBuildLock(runtime)
lock.Lock()
defer lock.Unlock()
// 1. HEAD lookup → cache key.
headFn := opts.remoteHeadSha
if headFn == nil {
headFn = remoteHeadShaProd
}
sha, err := headFn(ctx, opts, runtime)
if err != nil {
// Fail-closed: do not fall back to GHCR/ECR. The whole point of
// local-build mode is that GHCR is unreachable.
return "", fmt.Errorf("local-build: cannot determine HEAD sha for runtime %q at %s: %w", runtime, repoURL(opts, runtime), err)
}
if len(sha) < 12 {
return "", fmt.Errorf("local-build: Gitea returned a short sha %q for runtime %q (expected ≥12 chars)", sha, runtime)
}
tag := LocalImageTag(runtime, sha)
latest := LocalImageLatestTag(runtime)
// 2. Cache hit?
hasFn := opts.dockerHasTag
if hasFn == nil {
hasFn = dockerHasTagProd
}
exists, hasErr := hasFn(ctx, tag)
if hasErr != nil {
log.Printf("local-build: image inspect for %s failed (%v); will rebuild", tag, hasErr)
}
if exists {
log.Printf("local-build: cache hit for %s (sha=%s) — skipping clone+build", tag, sha[:12])
// Refresh the floating :latest alias so admins inspecting `docker
// images` see the current sha. Best-effort.
tagFn := opts.dockerTag
if tagFn == nil {
tagFn = dockerTagProd
}
if tErr := tagFn(ctx, tag, latest); tErr != nil {
log.Printf("local-build: best-effort retag of %s → %s failed: %v", tag, latest, tErr)
}
return tag, nil
}
// 3. Cold path — clone + build.
dest := filepath.Join(opts.CacheDir, runtime, sha[:12])
if err := os.MkdirAll(filepath.Dir(dest), 0o755); err != nil {
return "", fmt.Errorf("local-build: prepare cache dir %q: %w", filepath.Dir(dest), err)
}
// Idempotent: if the dest exists from a previous failed run, wipe and
// re-clone so we don't build a partial tree.
if _, statErr := os.Stat(dest); statErr == nil {
if rmErr := os.RemoveAll(dest); rmErr != nil {
return "", fmt.Errorf("local-build: clean stale cache dir %q: %w", dest, rmErr)
}
}
cloneFn := opts.gitClone
if cloneFn == nil {
cloneFn = gitCloneProd
}
log.Printf("local-build: cloning %s → %s (sha=%s)", redactedRepoURL(opts, runtime), dest, sha[:12])
cloneStart := time.Now()
if err := cloneFn(ctx, opts, runtime, dest); err != nil {
// Best-effort cleanup so a half-cloned tree doesn't poison future runs.
_ = os.RemoveAll(dest)
return "", fmt.Errorf("local-build: clone %s: %w", redactedRepoURL(opts, runtime), err)
}
log.Printf("local-build: clone complete in %s", time.Since(cloneStart).Round(time.Millisecond))
// 4. Sanity-check the cloned tree contains a Dockerfile at the root.
dockerfile := filepath.Join(dest, "Dockerfile")
info, statErr := os.Stat(dockerfile)
if statErr != nil || info.IsDir() {
_ = os.RemoveAll(dest)
return "", fmt.Errorf("local-build: cloned tree at %s has no Dockerfile (template repo malformed)", dest)
}
// 5. Build.
buildFn := opts.dockerBuild
if buildFn == nil {
buildFn = dockerBuildProd
}
log.Printf("local-build: docker build start for %s (platform=%s, context=%s)", tag, opts.Platform, dest)
buildStart := time.Now()
if err := buildFn(ctx, opts, dest, tag); err != nil {
return "", fmt.Errorf("local-build: docker build %s: %w", tag, err)
}
log.Printf("local-build: docker build done for %s in %s", tag, time.Since(buildStart).Round(time.Second))
// Tag :latest as a friendly alias.
tagFn := opts.dockerTag
if tagFn == nil {
tagFn = dockerTagProd
}
if err := tagFn(ctx, tag, latest); err != nil {
log.Printf("local-build: best-effort retag of %s → %s failed: %v", tag, latest, err)
}
return tag, nil
}
// repoURL composes the full Gitea repo URL for the given runtime. The
// prefix is hardcoded by default; operators can override via env so a
// fork can point local-build at their own Gitea instance.
func repoURL(opts *LocalBuildOptions, runtime string) string {
return opts.RepoPrefix + runtime
}
// redactedRepoURL returns the same value with any embedded token replaced
// by "***". Use this for log lines.
func redactedRepoURL(opts *LocalBuildOptions, runtime string) string {
return maskTokenInURL(repoURL(opts, runtime))
}
// maskTokenInURL replaces userinfo (username:password@) in a URL with
// `***@` so log lines never echo a Gitea PAT. Returns the input as-is
// on parse failures (defence: never silently corrupt the visible URL).
//
// Implementation note: net/url's URL.User stringifier percent-encodes
// the username, so `u.User = url.User("***"); u.String()` would yield
// `https://%2A%2A%2A@host/...` — unhelpful for humans grepping logs.
// We drop the userinfo via URL.User=nil, get the canonical scheme-and-
// rest, and re-insert the literal `***@` between the scheme separator
// and the host.
func maskTokenInURL(s string) string {
u, err := url.Parse(s)
if err != nil || u.User == nil {
return s
}
u.User = nil
out := u.String()
prefix := u.Scheme + "://"
if !strings.HasPrefix(out, prefix) {
return s
}
return prefix + "***@" + out[len(prefix):]
}
// remoteHeadShaProd looks up the HEAD commit sha of branch `main` for
// the workspace-template-<runtime> repo on Gitea. We use the Gitea API
// (a single HTTPS call) rather than `git ls-remote` so we don't need a
// git binary just for the HEAD lookup — we still need git for the
// clone, but the cache-hit path stays git-free.
func remoteHeadShaProd(ctx context.Context, opts *LocalBuildOptions, runtime string) (string, error) {
// Convert a `git.example.com/org/prefix-` URL into the API form
// `git.example.com/api/v1/repos/org/prefix-<runtime>/branches/main`.
// Works for both git.moleculesai.app (default) and any forks that
// share the Gitea API shape.
apiURL, err := giteaBranchAPIURL(opts.RepoPrefix, runtime, "main")
if err != nil {
return "", err
}
req, err := http.NewRequestWithContext(ctx, "GET", apiURL, nil)
if err != nil {
return "", err
}
if opts.Token != "" {
// Gitea accepts "token <PAT>" in the Authorization header for
// API calls. Userinfo is also accepted but only matters for
// the HTTPS clone, not the JSON API.
req.Header.Set("Authorization", "token "+opts.Token)
}
cli := opts.HTTPClient
if cli == nil {
cli = &http.Client{Timeout: 30 * time.Second}
}
resp, err := cli.Do(req)
if err != nil {
return "", err
}
defer func() { _ = resp.Body.Close() }()
if resp.StatusCode == http.StatusNotFound {
return "", fmt.Errorf("repo not found at %s — runtime %q may not be mirrored to Gitea (only claude-code/hermes/langgraph/autogen today)", apiURL, runtime)
}
if resp.StatusCode == http.StatusUnauthorized || resp.StatusCode == http.StatusForbidden {
return "", fmt.Errorf("auth failure (%d) at %s — verify MOLECULE_GITEA_TOKEN if private repo", resp.StatusCode, apiURL)
}
if resp.StatusCode != http.StatusOK {
return "", fmt.Errorf("HEAD lookup at %s returned %d", apiURL, resp.StatusCode)
}
body, err := io.ReadAll(io.LimitReader(resp.Body, 64<<10))
if err != nil {
return "", fmt.Errorf("read HEAD response body: %w", err)
}
// Extract commit.id with a small substring scan; see
// parseGiteaBranchHeadSha for why encoding/json is not used.
return parseGiteaBranchHeadSha(body)
}
// giteaBranchAPIURL maps a repo-prefix URL like
// `https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-`
// + runtime "claude-code" + branch "main"
// to the API URL
// `https://git.moleculesai.app/api/v1/repos/molecule-ai/molecule-ai-workspace-template-claude-code/branches/main`.
func giteaBranchAPIURL(repoPrefix, runtime, branch string) (string, error) {
u, err := url.Parse(repoPrefix + runtime)
if err != nil {
return "", fmt.Errorf("parse repo URL %q: %w", repoPrefix+runtime, err)
}
parts := strings.TrimPrefix(u.Path, "/")
parts = strings.TrimSuffix(parts, "/")
if parts == "" {
return "", fmt.Errorf("repo URL %q has empty path", repoPrefix+runtime)
}
// Expect `<org>/<repo>` (single slash) — the prefix already includes
// org+partial-repo; runtime appends the rest.
if !strings.Contains(parts, "/") {
return "", fmt.Errorf("repo URL %q missing org/repo path", repoPrefix+runtime)
}
apiURL := url.URL{
Scheme: u.Scheme,
Host: u.Host,
Path: "/api/v1/repos/" + parts + "/branches/" + branch,
}
return apiURL.String(), nil
}
// parseGiteaBranchHeadSha extracts commit.id from the Gitea
// /branches/<name> response. We use a permissive substring scan so a
// missing-key in the JSON gives a clear error rather than the
// json.Decoder's somewhat opaque "missing field" message.
func parseGiteaBranchHeadSha(body []byte) (string, error) {
// Look for `"id":"<40-hex>"` inside the commit object.
idx := strings.Index(string(body), `"id":"`)
if idx < 0 {
return "", errors.New("Gitea branch response missing commit.id field")
}
rest := string(body[idx+len(`"id":"`):])
end := strings.IndexByte(rest, '"')
if end < 0 {
return "", errors.New("Gitea branch response has malformed commit.id (no closing quote)")
}
sha := rest[:end]
if len(sha) < 7 {
return "", fmt.Errorf("Gitea returned suspiciously short sha %q", sha)
}
return sha, nil
}
// gitCloneProd shallow-clones the runtime's template repo into dest.
//
// We invoke `git` rather than implementing the protocol ourselves —
// every host that runs the workspace-server already needs git available
// (it's a hard dep of go-mod for vendored repos) and the OSS contributor
// onboarding doc lists it as a prerequisite.
func gitCloneProd(ctx context.Context, opts *LocalBuildOptions, runtime, dest string) error {
cloneURL := repoURL(opts, runtime)
if opts.Token != "" {
// HTTPS clone with userinfo: https://oauth2:<token>@host/...
u, err := url.Parse(cloneURL)
if err == nil {
u.User = url.UserPassword("oauth2", opts.Token)
cloneURL = u.String()
}
// On parse failure we silently fall through to the public URL —
// better to attempt the anonymous clone than to refuse outright.
}
cmd := exec.CommandContext(ctx, "git", "clone", "--depth=1", "--branch=main", "--single-branch", cloneURL, dest)
// Drop git's askpass prompts so we fail-fast on auth errors instead
// of hanging waiting for an interactive password.
cmd.Env = append(os.Environ(), "GIT_TERMINAL_PROMPT=0", "GIT_ASKPASS=/bin/echo")
out, err := cmd.CombinedOutput()
if err != nil {
// Mask the token in any error string git emits via stderr — git
// occasionally echoes the URL verbatim on failure.
errMsg := maskTokenInString(string(out), opts.Token)
return fmt.Errorf("%w: %s", err, strings.TrimSpace(errMsg))
}
return nil
}
// maskTokenInString replaces literal occurrences of the token with `***`.
// Defence against git binary or docker echoing the URL into stderr.
func maskTokenInString(s, token string) string {
if token == "" {
return s
}
return strings.ReplaceAll(s, token, "***")
}
// dockerBuildProd invokes the docker CLI to build the workspace-template
// image. We shell out rather than use the Docker SDK's ImageBuild — the
// SDK requires hand-tarballing the build context, which adds a
// non-trivial code path with its own bug surface. The docker CLI is
// already a hard dep of the workspace-server (the provisioner needs the
// daemon), so requiring the CLI binary on PATH adds nothing.
//
// Uses the legacy `docker build` (not `docker buildx build`) because
// buildx isn't always installed by default on Linux distros and the
// legacy builder produces an image the local Docker daemon picks up
// automatically. We pass --platform=linux/amd64 directly; with Docker
// 20.10+ this works without buildx because the legacy builder
// auto-promotes to BuildKit when available, falling back to v1
// otherwise (still produces an amd64 image via QEMU).
func dockerBuildProd(ctx context.Context, opts *LocalBuildOptions, contextDir, tag string) error {
args := []string{"build"}
if opts.Platform != "" {
args = append(args, "--platform="+opts.Platform)
}
args = append(args,
"-t", tag,
"-f", filepath.Join(contextDir, "Dockerfile"),
contextDir,
)
cmd := exec.CommandContext(ctx, "docker", args...)
cmd.Env = append(os.Environ(), "DOCKER_BUILDKIT=1")
out, err := cmd.CombinedOutput()
if err != nil {
// Defensive sanitization: docker build output shouldn't contain a
// token, and maskTokenInString is a no-op when the token is empty.
return fmt.Errorf("%w: %s", err, strings.TrimSpace(maskTokenInString(string(out), opts.Token)))
}
return nil
}
// dockerHasTagProd returns true iff the given tag exists in the local
// image store. Used as the fast cache-hit check.
func dockerHasTagProd(ctx context.Context, tag string) (bool, error) {
cmd := exec.CommandContext(ctx, "docker", "image", "inspect", "--format={{.Id}}", tag)
out, err := cmd.CombinedOutput()
if err == nil {
return strings.TrimSpace(string(out)) != "", nil
}
// `docker image inspect` exits 1 with "Error: No such image" when
// missing — that's a definitive false, not an error condition.
low := strings.ToLower(string(out))
if strings.Contains(low, "no such image") || strings.Contains(low, "not found") {
return false, nil
}
return false, fmt.Errorf("%w: %s", err, strings.TrimSpace(string(out)))
}
// dockerTagProd creates an alias from src → dst. Used to refresh the
// floating `:latest` after a build or cache hit.
func dockerTagProd(ctx context.Context, src, dst string) error {
cmd := exec.CommandContext(ctx, "docker", "tag", src, dst)
out, err := cmd.CombinedOutput()
if err != nil {
return fmt.Errorf("%w: %s", err, strings.TrimSpace(string(out)))
}
return nil
}
// CacheKey is exposed for diagnostic logs / tests so the cache-key shape
// is documented in code rather than only as a string format.
//
// cache_key = hex(sha256(runtime + "|" + head_sha + "|" + repoPrefix)[:8])
//
// That is the first 8 bytes of the digest, i.e. 16 hex characters.
//
// Today only the SHA is consumed, but the helper is kept for future
// extensions (e.g. include Dockerfile-content-hash to invalidate when
// only the Dockerfile changes between two runs targeting the same SHA).
func CacheKey(runtime, sha, repoPrefix string) string {
h := sha256.Sum256([]byte(runtime + "|" + sha + "|" + repoPrefix))
return hex.EncodeToString(h[:8])
}
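
As a rough illustration of the key shape documented above, the standalone sketch below repeats the same hashing steps outside the provisioner package. The name `cacheKeySketch` and the sample inputs are assumptions made for the example; only the `|` separator and the 8-byte truncation are taken from `CacheKey`.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// cacheKeySketch mirrors the steps of CacheKey above: join the inputs
// with "|", SHA-256 the result, and keep the first 8 bytes of the digest.
// The function name is hypothetical and exists only for this example.
func cacheKeySketch(runtime, sha, repoPrefix string) string {
	h := sha256.Sum256([]byte(runtime + "|" + sha + "|" + repoPrefix))
	return hex.EncodeToString(h[:8]) // 8 bytes encode to 16 hex characters
}

func main() {
	a := cacheKeySketch("claude-code", "abc", "https://git/")
	b := cacheKeySketch("claude-code", "def", "https://git/")
	// Print the key length and whether a different sha changes the key.
	fmt.Println(len(a), a != b)
}
```

The truncation keeps the key short enough for log lines; whether 64 bits is enough for a given deployment is a judgment call this sketch does not try to settle.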


@ -0,0 +1,662 @@
package provisioner
import (
"context"
"errors"
"fmt"
"net/http"
"net/http/httptest"
"os"
"path/filepath"
"strings"
"sync"
"testing"
)
// makeTestOpts produces a LocalBuildOptions where every external seam
// (Gitea HEAD, git clone, docker build/has/tag) is replaced by a stub.
// Tests override the stub for the behavior they want to assert.
func makeTestOpts(t *testing.T) *LocalBuildOptions {
t.Helper()
tmp := t.TempDir()
return &LocalBuildOptions{
CacheDir: tmp,
RepoPrefix: "https://git.test/molecule-ai/molecule-ai-workspace-template-",
Platform: "linux/amd64",
HTTPClient: &http.Client{},
remoteHeadSha: func(ctx context.Context, opts *LocalBuildOptions, runtime string) (string, error) {
return "abcdef0123456789abcdef0123456789abcdef01", nil
},
gitClone: func(ctx context.Context, opts *LocalBuildOptions, runtime, dest string) error {
// Write a fake Dockerfile so the sanity-check passes.
if err := os.MkdirAll(dest, 0o755); err != nil {
return err
}
return os.WriteFile(filepath.Join(dest, "Dockerfile"), []byte("FROM scratch\n"), 0o644)
},
dockerBuild: func(ctx context.Context, opts *LocalBuildOptions, contextDir, tag string) error {
return nil
},
dockerHasTag: func(ctx context.Context, tag string) (bool, error) {
return false, nil
},
dockerTag: func(ctx context.Context, src, dst string) error {
return nil
},
}
}
// TestEnsureLocalImage_Success — happy path: HEAD lookup succeeds, no
// cache hit, clone + build run, returned tag is SHA-pinned.
func TestEnsureLocalImage_Success(t *testing.T) {
opts := makeTestOpts(t)
tag, err := ensureLocalImageWithOpts(context.Background(), "claude-code", opts)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
want := "molecule-local/workspace-template-claude-code:abcdef012345"
if tag != want {
t.Errorf("tag = %q, want %q", tag, want)
}
}
// TestEnsureLocalImage_CacheHit — second call with a cached image must
// skip clone + build entirely.
func TestEnsureLocalImage_CacheHit(t *testing.T) {
opts := makeTestOpts(t)
var cloneCount, buildCount int
opts.gitClone = func(ctx context.Context, opts *LocalBuildOptions, runtime, dest string) error {
cloneCount++
return os.WriteFile(filepath.Join(dest, "Dockerfile"), []byte("FROM scratch\n"), 0o644)
}
opts.dockerBuild = func(ctx context.Context, opts *LocalBuildOptions, contextDir, tag string) error {
buildCount++
return nil
}
opts.dockerHasTag = func(ctx context.Context, tag string) (bool, error) {
return true, nil // cached
}
if _, err := ensureLocalImageWithOpts(context.Background(), "hermes", opts); err != nil {
t.Fatalf("unexpected error: %v", err)
}
if cloneCount != 0 {
t.Errorf("cache hit triggered %d clones, want 0", cloneCount)
}
if buildCount != 0 {
t.Errorf("cache hit triggered %d builds, want 0", buildCount)
}
}
// TestEnsureLocalImage_UnknownRuntime — the allowlist guard rejects
// arbitrary runtime names before any network or filesystem call.
func TestEnsureLocalImage_UnknownRuntime(t *testing.T) {
opts := makeTestOpts(t)
for _, bad := range []string{
"", "unknown", "../../../etc/passwd", "claude-code; rm -rf /",
} {
t.Run(bad, func(t *testing.T) {
_, err := ensureLocalImageWithOpts(context.Background(), bad, opts)
if err == nil {
t.Errorf("EnsureLocalImage(%q) should fail (not a known runtime)", bad)
}
if err != nil && !strings.Contains(err.Error(), "unknown runtime") {
t.Errorf("error = %v, want one mentioning %q", err, "unknown runtime")
}
})
}
}
// TestEnsureLocalImage_GiteaUnreachable — fail-closed when the HEAD
// lookup fails. Must NOT fall back to GHCR/ECR.
func TestEnsureLocalImage_GiteaUnreachable(t *testing.T) {
opts := makeTestOpts(t)
opts.remoteHeadSha = func(ctx context.Context, opts *LocalBuildOptions, runtime string) (string, error) {
return "", errors.New("dial tcp: no such host")
}
_, err := ensureLocalImageWithOpts(context.Background(), "langgraph", opts)
if err == nil {
t.Fatalf("expected error, got nil")
}
if !strings.Contains(err.Error(), "cannot determine HEAD sha") {
t.Errorf("error = %v, want one mentioning HEAD sha lookup", err)
}
// Critical: error must NOT mention ghcr or ecr (no silent fallback).
low := strings.ToLower(err.Error())
if strings.Contains(low, "ghcr") || strings.Contains(low, "ecr") {
t.Errorf("error message %q must not mention ghcr/ecr (no silent fallback)", err.Error())
}
}
// TestEnsureLocalImage_RepoNotFound — Gitea returned 404. Must surface
// a runtime-naming error so the OSS contributor can file the right
// mirroring task.
func TestEnsureLocalImage_RepoNotFound(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusNotFound)
_, _ = w.Write([]byte(`{"message":"repo not found"}`))
}))
defer srv.Close()
opts := makeTestOpts(t)
opts.RepoPrefix = srv.URL + "/molecule-ai/molecule-ai-workspace-template-"
opts.HTTPClient = srv.Client()
opts.remoteHeadSha = nil // exercise real HTTP path
_, err := ensureLocalImageWithOpts(context.Background(), "crewai", opts)
if err == nil {
t.Fatalf("expected error, got nil")
}
if !strings.Contains(err.Error(), "not mirrored") && !strings.Contains(err.Error(), "not found") {
t.Errorf("error = %v, want a missing-repo message", err)
}
}
// TestEnsureLocalImage_AuthFailure — Gitea returned 401/403. Must
// produce an actionable error (mentions the token env var so an OSS
// contributor knows what to set).
func TestEnsureLocalImage_AuthFailure(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.WriteHeader(http.StatusUnauthorized)
}))
defer srv.Close()
opts := makeTestOpts(t)
opts.RepoPrefix = srv.URL + "/molecule-ai/molecule-ai-workspace-template-"
opts.HTTPClient = srv.Client()
opts.remoteHeadSha = nil
_, err := ensureLocalImageWithOpts(context.Background(), "claude-code", opts)
if err == nil {
t.Fatalf("expected error, got nil")
}
if !strings.Contains(err.Error(), "MOLECULE_GITEA_TOKEN") {
t.Errorf("error = %v, want one mentioning MOLECULE_GITEA_TOKEN", err)
}
}
// TestEnsureLocalImage_HeadShaWithRealJSON — exercise the JSON parser
// against a Gitea-shaped response to catch parse drift.
func TestEnsureLocalImage_HeadShaWithRealJSON(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
// Real Gitea response shape (truncated for relevance).
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{
"name":"main",
"commit":{
"id":"3c849b3ba778abcdef0123456789abcdef012345",
"message":"feat: stuff"
}
}`))
}))
defer srv.Close()
opts := makeTestOpts(t)
opts.RepoPrefix = srv.URL + "/molecule-ai/molecule-ai-workspace-template-"
opts.HTTPClient = srv.Client()
opts.remoteHeadSha = nil // exercise real HTTP path
tag, err := ensureLocalImageWithOpts(context.Background(), "claude-code", opts)
if err != nil {
t.Fatalf("unexpected error: %v", err)
}
if !strings.Contains(tag, "3c849b3ba778") {
t.Errorf("tag = %q, want one containing the parsed sha", tag)
}
}
// TestEnsureLocalImage_BuildFailure — surfaces docker-build errors with
// the build context so an operator can debug locally.
func TestEnsureLocalImage_BuildFailure(t *testing.T) {
opts := makeTestOpts(t)
opts.dockerBuild = func(ctx context.Context, opts *LocalBuildOptions, contextDir, tag string) error {
return errors.New("Dockerfile syntax error")
}
_, err := ensureLocalImageWithOpts(context.Background(), "autogen", opts)
if err == nil {
t.Fatalf("expected error, got nil")
}
if !strings.Contains(err.Error(), "docker build") {
t.Errorf("error = %v, want one mentioning docker build", err)
}
}
// TestEnsureLocalImage_MissingDockerfile — the cloned tree must contain
// a Dockerfile at root; absence is a malformed-template-repo error.
func TestEnsureLocalImage_MissingDockerfile(t *testing.T) {
opts := makeTestOpts(t)
opts.gitClone = func(ctx context.Context, opts *LocalBuildOptions, runtime, dest string) error {
// Empty dir, no Dockerfile.
return os.MkdirAll(dest, 0o755)
}
_, err := ensureLocalImageWithOpts(context.Background(), "hermes", opts)
if err == nil {
t.Fatalf("expected error, got nil")
}
if !strings.Contains(err.Error(), "no Dockerfile") {
t.Errorf("error = %v, want one mentioning missing Dockerfile", err)
}
}
// TestEnsureLocalImage_ConcurrentSameRuntime — two goroutines hitting
// the same runtime serialize via the per-runtime lock; the build runs
// once.
func TestEnsureLocalImage_ConcurrentSameRuntime(t *testing.T) {
opts := makeTestOpts(t)
var (
buildCount int
buildMu sync.Mutex
)
opts.dockerHasTag = func(ctx context.Context, tag string) (bool, error) {
// First call: cache miss. Second call (after first build): hit.
buildMu.Lock()
defer buildMu.Unlock()
return buildCount > 0, nil
}
opts.dockerBuild = func(ctx context.Context, opts *LocalBuildOptions, contextDir, tag string) error {
buildMu.Lock()
buildCount++
buildMu.Unlock()
return nil
}
const N = 5
var wg sync.WaitGroup
wg.Add(N)
for i := 0; i < N; i++ {
go func() {
defer wg.Done()
_, _ = ensureLocalImageWithOpts(context.Background(), "langgraph", opts)
}()
}
wg.Wait()
if buildCount != 1 {
t.Errorf("buildCount = %d, want 1 (lock should serialize concurrent calls)", buildCount)
}
}
// TestMaskTokenInURL — Gitea PATs in URLs must NEVER appear in logs.
func TestMaskTokenInURL(t *testing.T) {
cases := []struct {
in string
want string
}{
{"https://oauth2:secret123@git.example.com/foo/bar", "https://***@git.example.com/foo/bar"},
{"https://user:tok@host/path", "https://***@host/path"},
{"https://no-userinfo.example.com/path", "https://no-userinfo.example.com/path"},
{"not a url", "not a url"},
{"", ""},
}
for _, tc := range cases {
t.Run(tc.in, func(t *testing.T) {
got := maskTokenInURL(tc.in)
if got != tc.want {
t.Errorf("maskTokenInURL(%q) = %q, want %q", tc.in, got, tc.want)
}
})
}
}
// TestMaskTokenInString — defence against git/docker echoing the token
// into stderr on failure.
func TestMaskTokenInString(t *testing.T) {
got := maskTokenInString("error: clone https://oauth2:abc123@git.test/foo: failed", "abc123")
if strings.Contains(got, "abc123") {
t.Errorf("masked string %q still contains the token", got)
}
if !strings.Contains(got, "***") {
t.Errorf("masked string %q should have *** in place of token", got)
}
// No-op when token is empty.
if got := maskTokenInString("hello world", ""); got != "hello world" {
t.Errorf("empty token must not modify string, got %q", got)
}
}
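
The empty-token case asserted above matters more than it may look. The sketch below is an assumption about one plausible implementation (the real `maskTokenInString` is not shown in this section); `maskSketch` is a hypothetical name. It shows why an explicit guard is needed when the masking is built on `strings.ReplaceAll`.

```go
package main

import (
	"fmt"
	"strings"
)

// maskSketch replaces every occurrence of token in s with "***".
// The guard is required: strings.ReplaceAll with an empty `old`
// matches before every rune and at the end of the string, which
// would scatter "***" through the whole message.
func maskSketch(s, token string) string {
	if token == "" {
		return s
	}
	return strings.ReplaceAll(s, token, "***")
}

func main() {
	fmt.Println(maskSketch("error: clone https://oauth2:abc123@git.test/foo: failed", "abc123"))
	fmt.Println(maskSketch("hello world", ""))
}
```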
// TestGiteaBranchAPIURL — the URL composer must produce the canonical
// /api/v1/repos/<org>/<repo>/branches/<branch> shape.
func TestGiteaBranchAPIURL(t *testing.T) {
cases := []struct {
prefix, runtime, branch, want string
}{
{
"https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-",
"claude-code",
"main",
"https://git.moleculesai.app/api/v1/repos/molecule-ai/molecule-ai-workspace-template-claude-code/branches/main",
},
{
"http://localhost:3000/myorg/template-",
"foo",
"main",
"http://localhost:3000/api/v1/repos/myorg/template-foo/branches/main",
},
}
for _, tc := range cases {
t.Run(tc.runtime, func(t *testing.T) {
got, err := giteaBranchAPIURL(tc.prefix, tc.runtime, tc.branch)
if err != nil {
t.Fatalf("err = %v", err)
}
if got != tc.want {
t.Errorf("got %q, want %q", got, tc.want)
}
})
}
}
// TestGiteaBranchAPIURL_RejectsMalformed — malformed prefixes (no org
// path) produce an error rather than a malformed API call.
func TestGiteaBranchAPIURL_RejectsMalformed(t *testing.T) {
for _, bad := range []string{
"https://example.com/", // no path component
"://broken",
} {
t.Run(bad, func(t *testing.T) {
if _, err := giteaBranchAPIURL(bad, "claude-code", "main"); err == nil {
t.Errorf("expected error for malformed prefix %q", bad)
}
})
}
}
// TestParseGiteaBranchHeadSha — pin the parser against representative
// Gitea responses so a future Gitea API rev that adds fields doesn't
// silently break detection.
func TestParseGiteaBranchHeadSha(t *testing.T) {
good := []byte(`{"name":"main","commit":{"id":"abc123def456","message":"hi"}}`)
got, err := parseGiteaBranchHeadSha(good)
if err != nil {
t.Fatalf("err = %v", err)
}
if got != "abc123def456" {
t.Errorf("got %q, want abc123def456", got)
}
for _, bad := range [][]byte{
[]byte(`{}`),
[]byte(`{"name":"main","commit":{}}`),
[]byte(`{"commit":{"id":"`), // truncated
[]byte(`<html>404</html>`),
} {
if _, err := parseGiteaBranchHeadSha(bad); err == nil {
t.Errorf("expected error for malformed body %q", string(bad))
}
}
}
// TestLocalImageTag_ShortSha — caller-supplied SHA gets truncated to
// 12 chars in the tag so `docker images` output stays readable.
func TestLocalImageTag_ShortSha(t *testing.T) {
got := LocalImageTag("claude-code", "abcdef0123456789abcdef0123456789abcdef01")
want := "molecule-local/workspace-template-claude-code:abcdef012345"
if got != want {
t.Errorf("got %q, want %q", got, want)
}
}
// TestLocalImageLatestTag — the floating alias used as the human-readable
// :latest entry.
func TestLocalImageLatestTag(t *testing.T) {
got := LocalImageLatestTag("hermes")
want := "molecule-local/workspace-template-hermes:latest"
if got != want {
t.Errorf("got %q, want %q", got, want)
}
}
// TestRemoteHeadShaProd_IncludesAuthHeader — when a token is configured,
// the API request must carry the `Authorization: token <pat>` header.
func TestRemoteHeadShaProd_IncludesAuthHeader(t *testing.T) {
var got string
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
got = r.Header.Get("Authorization")
w.Header().Set("Content-Type", "application/json")
_, _ = w.Write([]byte(`{"commit":{"id":"deadbeef0000aaaa1111bbbb2222cccc33334444"}}`))
}))
defer srv.Close()
opts := makeTestOpts(t)
opts.RepoPrefix = srv.URL + "/myorg/template-"
opts.HTTPClient = srv.Client()
opts.Token = "secret-pat-do-not-log"
if _, err := remoteHeadShaProd(context.Background(), opts, "claude-code"); err != nil {
t.Fatalf("err = %v", err)
}
if got != "token secret-pat-do-not-log" {
t.Errorf("Authorization header = %q, want %q", got, "token secret-pat-do-not-log")
}
}
// TestCacheKey_Stable — the helper must be deterministic and incorporate
// each input.
func TestCacheKey_Stable(t *testing.T) {
a := CacheKey("claude-code", "abc", "https://git/")
b := CacheKey("claude-code", "abc", "https://git/")
if a != b {
t.Errorf("CacheKey is non-deterministic: %q vs %q", a, b)
}
if a == CacheKey("claude-code", "def", "https://git/") {
t.Errorf("CacheKey ignores sha")
}
if a == CacheKey("hermes", "abc", "https://git/") {
t.Errorf("CacheKey ignores runtime")
}
}
// TestRedactedRepoURL_NoToken — a repo URL with no embedded credential
// is unmodified.
func TestRedactedRepoURL_NoToken(t *testing.T) {
opts := &LocalBuildOptions{RepoPrefix: "https://git.example.com/org/template-"}
got := redactedRepoURL(opts, "claude-code")
want := "https://git.example.com/org/template-claude-code"
if got != want {
t.Errorf("got %q, want %q", got, want)
}
}
// TestRepoURL_AppendsRuntime — the prefix + runtime composer is stable.
func TestRepoURL_AppendsRuntime(t *testing.T) {
opts := &LocalBuildOptions{RepoPrefix: "https://git.example.com/org/template-"}
got := repoURL(opts, "claude-code")
if got != "https://git.example.com/org/template-claude-code" {
t.Errorf("got %q", got)
}
}
// TestNewDefaultLocalBuildOptions_RespectsEnvOverrides — the env var
// overrides documented in the runbook actually take effect.
func TestNewDefaultLocalBuildOptions_RespectsEnvOverrides(t *testing.T) {
t.Setenv("MOLECULE_LOCAL_BUILD_CACHE", "/var/tmp/molecule-test")
t.Setenv("MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX", "https://my.fork/org/tpl-")
t.Setenv("MOLECULE_GITEA_TOKEN", "tok-from-env")
opts := newDefaultLocalBuildOptions()
if opts.CacheDir != "/var/tmp/molecule-test" {
t.Errorf("CacheDir = %q", opts.CacheDir)
}
if opts.RepoPrefix != "https://my.fork/org/tpl-" {
t.Errorf("RepoPrefix = %q", opts.RepoPrefix)
}
if opts.Token != "tok-from-env" {
t.Errorf("Token = %q", opts.Token)
}
if opts.Platform != "linux/amd64" {
t.Errorf("Platform = %q, want linux/amd64", opts.Platform)
}
}
// TestNewDefaultLocalBuildOptions_DefaultCacheDir — XDG-compliant
// fallback when nothing is overridden.
func TestNewDefaultLocalBuildOptions_DefaultCacheDir(t *testing.T) {
t.Setenv("MOLECULE_LOCAL_BUILD_CACHE", "")
t.Setenv("XDG_CACHE_HOME", "")
t.Setenv("MOLECULE_LOCAL_TEMPLATE_REPO_PREFIX", "")
opts := newDefaultLocalBuildOptions()
if !strings.Contains(opts.CacheDir, ".cache") && !strings.Contains(opts.CacheDir, "molecule") {
t.Errorf("CacheDir = %q, want one under .cache/molecule", opts.CacheDir)
}
if opts.RepoPrefix != gitTemplateRepoPrefix {
t.Errorf("RepoPrefix = %q, want default %q", opts.RepoPrefix, gitTemplateRepoPrefix)
}
}
// TestEnsureLocalImage_ShortSha — a remote that returns a too-short
// sha is rejected (defence against a misbehaving Gitea proxy).
func TestEnsureLocalImage_ShortSha(t *testing.T) {
opts := makeTestOpts(t)
opts.remoteHeadSha = func(ctx context.Context, opts *LocalBuildOptions, runtime string) (string, error) {
return "abc", nil
}
_, err := ensureLocalImageWithOpts(context.Background(), "claude-code", opts)
if err == nil {
t.Fatalf("expected error for short sha")
}
if !strings.Contains(err.Error(), "short sha") {
t.Errorf("error = %v, want short-sha message", err)
}
}
// TestEnsureLocalImage_StaleCacheDirCleaned — a partial clone left over
// from a previous failed run must not poison the next attempt.
func TestEnsureLocalImage_StaleCacheDirCleaned(t *testing.T) {
opts := makeTestOpts(t)
// Pre-create a stale dir at the cache target (with a leftover marker file).
staleDir := filepath.Join(opts.CacheDir, "claude-code", "abcdef012345")
if err := os.MkdirAll(staleDir, 0o755); err != nil {
t.Fatalf("setup: %v", err)
}
if err := os.WriteFile(filepath.Join(staleDir, "stale-marker"), []byte("delete me"), 0o644); err != nil {
t.Fatalf("setup: %v", err)
}
if _, err := ensureLocalImageWithOpts(context.Background(), "claude-code", opts); err != nil {
t.Fatalf("err = %v", err)
}
if _, err := os.Stat(filepath.Join(staleDir, "stale-marker")); !os.IsNotExist(err) {
t.Errorf("stale-marker should have been wiped before re-clone (err=%v)", err)
}
// Dockerfile from the new clone should be present.
if _, err := os.Stat(filepath.Join(staleDir, "Dockerfile")); err != nil {
t.Errorf("expected Dockerfile from re-clone, got err=%v", err)
}
}
// TestEnsureLocalImage_ContextCancelled — context cancellation
// propagates to the network/clone seams (best-effort: the test asserts
// that no work happens after Done()).
func TestEnsureLocalImage_ContextCancelled(t *testing.T) {
ctx, cancel := context.WithCancel(context.Background())
cancel()
opts := makeTestOpts(t)
opts.remoteHeadSha = func(ctx context.Context, opts *LocalBuildOptions, runtime string) (string, error) {
if err := ctx.Err(); err != nil {
return "", err
}
return "deadbeef0000aaaa1111bbbb2222cccc33334444", nil
}
_, err := ensureLocalImageWithOpts(ctx, "claude-code", opts)
if err == nil {
t.Fatalf("expected error from cancelled context")
}
}
// TestEnsureLocalImage_RetagAfterCacheHit — a cache-hit must refresh
// the floating :latest alias so admins inspecting `docker images` see
// the current SHA.
func TestEnsureLocalImage_RetagAfterCacheHit(t *testing.T) {
opts := makeTestOpts(t)
var src, dst string
opts.dockerHasTag = func(ctx context.Context, tag string) (bool, error) { return true, nil }
opts.dockerTag = func(ctx context.Context, s, d string) error {
src, dst = s, d
return nil
}
tag, err := ensureLocalImageWithOpts(context.Background(), "claude-code", opts)
if err != nil {
t.Fatalf("err = %v", err)
}
if src != tag {
t.Errorf("retag src = %q, want %q", src, tag)
}
wantDst := "molecule-local/workspace-template-claude-code:latest"
if dst != wantDst {
t.Errorf("retag dst = %q, want %q", dst, wantDst)
}
}
// TestRemoteHeadShaProd_BodyOverflow — defence against a malicious or
// misbehaving Gitea returning a multi-MB body.
func TestRemoteHeadShaProd_BodyOverflow(t *testing.T) {
srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
w.Header().Set("Content-Type", "application/json")
// Write a body just over the 64KB cap with no closing quote. The
// reader should stop at the cap and yield a parse error rather than
// buffering an unbounded response.
_, _ = w.Write([]byte(`{"commit":{"id":"`))
_, _ = w.Write([]byte(strings.Repeat("a", 64<<10))) // 64KB of 'a'
// Connection drops here; we don't write the closing quote.
}))
defer srv.Close()
opts := makeTestOpts(t)
opts.RepoPrefix = srv.URL + "/myorg/template-"
opts.HTTPClient = srv.Client()
_, err := remoteHeadShaProd(context.Background(), opts, "claude-code")
if err == nil {
t.Fatalf("expected error from over-long sha (no closing quote within cap)")
}
}
// TestProvisionerStartUsesLocalBuild_LocalMode — pin the provisioner→
// local-build wiring at the integration boundary. We don't want a future
// refactor to silently bypass EnsureLocalImage when registry is unset.
//
// This test inspects the mode-decision logic without standing up Docker.
func TestProvisionerStartUsesLocalBuild_LocalMode(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
src := Resolve()
if src.Mode != RegistryModeLocal {
t.Fatalf("Resolve in unset env = %q, want local", src.Mode)
}
// The provisioner Start() branches on this same Resolve() call before
// reaching ContainerCreate. This test pins only the Resolve() result;
// it does not execute Start(), so a refactor that flips the condition
// inside Start() (e.g. `if src.Mode == RegistryModeSaaS`) needs its
// own coverage.
}
// TestEnsureLocalImageHook_DefaultIsRealFunction — pin that the
// production hook points at EnsureLocalImage. Tests that swap the hook
// must restore it via t.Cleanup; this test catches a leaked override.
func TestEnsureLocalImageHook_DefaultIsRealFunction(t *testing.T) {
// Sanity: hook is set to a non-nil function. We can't compare
// function pointers directly with == in Go (compiler error), so
// we exercise it instead — but we don't want to actually clone
// from the network in the unit test, so use an unknown runtime
// and assert the known-error path runs.
_, err := ensureLocalImageHook(context.Background(), "this-runtime-cannot-exist-194")
if err == nil {
t.Fatalf("expected error from EnsureLocalImage on unknown runtime")
}
if !strings.Contains(err.Error(), "unknown runtime") {
t.Errorf("hook = unexpected function (got error %q, want one mentioning unknown runtime)", err.Error())
}
}
// TestProvisionerStartUsesLocalBuild_SaaSMode — and the symmetric guard:
// in SaaS-mode, no local-build path runs.
func TestProvisionerStartUsesLocalBuild_SaaSMode(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "registry.example.com/molecule-ai")
src := Resolve()
if src.Mode != RegistryModeSaaS {
t.Fatalf("Resolve with registry set = %q, want saas", src.Mode)
}
if src.Prefix != "registry.example.com/molecule-ai" {
t.Fatalf("Prefix = %q", src.Prefix)
}
}
// Keep the fmt import referenced: Go rejects unused imports at compile
// time, and no test in this file calls fmt directly.
var _ = fmt.Sprintf


@ -320,6 +320,26 @@ func (p *Provisioner) Start(ctx context.Context, cfg WorkspaceConfig) (string, e
image := selectImage(cfg)
// Local-build mode (issue #63 / Task #194): when MOLECULE_IMAGE_REGISTRY
// is unset, the OSS contributor path skips the registry pull entirely
// and instead clones the workspace-template-<runtime> repo from Gitea
// + `docker build`s it locally. Replace the placeholder image ref with
// the SHA-pinned tag of the freshly-built image before ContainerCreate.
//
// Pinned overrides (cfg.Image set, e.g. via runtime_image_pins for
// production thin-AMI launches) bypass this path — they pin a digest
// the operator chose explicitly.
if cfg.Image == "" && cfg.Runtime != "" {
if src := Resolve(); src.Mode == RegistryModeLocal {
builtTag, buildErr := ensureLocalImageHook(ctx, cfg.Runtime)
if buildErr != nil {
return "", fmt.Errorf("local-build mode: ensure image for runtime %q: %w", cfg.Runtime, buildErr)
}
image = builtTag
log.Printf("Provisioner: local-build mode → using locally-built image %s for runtime %s", image, cfg.Runtime)
}
}
containerCfg := &container.Config{
Image: image,
Env: env,


@ -0,0 +1,96 @@
package provisioner
import "os"
// localImagePrefix is the synthetic image namespace used for images
// that the local-build path produces. It is intentionally NOT a real
// registry hostname. Docker treats a first path component without a
// dot or colon as a Docker Hub namespace rather than a host, so the
// no-pull behavior comes from the image already existing locally and
// from the workspace-image-refresh / image-watch paths
// short-circuiting on this prefix.
//
// Tag scheme: `molecule-local/workspace-template-<runtime>:<tag>` where
// `<tag>` is either the 12-char Gitea HEAD sha for SHA-pinned references
// or the moving `:latest` for human inspection (the provisioner
// consumes the SHA-pinned form via EnsureLocalImage()).
//
// Issue #63 / Task #194.
const localImagePrefix = "molecule-local"
// RegistryMode classifies how the provisioner sources workspace-template
// container images. The two modes are mutually exclusive and selected
// by presence/absence of the MOLECULE_IMAGE_REGISTRY env var (Q2 design
// lock, 2026-05-07): set ⇒ SaaS-mode pull; unset ⇒ local-build mode.
//
// Discriminated value rather than a bare string return so every call
// site that decides on image source has to acknowledge the two modes —
// a bare string returning `""` on local-mode would silently produce
// malformed image refs (e.g. `/workspace-template-foo:latest`).
type RegistryMode string
const (
// RegistryModeSaaS — pull workspace-template-* images from a real
// container registry whose URL is in `MOLECULE_IMAGE_REGISTRY`.
// Used by every prod tenant (env injected via Railway / EC2
// user-data) and any self-hosted operator who has mirrored the
// images to their own GHCR/ECR/Harbor.
RegistryModeSaaS RegistryMode = "saas"
// RegistryModeLocal — clone the workspace-template-<runtime> repo
// from Gitea
// (`https://git.moleculesai.app/molecule-ai/molecule-ai-workspace-template-<runtime>`)
// and `docker build` the image locally. Used by OSS contributors
// who run `go run ./workspace-server/cmd/server` without setting
// MOLECULE_IMAGE_REGISTRY. Closes the post-2026-05-06 GHCR-403 gap
// (Task #194 / Issue #63).
RegistryModeLocal RegistryMode = "local"
)
// RegistrySource is the SSOT for image-resolution decisions. Returned
// by Resolve(); read by:
// - the provisioner Start() path — branches on Mode for clone+build
// vs pull
// - admin_workspace_images.go — skips remote pull in local mode
// - imagewatch.Watcher — short-circuits in local mode (no GHCR poll)
//
// SaaS-mode .Prefix matches the existing RegistryPrefix() return value;
// local-mode .Prefix is the synthetic `molecule-local`.
type RegistrySource struct {
Mode RegistryMode
Prefix string
}
// Resolve inspects the runtime environment and returns the image-source
// classification. Treats both unset AND empty-string MOLECULE_IMAGE_REGISTRY
// as "local mode" — an operator who set the var to "" via a misconfigured
// deploy would otherwise silently get malformed image refs in SaaS-mode;
// instead they get the local-build path, which fails loudly if the host
// has no Docker daemon (better blast radius).
//
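// Like RegistryPrefix(), treats the empty string the same as unset. The
// two functions return the same prefix for every non-empty value; on
// the empty path RegistryPrefix() keeps its legacy default while
// Resolve() returns the local-build prefix.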
func Resolve() RegistrySource {
if v := os.Getenv("MOLECULE_IMAGE_REGISTRY"); v != "" {
return RegistrySource{Mode: RegistryModeSaaS, Prefix: v}
}
return RegistrySource{Mode: RegistryModeLocal, Prefix: localImagePrefix}
}
// IsKnownRuntime reports whether the given runtime name is in the
// canonical knownRuntimes list. Exposed so the local-build path can
// refuse to clone arbitrary repo paths supplied via cfg.Runtime —
// defence-in-depth against a future code path that might let an
// attacker influence the runtime string before it reaches the build
// code.
func IsKnownRuntime(runtime string) bool {
for _, r := range knownRuntimes {
if r == runtime {
return true
}
}
return false
}
// LocalImagePrefix returns the synthetic registry hostname used by the
// local-build path. Exposed so handlers that need to branch on "is
// this a local-built image?" don't have to duplicate the constant.
func LocalImagePrefix() string { return localImagePrefix }


@ -0,0 +1,152 @@
package provisioner
import (
"strings"
"testing"
)
// Tests for the new mode-detection surface. The legacy RegistryPrefix()
// shim is covered by registry_test.go; these tests pin the explicit
// two-mode discriminated return from Resolve().
// TestResolve_LocalModeWhenRegistryUnset — the OSS-contributor default.
// Issue #63: with MOLECULE_IMAGE_REGISTRY unset, the provisioner must
// switch to the local-build path instead of trying to pull from a GHCR
// org that's been suspended.
func TestResolve_LocalModeWhenRegistryUnset(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
got := Resolve()
if got.Mode != RegistryModeLocal {
t.Errorf("Mode = %q, want %q (unset registry → local-build)", got.Mode, RegistryModeLocal)
}
if got.Prefix != localImagePrefix {
t.Errorf("Prefix = %q, want %q", got.Prefix, localImagePrefix)
}
}
// TestResolve_SaaSModeWhenRegistrySet — production tenants set the var
// to their ECR mirror; we must keep producing pull-style image refs.
func TestResolve_SaaSModeWhenRegistrySet(t *testing.T) {
const ecr = "123456789012.dkr.ecr.us-east-2.amazonaws.com/molecule-ai"
t.Setenv("MOLECULE_IMAGE_REGISTRY", ecr)
got := Resolve()
if got.Mode != RegistryModeSaaS {
t.Errorf("Mode = %q, want %q (set registry → saas)", got.Mode, RegistryModeSaaS)
}
if got.Prefix != ecr {
t.Errorf("Prefix = %q, want %q", got.Prefix, ecr)
}
}
// TestResolve_EmptyEnvIsLocalMode — operator who set the var to "" via
// a misconfigured deploy must NOT silently produce malformed image refs;
// they get the local path which fails loudly if Docker is missing.
// This contract is the safer-blast-radius half of Issue #63.
func TestResolve_EmptyEnvIsLocalMode(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", "")
if Resolve().Mode != RegistryModeLocal {
t.Fatalf("empty MOLECULE_IMAGE_REGISTRY should be local-mode, got %q", Resolve().Mode)
}
}
// TestResolve_GarbageURLStillSaaSMode: a registry value that's syntactically malformed
// (e.g. `not-a-url`, `foo bar`) is still treated as SaaS-mode. The whole
// design of MOLECULE_IMAGE_REGISTRY is "operator-supplied trusted value";
// validating the URL here would be pretending we can prevent operator
// error. The downstream docker-pull will fail loudly with a registry-
// shaped error message, which is the right blast radius.
func TestResolve_GarbageURLStillSaaSMode(t *testing.T) {
for _, garbage := range []string{
"not-a-url",
"http://",
"ghcr.io/",
" ",
"\thello\n",
} {
t.Run(garbage, func(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", garbage)
if Resolve().Mode != RegistryModeSaaS {
t.Errorf("Mode = %q, want saas (any non-empty value is SaaS-mode by design)", Resolve().Mode)
}
})
}
}
// TestRegistryPrefix_AlignedWithResolve: the back-compat shim must
// agree with Resolve().Prefix on every non-empty input; on the unset
// path the two diverge by design (see the note in the loop body).
func TestRegistryPrefix_AlignedWithResolve(t *testing.T) {
cases := []struct {
name string
env string
}{
{"unset", ""},
{"ecr", "999999999999.dkr.ecr.us-east-2.amazonaws.com/molecule-ai"},
{"harbor", "harbor.example.com/molecule"},
}
for _, tc := range cases {
t.Run(tc.name, func(t *testing.T) {
t.Setenv("MOLECULE_IMAGE_REGISTRY", tc.env)
gotPrefix := RegistryPrefix()
gotResolve := Resolve().Prefix
// Note: t.Setenv cannot unset a variable, so the "unset" case sets it
// to "" and exercises the empty-value path. On that path
// RegistryPrefix() returns the SaaS GHCR default (legacy back-compat)
// while Resolve().Prefix returns the local-mode "molecule-local"
// prefix. They DIVERGE here by design; that divergence is what closes
// the GHCR-403 hole. Pin both so a future refactor can't accidentally
// re-couple them.
if tc.env == "" {
if gotPrefix != defaultRegistryPrefix {
t.Errorf("RegistryPrefix() = %q, want %q (legacy shim)", gotPrefix, defaultRegistryPrefix)
}
if gotResolve != localImagePrefix {
t.Errorf("Resolve().Prefix = %q, want %q (local-build hostname)", gotResolve, localImagePrefix)
}
} else {
if gotPrefix != tc.env {
t.Errorf("RegistryPrefix() = %q, want %q", gotPrefix, tc.env)
}
if gotResolve != tc.env {
t.Errorf("Resolve().Prefix = %q, want %q", gotResolve, tc.env)
}
}
})
}
}
// TestIsKnownRuntime — defence-in-depth guard for the local-build path.
// Must accept every entry in knownRuntimes and reject anything else.
func TestIsKnownRuntime(t *testing.T) {
for _, rt := range knownRuntimes {
if !IsKnownRuntime(rt) {
t.Errorf("IsKnownRuntime(%q) = false, want true", rt)
}
}
for _, bad := range []string{
"", "unknown", "WORKSPACE-TEMPLATE-FAKE", "../../../etc/passwd",
"langgraph;rm -rf /", "claude-code\n", " langgraph",
} {
if IsKnownRuntime(bad) {
t.Errorf("IsKnownRuntime(%q) = true, want false (untrusted input)", bad)
}
}
}
// TestLocalImagePrefix_Stable — the synthetic prefix is part of the
// public surface; admin handlers and image-watch use it to short-circuit
// network calls. Pin the constant.
func TestLocalImagePrefix_Stable(t *testing.T) {
if got := LocalImagePrefix(); got != "molecule-local" {
t.Errorf("LocalImagePrefix() = %q, want %q", got, "molecule-local")
}
}
// TestLocalImagePrefix_NoDots checks that the synthetic prefix contains no
// `.`. Docker's image-ref parser treats the first path component as a
// registry hostname when it contains a `.` (or a `:`, or is `localhost`), so
// a dotted prefix would point pulls at a DNS-resolvable registry of that
// name. Without a dot, `molecule-local` is parsed as an ordinary repository
// namespace on the default registry and is never resolved via DNS as a
// registry host; the local-build path relies on the image having been built
// and tagged locally.
func TestLocalImagePrefix_NoDots(t *testing.T) {
if got := LocalImagePrefix(); strings.Contains(got, ".") {
t.Errorf("LocalImagePrefix() = %q contains '.'; Docker would treat it as a registry hostname", got)
}
}