Compare commits

...

22 Commits

Author SHA1 Message Date
claude-ceo-assistant 7fbb8cb6e9 feat(plugins): atomic install — stage→snapshot→swap→marker (docker path)
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 4s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
CI / Detect changes (pull_request) Successful in 15s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 14s
Harness Replays / detect-changes (pull_request) Successful in 15s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 15s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 15s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 9s
CI / Canvas (Next.js) (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 8s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 9s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 11s
Harness Replays / Harness Replays (pull_request) Failing after 20s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m55s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3m47s
CI / Platform (Go) (pull_request) Successful in 7m36s
Closes molecule-core#114 for the docker (local-OSS) path.
EIC (SaaS) path tracked as a follow-up — same shape, different
exec primitives (ssh vs docker exec); shipping both in one PR
doubles the test surface.

THE FOUR-STEP DANCE
  1. STAGE     — docker.CopyToContainer extracts tar into
                 /configs/plugins/.staging/<name>.<ts>/
  2. SNAPSHOT  — if /configs/plugins/<name>/ exists, mv to
                 /configs/plugins/.previous/<name>.<ts>/
  3. SWAP      — atomic mv staging → live (single rename(2))
  4. MARKER    — touch /configs/plugins/<name>/.complete

Workspace-side plugin loaders should refuse to load any plugin dir
without .complete (separate small change, not in this PR — the marker
write is the necessary precursor; consumer side is a follow-up so
existing-content plugins don't break before they're re-installed).

ROLLBACK
  - Stage failure: rm -rf staging dir; live untouched
  - Snapshot failure: rm -rf staging dir; live untouched (no rename happened)
  - Swap failure with snapshot present: mv previous back to live
  - Swap failure (no snapshot): rm -rf staging; live (which never
    existed) stays absent
  - Marker failure: content already in place, log loudly with manual
    recovery hint (touch <plugin>/.complete) — don't roll back since
    the new content is what we wanted, just unmarked

GC
  Best-effort delete of previous-version snapshot after successful
  marker write. Failures non-fatal — next install or a separate
  sweeper reclaims. Sweeper for stale .previous/* across reboots is
  follow-up scope.

CONCURRENCY
  Each install gets a unique stamp (UTC second precision), so two
  concurrent reinstalls land in distinct staging dirs and the second
  swap simply overwrites the first's live result. The atomicity is
  per-install, not cross-install — by design (the platform serializes
  POST /workspaces/:id/plugins via Go-side semaphore upstream of
  this code, so cross-install collisions don't reach here).

CHANGES
  + plugins_atomic.go        — installVersion + atomicCopyToContainer
  + plugins_atomic_tar.go    — tarWalk/tarHostDirWithPrefix helpers
  + plugins_atomic_test.go   — 5 unit tests (paths, stamp shape,
                               tar happy path, symlink-skip, prefix
                               normalization). All green.
  ~ plugins_install_pipeline.go::deliverToContainer — swap
    copyPluginToContainer call to atomicCopyToContainer

Old copyPluginToContainer is retained (still called by Download()) so
this PR is purely additive on the install path; no public API change.

PHASE 4 SELF-REVIEW (FIVE-AXIS)
  Correctness: Required (addressed) — swap-failure rollback writes mv
    of previous back to live before returning the error; if rollback
    itself fails, we wrap both errors and surface the combined fault.
    Marker-write failure is treated as content-landed-but-unmarked
    (LOG, don't roll back the new content).
  Readability: No finding — installVersion path methods make the
    /staging/.previous/live/marker layout obvious from one struct.
    tarWalk extracted from the inline filepath.Walk in
    plugins_install_pipeline.go for testability.
  Architecture: No finding — atomicCopyToContainer composes existing
    execAsRoot / docker.CopyToContainer primitives; no new dependencies.
    Old copyPluginToContainer kept for Download() — single responsibility
    per function.
  Security: No finding — symlinks still skipped during tar walk
    (defense vs hostile plugin escaping its own dir). Marker writes
    use composeable path.Join, no user input touches the path.
  Performance: No finding — adds ~3 docker exec calls per install
    (mkdir, mv-snapshot, mv-swap, touch — actually 4) on top of the
    one CopyToContainer. Each exec ~50-100ms in practice; install
    end-to-end was already seconds-scale, this rounds to noise.

REFS
  molecule-core#114 — this issue
  Companion: molecule-core#112 (hot-reload classifier — depends on .complete marker)
  Companion: molecule-core#113 (version subscription — uses install machinery)
  EIC follow-up: separate issue to be filed for SaaS path parity

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 08:22:52 -07:00
claude-ceo-assistant e37a289eb6 Merge pull request 'feat(org-import): inject per-role persona env from operator-host bootstrap dir' (#110) from feat/persona-env-injection into main
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 7s
Block internal-flavored paths / Block forbidden paths (push) Successful in 8s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 7s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 7s
Auto-sync main → staging / sync-staging (push) Failing after 12s
CI / Detect changes (push) Successful in 11s
E2E API Smoke Test / detect-changes (push) Successful in 12s
Harness Replays / detect-changes (push) Successful in 12s
Handlers Postgres Integration / detect-changes (push) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 13s
publish-workspace-server-image / build-and-push (push) Failing after 13s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 11s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 6s
CI / Canvas Deploy Reminder (push) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
Harness Replays / Harness Replays (push) Failing after 6s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 54s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m18s
CI / Platform (Go) (push) Successful in 2m23s
2026-05-08 15:17:17 +00:00
claude-ceo-assistant 61166f8848 Merge pull request 'feat(local-dev): air-based hot-reload for workspace-server in docker-compose dev mode' (#118) from feat/air-hot-reload-dev into main
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Blocked by required conditions
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 0s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 1s
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 8s
Auto-sync main → staging / sync-staging (push) Failing after 10s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
Harness Replays / detect-changes (push) Successful in 8s
publish-workspace-server-image / build-and-push (push) Failing after 8s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
CI / Python Lint & Test (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
CI / Canvas Deploy Reminder (push) Has been skipped
Harness Replays / Harness Replays (push) Failing after 6s
CI / Platform (Go) (push) Has been cancelled
E2E API Smoke Test / E2E API Smoke Test (push) Has been cancelled
Runtime PR-Built Compatibility / detect-changes (push) Has been cancelled
2026-05-08 15:16:58 +00:00
claude-ceo-assistant 9d50a6dae4 feat(local-dev): air-based hot-reload for workspace-server
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 1s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 7s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Harness Replays / Harness Replays (pull_request) Failing after 6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 43s
CI / Platform (Go) (pull_request) Successful in 2m1s
Closes core#116. Brings local-dev iteration parity with the canvas's
Turbopack HMR — edit a Go file, see the platform restart in <5s
instead of running 'docker compose up --build' (~30s) per change.

USAGE
  make dev   # docker compose with air-driven live reload
  make up    # production-shape stack (no air, normal Dockerfile)

WHAT THIS ADDS
  workspace-server/.air.toml      — air watch config
  workspace-server/Dockerfile.dev — air-on-golang:1.25-alpine, dev-only
  docker-compose.dev.yml          — overlay swapping platform service
                                    to Dockerfile.dev + bind-mounting
                                    workspace-server/ source
  Makefile                        — make {dev,up,down,logs,build,test}

WHAT THIS DOES NOT TOUCH
  workspace-server/Dockerfile (production multi-stage build)
  docker-compose.yml          (prod-shape stack)
  CI workflows                (build prod image directly)
  Tenant deployment / SaaS    (image swap stays the model)

Pure additive. Existing 'docker compose up' path unchanged; production
stays on the static binary. Air install pinned via go install at image
build time so the dev image is reproducible-enough for local use (we
don't pin air to a SHA — the dev image is rebuilt locally and updates
opportunistically).

PHASE 4 SELF-REVIEW (FIVE-AXIS)
  Correctness: No finding — additive change, no existing path modified.
    .air.toml watches .go + .yaml under workspace-server/, excludes
    _test.go and tests dir so test edits don't trigger rebuild.
    Dockerfile.dev mirrors prod's 'go mod download' so first rebuild
    is fast.
  Readability: No finding — three small files plus a Makefile, each
    with header comments explaining the WHY, not just the WHAT. The
    Makefile uses the standard ## help-target pattern.
  Architecture: No finding — overlay pattern (docker-compose.dev.yml
    on top of docker-compose.yml) is the standard compose convention
    for env-specific overrides. Doesn't fork the prod path.
  Security: No finding because no production code path; dev-only image
    isn't built in CI and isn't published to ECR.
  Performance: No finding — air debounce=500ms, exclude_unchanged=true
    so a save that doesn't change content is a no-op rebuild.

REFS
  core#116 — this issue
  Companion: core#117 (workspace-side config-watcher for hot-reload of
  config.yaml) — different scope; this issue is platform-only.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 08:10:50 -07:00
claude-ceo-assistant 43b33bcaa5 feat(org-import): inject per-role persona env from operator-host bootstrap dir
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 0s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 0s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Harness Replays / Harness Replays (pull_request) Failing after 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m16s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m24s
CI / Platform (Go) (pull_request) Successful in 3m23s
Wires the 28 dev-tree persona credentials minted 2026-05-08 into the
workspace-secrets path used by org_import. When a workspace.yaml carries
`role: <name>`, the importer now reads
$MOLECULE_PERSONA_ROOT/<role>/env (default
/etc/molecule-bootstrap/personas/<role>/env, populated by the bootstrap
kit on the tenant host) and merges the role's GITEA_USER /
GITEA_TOKEN / GITEA_TOKEN_SCOPES / GITEA_USER_EMAIL /
GITEA_SSH_KEY_PATH into the same envVars map that already feeds
workspace_secrets via parseEnvFile + crypto.Encrypt + INSERT.

PRECEDENCE
  Persona env is the LOWEST layer:
    0. Persona env (per-role)
    1. Org root .env (shared)
    2. Workspace .env (per-workspace)
  Each later layer overrides the previous, so a workspace .env can
  pin a different GITEA_TOKEN if it ever needs to (testing, override).

WHY THIS LAYERING
  Workspaces should boot with the role's identity by default. .env
  files stay the explicit-override mechanism for the (rare) case where
  a workspace needs to deviate. No new behavior for workspaces with no
  role: persona load is silent no-op when ws.Role is empty or unsafe.

SECURITY
  isSafeRoleName accepts only [A-Za-z0-9_-]+ (no '..', '/', or
  separators) — admin-only construct, but defense-in-depth keeps the
  persona dir shape invariant. Test
  TestLoadPersonaEnvFile_RejectsTraversal pins the rejection set against
  a planted target file.

OPERATOR-HOST CONTRACT
  The 28 persona env files live at /etc/molecule-bootstrap/personas/<role>/env
  (mode 600, owner root:root) with the per-role token-scope tailoring
  Hongming approved 2026-05-08 (D5). Synced via task #241. Override via
  MOLECULE_PERSONA_ROOT for tests + non-prod hosts.

TESTS (7 new, all green)
  TestLoadPersonaEnvFile_HappyPath        — typical persona-env shape
  TestLoadPersonaEnvFile_MissingDir       — silent no-op when file absent
  TestLoadPersonaEnvFile_EmptyRole        — silent no-op when role empty
  TestLoadPersonaEnvFile_RejectsTraversal — planted file unreachable
                                            via '../../etc/passwd' etc.
  TestLoadPersonaEnvFile_DefaultRoot      — falls back to /etc/...
  TestLoadPersonaEnvFile_OverwritesEmptyMap
  TestIsSafeRoleName_Acceptance           — positive + negative role names

PHASE 4 SELF-REVIEW (FIVE-AXIS)
  Correctness: No finding — additive change, silent no-op on the ws.Role==''
    path covers every existing workspace; tests cover happy path + each
    rejection mode + missing-dir.
  Readability: No finding — helper sits next to parseEnvFile in
    org_helpers.go with a comment block explaining WHY persona is
    lowest precedence.
  Architecture: No finding — fits the existing 'merge .env into envVars
    then INSERT INTO workspace_secrets' pattern that's been in place
    since the .env-driven workspace secrets feature; no new dependencies,
    no new tables.
  Security: Required (addressed) — path traversal blocked by
    isSafeRoleName. No finding beyond that since persona files are
    admin-managed and the helper does not log token values.
  Performance: No finding — one extra os.ReadFile per workspace at
    import time; amortized over workspace lifetime, cost is negligible.

REFS
  internal#85 — RFC for SOP Phase 4 + structured Five-Axis (parent context)
  Saved memories: feedback_per_agent_gitea_identity_default,
                  feedback_unified_credentials_file
  Task #241 — operator-host sync (already DONE; populated 28 dirs)
  Task #242 — this PR

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 07:09:40 -07:00
devops-engineer 7d3a6a46c5 chore: sync main → staging (auto, ae2d9eab)
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 9s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 9s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (push) Successful in 16s
Block internal-flavored paths / Block forbidden paths (push) Successful in 17s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 8s
CI / Detect changes (push) Successful in 20s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 21s
E2E API Smoke Test / detect-changes (push) Successful in 24s
Handlers Postgres Integration / detect-changes (push) Successful in 24s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 25s
CI / Shellcheck (E2E scripts) (push) Successful in 11s
CI / Platform (Go) (push) Successful in 13s
CI / Canvas (Next.js) (push) Successful in 13s
CI / Python Lint & Test (push) Successful in 11s
CI / Canvas Deploy Reminder (push) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 9s
E2E API Smoke Test / E2E API Smoke Test (push) Failing after 3m30s
Handlers Postgres Integration / Handlers Postgres Integration (push) Failing after 3m29s
2026-05-08 13:30:46 +00:00
claude-ceo-assistant ae2d9eabf6 Merge pull request 'chore(workflows): drop staging-branch triggers (Phase 3b of internal#81)' (#109) from chore/trunk-based-drop-staging-from-workflow-triggers into main
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 8s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 8s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 8s
Block internal-flavored paths / Block forbidden paths (push) Successful in 14s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (push) Successful in 16s
CI / Detect changes (push) Successful in 19s
Auto-sync main → staging / sync-staging (push) Successful in 22s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 14s
Handlers Postgres Integration / detect-changes (push) Successful in 17s
E2E API Smoke Test / detect-changes (push) Successful in 19s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 20s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 18s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 18s
CI / Shellcheck (E2E scripts) (push) Successful in 11s
CI / Platform (Go) (push) Successful in 14s
CI / Canvas (Next.js) (push) Successful in 14s
CI / Python Lint & Test (push) Successful in 13s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 12s
CI / Canvas Deploy Reminder (push) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 12s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 11s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6m3s
branch-protection drift check / Branch protection drift (push) Failing after 12s
Canary — staging SaaS smoke (every 30 min) / Canary smoke (push) Failing after 3m38s
2026-05-08 13:30:24 +00:00
claude-ceo-assistant 2fac4b61b4 chore(workflows): drop staging-branch triggers (Phase 3b of internal#81)
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 9s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 18s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 19s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 18s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 21s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
branch-protection drift check / Branch protection drift (pull_request) Successful in 23s
CI / Detect changes (pull_request) Successful in 23s
E2E API Smoke Test / detect-changes (pull_request) Successful in 25s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 25s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 26s
CI / Platform (Go) (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 8s
CI / Canvas (Next.js) (pull_request) Successful in 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 16s
CI / Python Lint & Test (pull_request) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 17s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 10s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 12s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4m32s
Trunk-based migration: main is the only branch. Update 4 workflows
that fired on staging-branch pushes to fire on main instead.

  - e2e-staging-canvas.yml: drop staging from push + pull_request
  - e2e-staging-external.yml: drop staging from push + pull_request
  - e2e-staging-saas.yml: drop staging from push + pull_request,
    update header comment that references the (now-obsolete)
    staging→main auto-promote flow
  - redeploy-tenants-on-staging.yml: workflow_run.branches changes
    from [staging] to [main] so the tenant redeploy fires when
    publish-workspace-server-image runs on main

Workflows that target the staging tenant FLEET (canary-staging.yml,
e2e-staging-sanity.yml) are not changed — they fire on cron, the word
"staging" in their filenames refers to the deployment target environ-
ment, not the git branch.

Lands as Phase 3b after #108 promotes the 5 staging-only feature PRs
(Phase 3a). Phase 3c deletes the obsolete promote/sync workflows
(auto-promote-staging, auto-sync-main-to-staging, etc.) plus the
staging branch itself, after we no-op-verify both Phase 3a and 3b
green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 13:08:51 +00:00
claude-ceo-assistant 5abc4f74ca harden(org-external): token via http.extraHeader, .complete cache marker, ref .. deny, naming cleanup (#107)
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 0s
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 7s
E2E API Smoke Test / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
Harness Replays / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 10s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 6s
CI / Canvas Deploy Reminder (push) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 5s
Harness Replays / Harness Replays (push) Successful in 1m7s
publish-workspace-server-image / build-and-push (push) Successful in 1m58s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m4s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m10s
CI / Platform (Go) (push) Successful in 3m13s
Five-Axis self-review pass on the !external resolver work (PRs #105+#106) caught three real issues that the unstructured 3-weakest review missed:

1. Cache validity gap — partial cache writes looked complete
2. Token persistence — token in URL userinfo got persisted to .git/config
3. Misleading function name post-refactor

This PR fixes all three:
- .complete marker file written atomically; wipe-and-refetch on partial cache
- Token via -c http.extraHeader, never embedded in URL
- Defense-in-depth ref .. deny (was already validated by repoSafeRefRegex but explicit + tested)
- Renamed buildCloneURL -> buildExternalCloneURL (collision with artifacts.go), rewriteFilesDirAndIncludes -> rewriteFilesDir
- Removed unused redactToken/shortHash helpers and crypto/sha1, encoding/hex, fmt imports

Approved by platform-engineer 2026-05-08T12:55Z.
2026-05-08 13:04:00 +00:00
claude-ceo-assistant c72d0a5383 harden(org-external): token via http.extraHeader, .complete cache marker, ref '..' deny, naming cleanup
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 10s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 8s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 9s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 25s
CI / Detect changes (pull_request) Successful in 32s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 27s
E2E API Smoke Test / detect-changes (pull_request) Successful in 28s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 28s
Harness Replays / detect-changes (pull_request) Successful in 20s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 23s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 22s
CI / Canvas (Next.js) (pull_request) Successful in 16s
CI / Python Lint & Test (pull_request) Successful in 12s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 15s
CI / Platform (Go) (pull_request) Failing after 1m19s
Harness Replays / Harness Replays (pull_request) Failing after 1m6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Failing after 1m9s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 2m4s
Self-review of molecule-core PR #105 + #106 (the !external resolver
chain) surfaced 3 real correctness/security gaps and 2 readability
nits. Fixes all four in one PR since they're the same file's hardening.

(1) TOKEN LEAKAGE — fixed
  Before: gitFetcher built clone URLs with auth in userinfo
    (https://oauth2:TOKEN@host/repo.git). Two leak paths:
      a. Token persisted in cloned repo's .git/config
      b. Token could appear in clone error output captured via
         cmd.CombinedOutput()
  After: clone URL has no userinfo (https://host/repo.git). Auth is
    layered on via -c http.extraHeader=Authorization: token ...
    which sends the header per-request without persisting. Plus a
    redactToken() pass over any error string before it surfaces in
    fmt.Errorf, as belt-and-braces.
  Tradeoff: token now visible in 'ps aux' for the duration of the
    git child process (same as before via env var), but no longer
    in any persistent state.

(2) CACHE-VALIDITY FOOTGUN — fixed
  Before: cache-hit was 'cacheDir/.git exists'. A clone interrupted
    after .git was created but before content finished writing would
    leave a partially-written cache that subsequent imports treated
    as hit, returning stale/incomplete content forever (no self-heal).
  After: cache-hit also requires a .complete marker file written
    only AFTER successful clone+rename. Partially-written cache is
    treated as cache-miss and re-fetched cleanly (after RemoveAll
    on the partial dir to avoid blocking the new clone's mkdir).

(3) REF '..' DENY — fixed
  Before: safeRefPattern '^[a-zA-Z0-9_./-]+$' allowed '..' as a
    substring. Git itself rejects most refs containing '..', but
    defense-in-depth says don't depend on the downstream tool's
    validation when sanitizing input at the boundary.
  After: explicit strings.Contains(ref.Ref, '..') check.

(4) NAMING CLEANUP — fixed
  Before: rewriteFilesDirAndIncludes() — name claims to rewrite
    !include scalars but doesn't (we removed that during PR-A
    development; double-prefix bug). Misleading for readers.
  After: rewriteFilesDir(). Docstring updated to explicitly explain
    why !include paths are NOT rewritten (relative to subDir, naturally
    inside cache).
  Also: removed unused buildAuthedURL() (replaced by
    buildExternalCloneURL + authConfigArgs split), removed unused
    shortHash() helper (replaced by os.MkdirTemp), removed unused
    crypto/sha1 + encoding/hex + fmt imports, removed stray
    '_ = fmt.Sprint' line in integration test.

NEW TESTS
  - TestGitFetcher_RejectsRefWithDoubleDot (defense-in-depth on ref input)
  - TestGitFetcher_CacheValidatedByCompleteMarker (partial cache → re-fetch)

VERIFIED LOCALLY 2026-05-08
  Full ./internal/handlers/ suite: ok (7.8s, 14 external-resolver tests
  + all existing tests). Two new tests cover the two new behaviors.

Refs:
  internal#77 — extraction RFC
  molecule-core#105 (resolver), #106 (tests) — original implementation
  Hongming code-review-and-quality skill invocation 2026-05-08 + 'fix all'
2026-05-08 05:54:54 -07:00
claude-ceo-assistant d9056db5b4 Merge pull request 'test(org-external): integration + e2e for !external resolver (PR-B + PR-C)' (#106) from feature/external-ref-pr-bc-tests into staging
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 1s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 8s
E2E API Smoke Test / detect-changes (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 8s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 4s
CI / Python Lint & Test (push) Successful in 4s
CI / Canvas Deploy Reminder (push) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 5s
Harness Replays / Harness Replays (push) Failing after 49s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m3s
publish-workspace-server-image / build-and-push (push) Successful in 1m35s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m36s
CI / Platform (Go) (push) Successful in 2m43s
2026-05-08 12:33:45 +00:00
claude-ceo-assistant 89c5567d79 test(org-external): integration test against local bare-git + e2e against live Gitea (PR-B + PR-C)
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 1s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
CI / Detect changes (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Successful in 58s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m45s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m51s
CI / Platform (Go) (pull_request) Successful in 2m49s
PR-B (local bare-git integration, task #233):
  workspace-server/internal/handlers/org_external_integration_test.go
  Three tests using git's GIT_CONFIG_COUNT/KEY/VALUE env-var-injected
  insteadOf URL rewrite — process-scoped, no ~/.gitconfig pollution:

  - TestGitFetcher_RealClone_LocalRedirect: full resolver chain end-to-
    end with REAL git clone against a local bare-repo, asserts cache
    population + content materialization + path rewrite + cache-hit on
    second invocation.
  - TestGitFetcher_RealClone_BadRefFails: nonexistent ref surfaces
    git's error cleanly through the ls-remote step.
  - TestGitFetcher_DirectFetch_CacheHit: gitFetcher.Fetch direct
    invocation (no resolver wrapping); verifies cache-hit returns
    same dir + same SHA, no clobber.

  Production code untouched — insteadOf rewrite makes the production
  gitFetcher think it's cloning from Gitea, but git rewrites at clone
  time to file://<barePath>. Tests the real shell-out + parsing.

PR-C (live Gitea e2e, task #234):
  workspace-server/internal/handlers/local_e2e_dev_dept_test.go
  TestLocalE2E_ExternalDevDepartment — minimal parent template that
  uses !external against the LIVE molecule-ai/molecule-dev-department
  repo. No symlink, no /tmp/local-e2e-deploy fixture. Composition
  resolves over network at import time.

  Asserts:
    - 28+ dev-tree workspaces resolve through the fetched cache
      (matches the count from TestLocalE2E_DevDepartmentExtraction)
    - Q1 placement: 'Documentation Specialist' present (under app-lead)
    - Q2 placement: 'Triage Operator' present (under dev-lead)
    - Every workspace's files_dir is cache-prefixed (proves rewrite ran)
    - Every workspace's resolveInsideRoot+Stat succeeds
      (would fail provisioning if not)

  Skipped if Gitea unreachable (TCP probe to git.moleculesai.app:443)
  or git binary absent — won't false-fail offline runners.

VERIFIED LOCALLY 2026-05-08:
  --- PASS: TestGitFetcher_RealClone_LocalRedirect (0.26s)
  --- PASS: TestGitFetcher_RealClone_BadRefFails    (0.15s)
  --- PASS: TestGitFetcher_DirectFetch_CacheHit     (0.23s)
  --- PASS: TestLocalE2E_ExternalDevDepartment      (0.55s)
  workspaces resolved through !external: 28
  Full ./internal/handlers/ test suite: ok (no regressions)

Together with PR-A's unit tests (#105), the !external resolver is now
covered at three layers:
  - unit (fakeFetcher injection): allowlist, validation, path rewrite
  - integration (real git, local bare-repo): clone, cache, ls-remote
  - e2e (real git, live Gitea, live dev-department): full chain

Refs:
  internal#77 — extraction RFC (Phase 3a phasing in comment 1995)
  task #233 (PR-B), task #234 (PR-C)
  Hongming GO 2026-05-08 ('do PR-B/C/D')
2026-05-08 05:30:04 -07:00
claude-ceo-assistant ef0ef30116 Merge pull request 'feat(org-import): !external cross-repo subtree resolver (Phase 3a, task #222)' (#105) from feature/external-ref-resolver into staging
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 1s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
CI / Detect changes (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
Harness Replays / detect-changes (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 2s
CI / Python Lint & Test (push) Successful in 3s
CI / Canvas (Next.js) (push) Successful in 4s
CI / Canvas Deploy Reminder (push) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 4s
Harness Replays / Harness Replays (push) Successful in 1m4s
publish-workspace-server-image / build-and-push (push) Successful in 1m50s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m3s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m6s
CI / Platform (Go) (push) Successful in 3m9s
2026-05-08 12:23:33 +00:00
claude-ceo-assistant 257d6c1b5a feat(org-import): !external cross-repo subtree resolver (Phase 3a, internal#77 / task #222)
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 1s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 7s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
Harness Replays / detect-changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
Harness Replays / Harness Replays (pull_request) Failing after 48s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 54s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m21s
CI / Platform (Go) (pull_request) Successful in 2m26s
Adds gitops-style cross-repo subtree composition to the platform's
org-template importer. Replaces (eventually) the operator-side
filesystem symlink approach shipped in PR #5.

DESIGN
  See internal#77 comment 1995 for the full design doc + decision points
  agreed with Hongming 2026-05-08.

  Schema: a `!external`-tagged mapping anywhere a workspace entry is
  allowed (workspaces:, roots:, children:):

    - !external
      repo: molecule-ai/molecule-dev-department
      ref: main
      path: dev-lead/workspace.yaml
      url: git.moleculesai.app    # optional; default = MOLECULE_EXTERNAL_GITEA_URL or git.moleculesai.app

  At resolve time the platform fetches the repo at ref into a content-
  addressable cache under <orgBaseDir>/.external-cache/<repo>/<sha>/,
  loads <cacheDir>/<path>, recursively resolves nested !include /
  !external in the loaded subtree, then rewrites every files_dir scalar
  in the fully-resolved subtree to be cache-prefixed. Downstream
  pipeline (resolveInsideRoot, plugin merge, CopyTemplateToContainer)
  sees ordinary in-tree paths.

IMPLEMENTATION
  - org_external.go: ExternalRef type, fetcher interface (gitFetcher
    production + injectable for tests), resolveExternalMapping resolver,
    rewriteFilesDirAndIncludes path-rewrite walker, allowlistedHostPath
    + safeRefPattern + safeRepoCacheDir validation helpers.
  - org_include.go: 4-line hook in expandNode dispatching MappingNode
    with Tag=="!external" to resolveExternalMapping.
  - org_external_test.go: 8 unit tests with fakeFetcher injection
    (no network):
      * happy path (top + nested workspace files_dir cache-prefixed)
      * allowlist rejection (github.com/foo/bar)
      * path-traversal rejection (../../etc/passwd)
      * malformed ref rejection ("main; rm -rf /")
      * missing required fields (repo / ref / path)
      * rewriteFilesDirAndIncludes basic + idempotent
      * allowlistedHostPath env-override + glob

  Path rewrite ONLY rewrites files_dir scalars. !include scalars are
  NOT rewritten — they resolve relative to their containing file's
  directory, which post-fetch is naturally inside the cache, so
  relative !includes Just Work without modification.

ALLOWLIST + AUTH
  - Default allowlist: git.moleculesai.app/molecule-ai/.
  - Override: MOLECULE_EXTERNAL_REPO_ALLOWLIST (comma-separated
    prefixes; trailing /* or / supported).
  - Auth: MOLECULE_GITEA_TOKEN env var injected into clone URL.
    Optional — falls back to unauthenticated for public repos.
  - Reject: malformed refs, path-traversal, non-allowlisted hosts.

CACHE
  - Location: <orgBaseDir>/.external-cache/<safe-repo>/<sha>/.
    Operators add to .gitignore.
  - Content-addressable: same (repo, sha) reuses cache, no overwrite.
  - Atomic clone via tmp-then-rename.
  - Concurrency: race-tolerant — last-writer-wins on same SHA.
    GC out of scope for v1 (filed as parked follow-up).

SECURITY (per SOP Phase 2)
  Untrusted yaml input — all validated:
    repo: allowlist (default molecule-ai/* on Gitea host)
    ref:  ^[a-zA-Z0-9_./-]+$ regex (rejects shell injection)
    path: relative-and-down-only (rejects ../escape)
  Auth: read-only token scoped to allowed orgs.
  Recursion: maxExternalDepth=4 (vs maxIncludeDepth=16) to limit
    network fan-out cost.
  Cache poisoning: per-(repo, sha) content-addressable; can't poison
    across SHAs.
  Trust boundary: cloned content treated identically to a sibling-
    cloned subtree (same model as current symlink approach).

VERSIONING / BACKWARDS COMPAT
  Pure additive. Existing !include and inline workspaces unchanged.
  Existing dev-lead symlink (parent template PR #5) keeps working.
  Migration of parent template to !external is a separate PR-D.
  No DB schema change. No public API change.

VERIFIED LOCALLY
  go test ./internal/handlers/ → ok (5.2s, all 8 new tests + existing)

  Stub fetcher injection lets unit tests cover the resolver +
  path-rewrite logic without network. PR-B (follow-up) adds an
  integration test against a local bare-git repo. PR-C adds the
  real-Gitea e2e test against the live dev-department repo.

Refs:
  internal#77 — extraction RFC (comment 1995 = Phase 1+2 design)
  task #222 — this PR is Phase 3a (PR-A in the design's phasing)
  Hongming GO 2026-05-08 ('go' on 4 decision points + design)
2026-05-08 05:17:55 -07:00
claude-ceo-assistant 7b6061e899 Merge pull request 'fix(provisioner)+test: EvalSymlinks templatePath; stage-2 e2e for files_dir consumption' (#104) from optimize/extraction-stage-2-tests-and-validator-fix into staging
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 2s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 2s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 2s
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CI / Detect changes (push) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 9s
E2E API Smoke Test / detect-changes (push) Successful in 11s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 11s
Handlers Postgres Integration / detect-changes (push) Successful in 11s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 11s
Harness Replays / detect-changes (push) Successful in 12s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Python Lint & Test (push) Successful in 4s
CI / Canvas Deploy Reminder (push) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
Harness Replays / Harness Replays (push) Successful in 1m10s
publish-workspace-server-image / build-and-push (push) Successful in 2m0s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 1m58s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m5s
CI / Platform (Go) (push) Successful in 3m5s
2026-05-08 11:49:35 +00:00
claude-ceo-assistant 3dcc7230f9 fix(provisioner)+test: EvalSymlinks templatePath; stage-2 e2e for files_dir consumption
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 2s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 1s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 7s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
Harness Replays / Harness Replays (pull_request) Failing after 46s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 54s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m20s
CI / Platform (Go) (pull_request) Successful in 2m48s
Two changes that fall out of one root cause discovered while preparing
the local platform spin-up for the dev-department extraction (internal#77):

PROBLEM
  CopyTemplateToContainer's filepath.Walk is called with templatePath
  set to the workspace's resolved files_dir. With the cross-repo
  symlink composition shipped in PR #5 (parent template's
  dev-lead → ../molecule-dev-department/dev-lead/), the Dev Lead
  workspace's files_dir is literally 'dev-lead' — i.e. the symlink
  itself, not a path THROUGH the symlink.

  filepath.Walk does not descend into a symlink leaf — it Lstats the
  root, sees a symlink (mode bit set, not a directory), emits exactly
  one entry, and returns. Result: the workspace's /configs/ tar would
  ship empty. Other 38 workspaces are fine because their files_dir
  paths just TRAVERSE the symlink (path resolution handles intermediate
  symlinks via Lstat traversal); only the leaf-is-symlink case breaks.

FIX
  workspace-server/internal/provisioner/provisioner.go:
    Call filepath.EvalSymlinks on templatePath before filepath.Walk.
    Resolves the leaf-symlink case for ALL templates, not just dev-dept.
    Security: templatePath has already passed resolveInsideRoot's
    path-string check at the call site; the trust boundary is the
    operator-side /org-templates/ filesystem layout, not this
    resolution step.

TEST
  workspace-server/internal/handlers/local_e2e_dev_dept_test.go:
    New TestLocalE2E_FilesDirConsumption — stage-2 of the local e2e.
    For every workspace in the resolved OrgTemplate, asserts:
      1. resolveInsideRoot(orgBaseDir, ws.FilesDir) succeeds.
      2. os.Stat on the result returns a directory.
      3. filepath.Walk after EvalSymlinks (mirroring the platform fix)
         emits at least one file.
      4. At least one workspace marker exists (workspace.yaml,
         system-prompt.md, or initial-prompt.md).
    Exercises the SECOND half of POST /org/import that
    TestLocalE2E_DevDepartmentExtraction (PR #103) didn't cover.

VERIFIED LOCALLY (2026-05-08, against post-extraction Gitea state):
  --- PASS: TestLocalE2E_FilesDirConsumption (0.05s)
  checked 39 workspaces with files_dir
  All 39 walk paths emit non-empty file sets with valid workspace markers.

REGRESSION GUARD
  Without the EvalSymlinks fix, this test fails on Dev Lead with:
    files_dir 'dev-lead' at '/.../molecule-dev/dev-lead' is empty —
    CopyTemplateToContainer would produce empty /configs/

Refs:
  internal#77 — extraction RFC
  molecule-core#102 (resolver symlink contract test)
  molecule-core#103 (stage-1 e2e: include resolution)
  Hongming GO 2026-05-08 ('go' on the 3 pre-spin-up optimizations)
2026-05-08 04:46:33 -07:00
claude-ceo-assistant e3f17fb954 Merge pull request 'test(local-e2e): verify dev-department extraction end-to-end via real resolveYAMLIncludes' (#103) from test/e2e-dev-department-extraction into staging
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 2s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 2s
Block internal-flavored paths / Block forbidden paths (push) Successful in 5s
CI / Detect changes (push) Successful in 7s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E API Smoke Test / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 8s
Harness Replays / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 7s
CI / Canvas Deploy Reminder (push) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 8s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 6s
Harness Replays / Harness Replays (push) Successful in 1m28s
publish-workspace-server-image / build-and-push (push) Successful in 2m54s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 3m49s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m53s
CI / Platform (Go) (push) Successful in 6m0s
2026-05-08 11:29:33 +00:00
claude-ceo-assistant 3adbbacf2e test(local-e2e): verify dev-department extraction end-to-end via real resolveYAMLIncludes
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 5s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 5s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 5s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 9s
CI / Detect changes (pull_request) Successful in 16s
E2E API Smoke Test / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 13s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 12s
Harness Replays / detect-changes (pull_request) Successful in 13s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 13s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 14s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 6s
CI / Canvas (Next.js) (pull_request) Successful in 8s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 7s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
Harness Replays / Harness Replays (pull_request) Successful in 1m12s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m9s
CI / Platform (Go) (pull_request) Successful in 3m14s
Phase 4 (local-only) of internal#77 (dev-department extraction).

Adds TestLocalE2E_DevDepartmentExtraction that exercises the FULL platform
import path against the real molecule-ai-org-template-molecule-dev (post-slim)
and molecule-ai/molecule-dev-department (post-atomize) repos cloned as siblings
under /tmp/local-e2e-deploy/.

What it proves end-to-end:
  - The dev-lead symlink at parent's template root is followed by
    resolveYAMLIncludes (filepath.Abs/Rel-style security check passes,
    os.ReadFile follows the link).
  - Recursive !include chain through the symlinked subtree resolves:
    parent's org.yaml → !include dev-lead/workspace.yaml (symlinked)
    → !include ./core-lead/workspace.yaml → !include ./core-be/workspace.yaml
    (atomized children: paths, no '..').
  - 39 workspaces enumerate after resolution: 5 PM-tree + 6 Marketing-tree
    + 28 dev-tree (Dev Lead + 5 sub-team leads + 18 leaf workspaces +
    3 floaters + 1 triage-operator).
  - Q1+Q2 placements verified by sentinel name check: 'Documentation
    Specialist' is reachable (under app-lead via app-docs sub-team),
    'Triage Operator' is reachable (direct child of Dev Lead).

Test skips with t.Skipf if the local-e2e fixture isn't present on the
host — won't block CI on hosts that haven't set it up. To set up locally:

  TESTROOT=/tmp/local-e2e-deploy
  mkdir -p $TESTROOT && cd $TESTROOT
  git clone https://git.moleculesai.app/molecule-ai/molecule-ai-org-template-molecule-dev.git molecule-dev
  git clone https://git.moleculesai.app/molecule-ai/molecule-dev-department.git
  cd /Users/<you>/molecule-core/workspace-server
  go test -v -run TestLocalE2E_DevDepartmentExtraction ./internal/handlers/

Verified locally 2026-05-08:
  --- PASS: TestLocalE2E_DevDepartmentExtraction (0.01s)
  total workspaces (recursive): 39

Refs:
  internal#77 — extraction RFC
  molecule-core PR #102 — symlink-resolution contract test
  molecule-ai/molecule-dev-department PRs #1, #2, #3 (scaffold + extract + atomize)
  molecule-ai/molecule-ai-org-template-molecule-dev PR #5 (parent slim + symlink wire)
  Hongming GO 2026-05-08 ('lets not go for staging right now, we do local test first')
  SOP Phase 4 (local) — task #226
2026-05-08 04:24:47 -07:00
claude-ceo-assistant 1bd80defab Merge pull request 'test(org-include): pin symlink-based subtree composition contract' (#102) from fix/org-import-subtree-symlink-test into staging
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 1s
Block internal-flavored paths / Block forbidden paths (push) Successful in 4s
CI / Detect changes (push) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 6s
E2E API Smoke Test / detect-changes (push) Successful in 9s
Handlers Postgres Integration / detect-changes (push) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 9s
Harness Replays / detect-changes (push) Successful in 9s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 8s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 4s
CI / Canvas (Next.js) (push) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 6s
CI / Canvas Deploy Reminder (push) Has been skipped
Harness Replays / Harness Replays (push) Successful in 1m9s
publish-workspace-server-image / build-and-push (push) Successful in 1m57s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m17s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 2m27s
CI / Platform (Go) (push) Successful in 4m36s
2026-05-08 11:23:05 +00:00
claude-ceo-assistant 78c4b9b74f test(org-include): pin symlink-based subtree composition contract
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 1s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 9s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 10s
Harness Replays / detect-changes (pull_request) Successful in 10s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 11s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 10s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 5s
CI / Canvas (Next.js) (pull_request) Successful in 7s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
Harness Replays / Harness Replays (pull_request) Failing after 45s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 52s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 1m19s
CI / Platform (Go) (pull_request) Successful in 2m23s
Two new tests in workspace-server/internal/handlers/org_include_test.go:

- TestResolveYAMLIncludes_FollowsDirectorySymlink: parent template's
  org.yaml `!include`s into a sibling-repo subtree via a relative
  directory symlink. The resolver's filepath.Abs/Rel security check
  operates on path strings (passes), and os.ReadFile follows the
  symlink at OS layer (file content delivered). Recursive nested
  `!include`s within the symlinked subtree resolve correctly because
  filepath.Dir(absTarget) keeps the literal symlink path as currentDir.

- TestResolveYAMLIncludes_RejectsSymlinkEscapingRoot: companion test
  pinning current behavior where a symlink target outside the parent
  root is followed (resolveInsideRoot doesn't EvalSymlinks). Asserted
  as 'should resolve' so future hardening (if filepath.EvalSymlinks
  is added) flips the test red and forces a coordinated update to the
  dev-department subtree-composition pattern.

Why now: internal#77 RFC (dev-department extraction) selects symlink-
based composition over a future platform-level external: ref. These
tests pin the contract before the operator-side symlink convention
gets shipped, so a refactor or hardening of the resolver can't
silently break the production org-import path.

No production code changes. Pure additive test coverage.

Refs: internal#77 (Phase 3b verification — task #223)
2026-05-07 20:42:38 -07:00
claude-ceo-assistant 8798b316f6 chore: promote accumulated staging fixes to main (vitest, postgres, e2e-api, eic, plugins) (#99)
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (push) Successful in 8s
Auto-sync main → staging / sync-staging (push) Successful in 12s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 14s
Harness Replays / detect-changes (push) Successful in 16s
auto-tag-runtime / tag (push) Successful in 22s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 48s
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 9s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 13s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 11s
Block internal-flavored paths / Block forbidden paths (push) Successful in 17s
CI / Detect changes (push) Successful in 19s
Handlers Postgres Integration / detect-changes (push) Successful in 20s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 16s
E2E API Smoke Test / detect-changes (push) Successful in 23s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 23s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 22s
Harness Replays / Harness Replays (push) Failing after 2m3s
publish-workspace-server-image / build-and-push (push) Successful in 3m53s
CI / Shellcheck (E2E scripts) (push) Successful in 10s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 12s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 4m19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5m56s
CI / Canvas (Next.js) (push) Successful in 6m8s
CI / Canvas Deploy Reminder (push) Failing after 2s
CI / Platform (Go) (push) Successful in 6m44s
CI / Python Lint & Test (push) Successful in 7m36s
SECRET_PATTERNS drift lint / Detect SECRET_PATTERNS drift (push) Successful in 34s
Auto-sync canary — AUTO_SYNC_TOKEN rotation drift / Verify AUTO_SYNC_TOKEN validity (push) Successful in 2s
Railway pin audit (drift detection) / Audit Railway env vars for drift-prone pins (push) Failing after 7s
Runtime Pin Compatibility / PyPI-latest install + import smoke (push) Successful in 1m24s
Canary — staging SaaS smoke (every 30 min) / Canary smoke (push) Failing after 4m26s
E2E API Smoke Test / E2E API Smoke Test (push) Has been cancelled
Brings staging tip b11044f8 to main. Includes #97 (vitest) + #98 (handlers-postgres) + #100 (e2e-api) + EIC race fix + #84 (SaaS plugin EIC) + accumulated staging commits. Triggers Vercel + Railway + ECR production deploy chain. Approved by security-auditor.
2026-05-08 02:46:39 +00:00
claude-ceo-assistant b11044f885 fix(plugins): SaaS (EC2-per-workspace) install/uninstall via EIC SSH (#84)
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 1s
Block internal-flavored paths / Block forbidden paths (push) Successful in 6s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 0s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 1s
CI / Detect changes (push) Successful in 9s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 8s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 10s
E2E API Smoke Test / detect-changes (push) Successful in 10s
Harness Replays / detect-changes (push) Successful in 10s
Handlers Postgres Integration / detect-changes (push) Successful in 10s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 3s
CI / Python Lint & Test (push) Successful in 5s
CI / Canvas (Next.js) (push) Successful in 5s
CI / Canvas Deploy Reminder (push) Has been skipped
Harness Replays / Harness Replays (push) Failing after 59s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m15s
publish-workspace-server-image / build-and-push (push) Successful in 2m4s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 1m54s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 2m1s
CI / Platform (Go) (push) Successful in 3m9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 4m45s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 7s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 7s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 8s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 13s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 15s
branch-protection drift check / Branch protection drift (pull_request) Successful in 16s
pr-guards / disable-auto-merge-on-push (pull_request) Successful in 11s
cascade-list-drift-gate / check (pull_request) Successful in 15s
CI / Detect changes (pull_request) Successful in 19s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 17s
E2E API Smoke Test / detect-changes (pull_request) Successful in 20s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 20s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 20s
Harness Replays / detect-changes (pull_request) Successful in 21s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 17s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 20s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 14s
Ops Scripts Tests / Ops scripts (unittest) (pull_request) Successful in 47s
Harness Replays / Harness Replays (pull_request) Failing after 1m6s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m29s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 2m0s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m24s
CI / Canvas (Next.js) (pull_request) Successful in 3m16s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Platform (Go) (pull_request) Successful in 3m52s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5m6s
CI / Python Lint & Test (pull_request) Successful in 7m0s
Closes docker-only row in backends.md. Approved by security-auditor.
2026-05-08 02:15:49 +00:00
22 changed files with 2559 additions and 15 deletions
+2 -2
View File
@@ -22,9 +22,9 @@ on:
# spending CI cycles. See e2e-api.yml for the rationale on why this
# is a single job rather than two-jobs-sharing-name.
push:
branches: [main, staging]
branches: [main]
pull_request:
branches: [main, staging]
branches: [main]
workflow_dispatch:
schedule:
# Weekly on Sunday 08:00 UTC — catches Chrome / Playwright / Next.js
+2 -2
View File
@@ -32,7 +32,7 @@ name: E2E Staging External Runtime
on:
push:
branches: [staging, main]
branches: [main]
paths:
- 'workspace-server/internal/handlers/workspace.go'
- 'workspace-server/internal/handlers/registry.go'
@@ -44,7 +44,7 @@ on:
- 'tests/e2e/test_staging_external_runtime.sh'
- '.github/workflows/e2e-staging-external.yml'
pull_request:
branches: [staging, main]
branches: [main]
paths:
- 'workspace-server/internal/handlers/workspace.go'
- 'workspace-server/internal/handlers/registry.go'
+6 -7
View File
@@ -20,13 +20,12 @@ name: E2E Staging SaaS (full lifecycle)
# via the same paths watcher that e2e-api.yml uses)
on:
# Fire on staging push too — previously this only ran on main, which
# meant the most thorough end-to-end test caught regressions AFTER
# they shipped to staging (and then to the auto-promote PR). Running
# on staging push catches them BEFORE the staging→main promotion
# opens, so a green canary into auto-promote is more meaningful.
# Trunk-based (Phase 3 of internal#81): main is the only branch.
# Previously this fired on staging push too because staging was a
# superset of main and ran the gate ahead of auto-promote; with no
# staging branch, main is where E2E gates the deploy.
push:
branches: [staging, main]
branches: [main]
paths:
- 'workspace-server/internal/handlers/registry.go'
- 'workspace-server/internal/handlers/workspace_provision.go'
@@ -36,7 +35,7 @@ on:
- 'tests/e2e/test_staging_full_saas.sh'
- '.github/workflows/e2e-staging-saas.yml'
pull_request:
branches: [staging, main]
branches: [main]
paths:
- 'workspace-server/internal/handlers/registry.go'
- 'workspace-server/internal/handlers/workspace_provision.go'
@@ -36,7 +36,7 @@ on:
workflow_run:
workflows: ['publish-workspace-server-image']
types: [completed]
branches: [staging]
branches: [main]
workflow_dispatch:
inputs:
target_tag:
+28
View File
@@ -0,0 +1,28 @@
# Top-level Makefile — convenience wrappers around docker compose.
#
# Most molecule-core dev work happens via these shortcuts. CI doesn't
# use this Makefile; CI calls docker compose / go test directly so the
# Makefile can evolve without breaking the build.
.PHONY: help dev up down logs build test
help: ## Show this help.
@grep -E '^[a-zA-Z_-]+:.*?## ' $(MAKEFILE_LIST) | awk 'BEGIN {FS = ":.*?## "}; {printf "\033[36m%-12s\033[0m %s\n", $$1, $$2}'
dev: ## Start the full stack with air hot-reload for the platform service.
docker compose -f docker-compose.yml -f docker-compose.dev.yml up
up: ## Start the full stack in production-shape mode (no air, normal Dockerfile).
docker compose up
down: ## Stop the stack and remove containers (volumes preserved).
docker compose down
logs: ## Tail logs from all services (Ctrl-C to detach).
docker compose logs -f
build: ## Force a fresh build of the platform image (no cache).
docker compose build --no-cache platform
test: ## Run Go unit tests in workspace-server/.
cd workspace-server && go test -race ./...
+43
View File
@@ -0,0 +1,43 @@
# docker-compose.dev.yml — overlay over docker-compose.yml for local dev
# with air-driven live reload of the platform (workspace-server) service.
#
# Usage:
# docker compose -f docker-compose.yml -f docker-compose.dev.yml up
# (or `make dev` shorthand from repo root)
#
# What this overlay changes vs docker-compose.yml alone:
# - Platform service uses workspace-server/Dockerfile.dev (air on top of
# golang:1.25-alpine) instead of the multi-stage prod Dockerfile.
# - Platform service bind-mounts the host's workspace-server/ source
# into /app/workspace-server so air sees source edits live.
# - Other services (postgres, redis, langfuse, etc.) inherit unchanged
# from docker-compose.yml.
#
# What stays the same:
# - All env vars, volumes, depends_on, healthchecks from docker-compose.yml.
# - Network topology + ports.
# - Postgres/Redis as service containers (no in-process replacements).
services:
platform:
build:
context: .
dockerfile: workspace-server/Dockerfile.dev
# Rebind source: edits under host's workspace-server/ propagate live.
# The named volume on go-build-cache speeds up first build per container.
volumes:
- ./workspace-server:/app/workspace-server
- go-build-cache:/root/.cache/go-build
- go-mod-cache:/go/pkg/mod
# Air signals the running binary on rebuild; ensure shell stops cleanly.
init: true
# Mark the service as dev-mode so the platform can short-circuit any
# behavior that's incompatible with hot-reload (e.g. background
# cron-style watchers that don't survive process restart). No-op
# today; reserved for future flag use.
environment:
MOLECULE_DEV_HOT_RELOAD: "1"
volumes:
go-build-cache:
go-mod-cache:
+49
View File
@@ -0,0 +1,49 @@
# air.toml — live-reload config for local docker-compose dev mode.
#
# Active when the platform service runs from workspace-server/Dockerfile.dev
# (selected via docker-compose.dev.yml overlay). In production, the regular
# Dockerfile builds a static binary; air is dev-only.
#
# Reference: https://github.com/air-verse/air
root = "."
testdata_dir = "testdata"
tmp_dir = "tmp"
[build]
# Same build invocation as Dockerfile's builder stage minus the
# CGO_ENABLED=0 toggle (CGO ok in dev for richer race detector output).
cmd = "go build -o ./tmp/server ./cmd/server"
bin = "tmp/server"
full_bin = ""
args_bin = []
# Watch every .go and .yaml file under workspace-server/.
include_ext = ["go", "yaml", "tmpl"]
# Don't watch tests, build artifacts, vendored deps, or migration .sql
# (migrations need a clean DB anyway — handled by docker-compose down/up).
exclude_dir = ["assets", "tmp", "vendor", "testdata", "node_modules"]
exclude_file = []
# _test.go and *_mock.go shouldn't trigger a rebuild — saves cycles.
exclude_regex = ["_test\\.go$", "_mock\\.go$"]
exclude_unchanged = true
follow_symlink = false
log = "build-errors.log"
# Kill running binary 1s before starting new one.
kill_delay = "1s"
send_interrupt = true
stop_on_error = true
# Debounce: wait this long after last change before triggering rebuild.
delay = 500
[log]
time = false
[color]
main = "magenta"
watcher = "cyan"
build = "yellow"
runner = "green"
[misc]
# Don't keep the tmp/ dir around between runs.
clean_on_exit = true
+38
View File
@@ -0,0 +1,38 @@
# Dockerfile.dev — local-development image with air-driven live reload.
#
# Selected by docker-compose.dev.yml (overlay over docker-compose.yml).
# Production stays on workspace-server/Dockerfile (static binary, no air).
#
# Workflow:
# 1. docker compose -f docker-compose.yml -f docker-compose.dev.yml up
# 2. Edit any .go file under workspace-server/
# 3. air detects, rebuilds, kills old binary, starts new one (~3-5s)
# 4. No `docker compose up --build` needed
#
# Templates + plugins are NOT pre-cloned here — air-mode assumes the
# developer's filesystem has the workspace-configs-templates/ + plugins/
# dirs available, mounted at runtime via docker-compose.dev.yml.
FROM golang:1.25-alpine
# air + git (for go mod) + ca-certs (for TLS) + tzdata (for time-zone DB).
RUN apk add --no-cache git ca-certificates tzdata wget \
&& go install github.com/air-verse/air@latest
WORKDIR /app/workspace-server
# Pre-fetch deps so the first `air` rebuild on a fresh container is fast.
# These are bind-mount-overridden at runtime, so the COPY here is just
# to warm the module cache.
COPY workspace-server/go.mod workspace-server/go.sum ./
RUN go mod download
# Source is bind-mounted at runtime (see docker-compose.dev.yml volumes
# block) so the Dockerfile doesn't need to COPY it. air watches the
# bind-mounted dir for changes.
ENV CGO_ENABLED=1
ENV GOFLAGS="-buildvcs=false"
# Run air with the .air.toml in the bind-mounted source dir.
CMD ["air", "-c", ".air.toml"]
@@ -0,0 +1,375 @@
package handlers
import (
"archive/tar"
"bytes"
"net"
"os"
"os/exec"
"path/filepath"
"strings"
"testing"
"time"
"gopkg.in/yaml.v3"
)
// Local E2E for the dev-department extraction (RFC internal#77).
//
// Pre-conditions: both repos cloned as siblings under
// /tmp/local-e2e-deploy/{molecule-dev, molecule-dev-department}.
// (Set up by the orchestrator before running this test.)
//
// What this proves end-to-end through real platform code:
// 1. resolveYAMLIncludes follows the dev-lead symlink at the parent's
// template root and pulls in the dev-department subtree.
// 2. Recursive !include's inside the symlinked subtree resolve
// correctly via the chain dev-lead/workspace.yaml →
// ./core-lead/workspace.yaml → ./core-be/workspace.yaml etc.
// 3. The resolved YAML unmarshals into a complete OrgTemplate with the
// expected count of workspaces (parent's PM+Marketing+Research +
// dev-department's atomized 28 workspaces).
//
// Skipped if the local-e2e-deploy fixture isn't present — won't block
// CI on hosts that haven't set it up.
func TestLocalE2E_DevDepartmentExtraction(t *testing.T) {
parent := "/tmp/local-e2e-deploy/molecule-dev"
if _, err := os.Stat(filepath.Join(parent, "org.yaml")); err != nil {
t.Skipf("local-e2e fixture not present at %s: %v", parent, err)
}
orgYAML, err := os.ReadFile(filepath.Join(parent, "org.yaml"))
if err != nil {
t.Fatalf("read org.yaml: %v", err)
}
expanded, err := resolveYAMLIncludes(orgYAML, parent)
if err != nil {
t.Fatalf("resolveYAMLIncludes failed: %v", err)
}
var tmpl OrgTemplate
if err := yaml.Unmarshal(expanded, &tmpl); err != nil {
t.Fatalf("unmarshal expanded OrgTemplate: %v", err)
}
// Walk the full workspace tree, collect names.
names := []string{}
var walk func([]OrgWorkspace)
walk = func(ws []OrgWorkspace) {
for _, w := range ws {
names = append(names, w.Name)
walk(w.Children)
}
}
walk(tmpl.Workspaces)
t.Logf("org name: %q", tmpl.Name)
t.Logf("total workspaces (recursive): %d", len(names))
for _, n := range names {
t.Logf(" - %q", n)
}
// Expected: PM + Marketing Lead + Dev Lead at top level, plus the
// full sub-trees under each. After atomization, we expect:
// - PM tree: PM + Research Lead + 3 research roles = 5
// - Marketing tree: Marketing Lead + 5 marketing roles = 6
// - Dev Lead tree: Dev Lead + (5 sub-team leads × ~6 each) +
// 3 floaters + Triage Operator = ~32
// Roughly ~43 total. Be liberal; just assert a floor.
if len(names) < 30 {
t.Errorf("workspace count too low (%d) — expected ~40+ (PM+Marketing+Dev tree)", len(names))
}
// Specific sentinel names we expect to find:
expected := []string{
"PM",
"Marketing Lead",
"Dev Lead",
"Core Platform Lead",
"Controlplane Lead",
"App & Docs Lead",
"Infra Lead",
"SDK Lead",
"Documentation Specialist", // Q1 — should be under app-lead
"Triage Operator", // Q2 — should be under dev-lead
}
found := map[string]bool{}
for _, n := range names {
found[n] = true
}
for _, want := range expected {
if !found[want] {
t.Errorf("missing expected workspace %q", want)
}
}
}
// Stage-2 of the local e2e: prove every resolved workspace's `files_dir`
// path actually consumes correctly through the rest of the import chain.
// resolveYAMLIncludes returning a populated OrgTemplate is necessary but
// not sufficient — `POST /org/import` then does:
//
// 1. resolveInsideRoot(orgBaseDir, ws.FilesDir) → must return a path
// that exists and stat-resolves to a directory (org_import.go:313-317).
// 2. CopyTemplateToContainer(ctx, containerID, templatePath) → walks
// the dir with filepath.Walk and tars its contents into the
// workspace's /configs/ mount (provisioner.go:766-820).
//
// This stage-2 test exercises both #1 and #2 against every workspace in
// the resolved tree, mimicking what the platform does post-include-
// resolution. Catches: files_dir paths that don't resolve through the
// symlink, paths that exist but are empty (silently produces empty
// /configs/), or filepath.Walk failing to descend through cross-repo
// symlink boundaries.
func TestLocalE2E_FilesDirConsumption(t *testing.T) {
parent := "/tmp/local-e2e-deploy/molecule-dev"
if _, err := os.Stat(filepath.Join(parent, "org.yaml")); err != nil {
t.Skipf("local-e2e fixture not present at %s: %v", parent, err)
}
orgYAML, err := os.ReadFile(filepath.Join(parent, "org.yaml"))
if err != nil {
t.Fatalf("read org.yaml: %v", err)
}
expanded, err := resolveYAMLIncludes(orgYAML, parent)
if err != nil {
t.Fatalf("resolveYAMLIncludes: %v", err)
}
var tmpl OrgTemplate
if err := yaml.Unmarshal(expanded, &tmpl); err != nil {
t.Fatalf("unmarshal: %v", err)
}
// Flatten every workspace — including children, grandchildren, etc.
flat := []OrgWorkspace{}
var walk func([]OrgWorkspace)
walk = func(ws []OrgWorkspace) {
for _, w := range ws {
flat = append(flat, w)
walk(w.Children)
}
}
walk(tmpl.Workspaces)
checked := 0
for _, w := range flat {
if w.FilesDir == "" {
continue // workspace declared inline (no files_dir) — skip
}
checked++
t.Run(w.Name+"/"+w.FilesDir, func(t *testing.T) {
// Step 1: resolveInsideRoot returns a path that's-inside-root.
abs, err := resolveInsideRoot(parent, w.FilesDir)
if err != nil {
t.Fatalf("resolveInsideRoot(%q, %q): %v", parent, w.FilesDir, err)
}
info, err := os.Stat(abs)
if err != nil {
t.Fatalf("stat %q (resolved from files_dir %q): %v", abs, w.FilesDir, err)
}
if !info.IsDir() {
t.Fatalf("files_dir %q resolved to %q which is not a directory", w.FilesDir, abs)
}
// Step 2: walk the dir like CopyTemplateToContainer does.
// Mirror the platform's symlink-resolution at the root —
// filepath.Walk doesn't descend into a symlink leaf, so
// CopyTemplateToContainer (provisioner.go) calls
// EvalSymlinks on templatePath first. Replicate exactly.
if resolved, err := filepath.EvalSymlinks(abs); err == nil {
abs = resolved
}
var buf bytes.Buffer
tw := tar.NewWriter(&buf)
fileCount := 0
fileNames := []string{}
err = filepath.Walk(abs, func(path string, info os.FileInfo, err error) error {
if err != nil {
return err
}
rel, err := filepath.Rel(abs, path)
if err != nil {
return err
}
if rel == "." {
return nil
}
header, _ := tar.FileInfoHeader(info, "")
header.Name = rel
if err := tw.WriteHeader(header); err != nil {
return err
}
if !info.IsDir() {
fileCount++
fileNames = append(fileNames, rel)
data, err := os.ReadFile(path)
if err != nil {
return err
}
header.Size = int64(len(data))
tw.Write(data)
}
return nil
})
if err != nil {
t.Fatalf("filepath.Walk %q (mimics CopyTemplateToContainer): %v", abs, err)
}
tw.Close()
if fileCount == 0 {
t.Errorf("files_dir %q at %q is empty — CopyTemplateToContainer would produce empty /configs/",
w.FilesDir, abs)
}
// Sanity: every workspace folder should have AT LEAST one of
// {workspace.yaml, system-prompt.md, initial-prompt.md} —
// these are the markers a workspace folder is recognizable
// as a workspace (mirrors validator's WORKSPACE_FOLDER_MARKERS).
markers := []string{"workspace.yaml", "system-prompt.md", "initial-prompt.md"}
hasMarker := false
for _, name := range fileNames {
for _, m := range markers {
if name == m || strings.HasSuffix(name, "/"+m) {
hasMarker = true
break
}
}
if hasMarker {
break
}
}
if !hasMarker {
t.Errorf("files_dir %q at %q has %d files but none of the workspace markers %v — found: %v",
w.FilesDir, abs, fileCount, markers, fileNames)
}
})
}
t.Logf("checked %d workspaces with files_dir", checked)
if checked < 25 {
t.Errorf("expected ~28 workspaces with files_dir (post-atomization); only saw %d", checked)
}
}
// PR-C from the Phase 3a phasing (task #234): real-Gitea e2e for the
// !external resolver against the LIVE molecule-ai/molecule-dev-department
// repo. Verifies the production gitFetcher fetches the dev tree and the
// resolver grafts it correctly into a parent template that has NO
// symlink — composition is purely platform-side.
//
// Skipped if Gitea isn't reachable (offline / firewall / CI without
// network). Requires `git` binary on PATH.
func TestLocalE2E_ExternalDevDepartment(t *testing.T) {
if _, err := exec.LookPath("git"); err != nil {
t.Skipf("git binary not found: %v", err)
}
// Skip if Gitea host isn't reachable (TCP probe). Avoids network-
// dependent tests failing on offline runners.
conn, err := net.DialTimeout("tcp", "git.moleculesai.app:443", 3*time.Second)
if err != nil {
t.Skipf("git.moleculesai.app:443 unreachable: %v", err)
}
conn.Close()
// Build a minimal parent template inline — no need for the
// /tmp/local-e2e-deploy/ symlinked fixture. The whole point of
// !external is that the parent template is self-contained;
// composition resolves over the network at import time.
parent := t.TempDir()
orgYAML := []byte(`name: External-Only Test Parent
description: Parent template that pulls the entire dev tree via !external.
defaults:
runtime: claude-code
tier: 2
workspaces:
- !external
repo: molecule-ai/molecule-dev-department
ref: main
path: dev-lead/workspace.yaml
`)
if err := os.WriteFile(filepath.Join(parent, "org.yaml"), orgYAML, 0o644); err != nil {
t.Fatalf("write org.yaml: %v", err)
}
out, err := resolveYAMLIncludes(orgYAML, parent)
if err != nil {
t.Fatalf("resolveYAMLIncludes (!external against live Gitea): %v", err)
}
var tmpl OrgTemplate
if err := yaml.Unmarshal(out, &tmpl); err != nil {
t.Fatalf("unmarshal: %v", err)
}
// Walk the workspace tree, collect names + check files_dir paths.
flat := []OrgWorkspace{}
var walk func([]OrgWorkspace)
walk = func(ws []OrgWorkspace) {
for _, w := range ws {
flat = append(flat, w)
walk(w.Children)
}
}
walk(tmpl.Workspaces)
t.Logf("workspaces resolved through !external: %d", len(flat))
if len(flat) < 25 {
t.Errorf("expected ~28 dev-tree workspaces via !external; got %d", len(flat))
}
// Sentinel checks — same as TestLocalE2E_DevDepartmentExtraction
// (Q1+Q2 placements verified).
expected := []string{
"Dev Lead",
"Core Platform Lead",
"Controlplane Lead",
"App & Docs Lead",
"Documentation Specialist", // Q1
"Triage Operator", // Q2
}
found := map[string]bool{}
for _, w := range flat {
found[w.Name] = true
}
for _, want := range expected {
if !found[want] {
t.Errorf("missing expected workspace %q", want)
}
}
// Every workspace's files_dir must be cache-prefixed (proves the
// path-rewrite ran end-to-end).
cachePrefix := ".external-cache"
for _, w := range flat {
if w.FilesDir == "" {
continue
}
if !strings.HasPrefix(w.FilesDir, cachePrefix) {
t.Errorf("workspace %q files_dir %q missing cache prefix %q", w.Name, w.FilesDir, cachePrefix)
}
}
// Verify the fetched cache exists and resolveInsideRoot accepts
// every workspace's files_dir (would cause provisioning to fail
// if not).
for _, w := range flat {
if w.FilesDir == "" {
continue
}
abs, err := resolveInsideRoot(parent, w.FilesDir)
if err != nil {
t.Errorf("workspace %q files_dir %q: resolveInsideRoot: %v", w.Name, w.FilesDir, err)
continue
}
info, err := os.Stat(abs)
if err != nil {
t.Errorf("workspace %q: stat %q: %v", w.Name, abs, err)
continue
}
if !info.IsDir() {
t.Errorf("workspace %q files_dir %q is not a directory", w.Name, w.FilesDir)
}
}
}
@@ -0,0 +1,439 @@
package handlers
import (
"context"
"fmt"
"net/url"
"os"
"os/exec"
"path/filepath"
"regexp"
"strings"
"time"
"gopkg.in/yaml.v3"
)
// External-ref resolver — gitops-style cross-repo subtree composition.
// Internal#77 RFC, Phase 3a (task #222). Prior art: Helm subcharts +
// dependency cache, Kustomize remote bases, Terraform module sources.
//
// Schema (a `!external`-tagged mapping anywhere a workspace entry is
// allowed — workspaces:, roots:, children:):
//
// - !external
// repo: molecule-ai/molecule-dev-department
// ref: main
// path: dev-lead/workspace.yaml
//
// At resolve time, the platform fetches the repo at ref into a content-
// addressable cache under <rootDir>/.external-cache/<repo>/<sha>/, loads
// the yaml at <cacheDir>/<path>, rewrites every files_dir + relative
// !include path to be cache-prefixed, then grafts the result in place of
// the !external node. Downstream pipeline (resolveInsideRoot, plugin
// merge, CopyTemplateToContainer) sees ordinary in-tree paths.
// ExternalRef is the deserialized form of an `!external`-tagged mapping.
type ExternalRef struct {
Repo string `yaml:"repo"`
Ref string `yaml:"ref"`
Path string `yaml:"path"`
// URL overrides the default Gitea host. Optional; defaults to
// MOLECULE_EXTERNAL_GITEA_URL env or git.moleculesai.app.
URL string `yaml:"url,omitempty"`
}
const (
// maxExternalDepth caps recursion through nested `!external`s. Lower
// than maxIncludeDepth (16) because each level may issue a network
// fetch. Composition that genuinely needs >4 layers is a smell.
maxExternalDepth = 4
// externalCacheDirName is the per-template cache subdir under rootDir.
// Content-addressable: keyed by (repo, sha). Operators add this to
// .gitignore — cache is platform-mutated, not source-tracked.
externalCacheDirName = ".external-cache"
// gitFetchTimeout caps a single clone operation. Conservative —
// org template fetches are typically <100KB.
gitFetchTimeout = 60 * time.Second
)
// safeRefPattern restricts `ref` values to characters git itself accepts
// for branch / tag / SHA. Belt-and-braces over git's own validation.
var safeRefPattern = regexp.MustCompile(`^[a-zA-Z0-9_./-]+$`)
// allowlistedHostPath returns true if `<host>/<repo>` matches the
// configured allowlist. Default allowlist: git.moleculesai.app/molecule-ai/.
// Override via MOLECULE_EXTERNAL_REPO_ALLOWLIST env var (comma-separated
// patterns). Patterns are matched as prefixes (with trailing-slash
// semantics) or as exact matches. Trailing /* is treated as "any
// descendants of this prefix".
//
// Examples:
// - "git.moleculesai.app/molecule-ai/" → matches molecule-ai/* (any repo)
// - "git.moleculesai.app/molecule-ai/*" → same; trailing /* normalized to /
// - "git.moleculesai.app/molecule-ai/molecule-dev-department" → exact
// - "git.moleculesai.app/" → matches everything on that host
func allowlistedHostPath(host, repoPath string) bool {
allow := os.Getenv("MOLECULE_EXTERNAL_REPO_ALLOWLIST")
if allow == "" {
allow = "git.moleculesai.app/molecule-ai/"
}
hp := host + "/" + repoPath
for _, pat := range strings.Split(allow, ",") {
pat = strings.TrimSpace(pat)
if pat == "" {
continue
}
// Normalize trailing /* → /
pat = strings.TrimSuffix(pat, "*")
if pat == hp {
return true
}
if strings.HasSuffix(pat, "/") && strings.HasPrefix(hp+"/", pat) {
return true
}
}
return false
}
// externalFetcher abstracts the git-clone-into-cache step. Production
// uses gitFetcher (shells out to git); tests inject a fake that
// pre-stages content in a temp dir.
type externalFetcher interface {
// Fetch ensures rootDir/.external-cache/<safe-repo>/<sha>/ contains
// the repo content at the given ref. Returns the absolute cache
// dir + the resolved SHA. Cache hit = no network. Cache miss =
// clone.
Fetch(ctx context.Context, rootDir, host, repoPath, ref string) (cacheDir, sha string, err error)
}
// defaultExternalFetcher is the package-level fetcher injection point.
// Production code uses the git-shell fetcher; tests override via
// SetExternalFetcherForTest.
var defaultExternalFetcher externalFetcher = &gitFetcher{}
// SetExternalFetcherForTest swaps the fetcher for testing. Returns a
// cleanup func that restores the previous fetcher.
func SetExternalFetcherForTest(f externalFetcher) func() {
prev := defaultExternalFetcher
defaultExternalFetcher = f
return func() { defaultExternalFetcher = prev }
}
// resolveExternalMapping replaces an `!external`-tagged mapping node
// with the loaded + path-rewritten yaml content from the fetched repo.
//
// `currentDir` and `rootDir` are inherited from expandNode's resolve
// frame. `visited` tracks (repo, sha, path) tuples for cycle detection
// across nested externals.
func resolveExternalMapping(n *yaml.Node, currentDir, rootDir string, visited map[string]bool, depth int) error {
if depth > maxExternalDepth {
return fmt.Errorf("!external: max depth %d exceeded (possible cycle)", maxExternalDepth)
}
if rootDir == "" {
return fmt.Errorf("!external at line %d requires a dir-based org template (no rootDir in inline-template mode)", n.Line)
}
var ref ExternalRef
if err := n.Decode(&ref); err != nil {
return fmt.Errorf("!external at line %d: decode: %w", n.Line, err)
}
if ref.Repo == "" || ref.Ref == "" || ref.Path == "" {
return fmt.Errorf("!external at line %d: repo, ref, path are all required (got %+v)", n.Line, ref)
}
if !safeRefPattern.MatchString(ref.Ref) {
return fmt.Errorf("!external at line %d: ref %q contains disallowed characters", n.Line, ref.Ref)
}
// Defense-in-depth: even though git itself rejects refs containing
// `..`, the regex above currently allows them. Reject explicitly.
if strings.Contains(ref.Ref, "..") {
return fmt.Errorf("!external at line %d: ref %q must not contain '..'", n.Line, ref.Ref)
}
if strings.Contains(ref.Path, "..") || strings.HasPrefix(ref.Path, "/") {
return fmt.Errorf("!external at line %d: path %q must be relative-and-down-only", n.Line, ref.Path)
}
host := ref.URL
if host == "" {
host = os.Getenv("MOLECULE_EXTERNAL_GITEA_URL")
}
if host == "" {
host = "git.moleculesai.app"
}
host = strings.TrimPrefix(strings.TrimPrefix(host, "https://"), "http://")
host = strings.TrimSuffix(host, "/")
if !allowlistedHostPath(host, ref.Repo) {
return fmt.Errorf("!external at line %d: %s/%s not in MOLECULE_EXTERNAL_REPO_ALLOWLIST", n.Line, host, ref.Repo)
}
ctx, cancel := context.WithTimeout(context.Background(), gitFetchTimeout)
defer cancel()
cacheDir, sha, err := defaultExternalFetcher.Fetch(ctx, rootDir, host, ref.Repo, ref.Ref)
if err != nil {
return fmt.Errorf("!external at line %d: fetch %s/%s@%s: %w", n.Line, host, ref.Repo, ref.Ref, err)
}
// Cycle key: (repo, sha, path) — same external content reachable
// via two paths is fine, but a self-referential cycle isn't.
cycleKey := fmt.Sprintf("%s/%s@%s/%s", host, ref.Repo, sha, ref.Path)
if visited[cycleKey] {
return fmt.Errorf("!external cycle detected at %q (line %d)", cycleKey, n.Line)
}
// Validate path resolves inside the cache dir (anti-traversal).
yamlPathAbs, err := resolveInsideRoot(cacheDir, ref.Path)
if err != nil {
return fmt.Errorf("!external at line %d: path %q: %w", n.Line, ref.Path, err)
}
if _, err := os.Stat(yamlPathAbs); err != nil {
return fmt.Errorf("!external at line %d: %s/%s@%s does not contain %q: %w", n.Line, host, ref.Repo, sha, ref.Path, err)
}
data, err := os.ReadFile(yamlPathAbs)
if err != nil {
return fmt.Errorf("!external at line %d: read %q: %w", n.Line, yamlPathAbs, err)
}
var sub yaml.Node
if err := yaml.Unmarshal(data, &sub); err != nil {
return fmt.Errorf("!external at line %d: parse %q: %w", n.Line, yamlPathAbs, err)
}
root := &sub
if root.Kind == yaml.DocumentNode && len(root.Content) == 1 {
root = root.Content[0]
}
// Recurse FIRST: load all nested !include / !external content into
// the tree. Then rewrite ALL files_dir scalars in the fully-resolved
// tree (top + nested) with the cache prefix in one pass. Doing
// rewrite-before-recurse would leave nested-loaded files_dir paths
// unprefixed.
visited[cycleKey] = true
defer delete(visited, cycleKey)
subDir := filepath.Dir(yamlPathAbs)
if err := expandNode(root, subDir, rootDir, visited, depth+1); err != nil {
return err
}
// Path rewrite: prefix every files_dir scalar in the fully-resolved
// content with the cache-relative-from-rootDir prefix. After this
// pass, fetched workspaces look like ordinary in-tree workspaces.
cachePrefix, err := filepath.Rel(rootDir, cacheDir)
if err != nil {
return fmt.Errorf("!external at line %d: cannot compute cache prefix: %w", n.Line, err)
}
rewriteFilesDir(root, cachePrefix)
// Replace the !external mapping with the resolved content in-place.
*n = *root
if n.Tag == "!external" {
n.Tag = ""
}
return nil
}
// rewriteFilesDir walks the yaml node tree and prepends cachePrefix to
// every files_dir scalar value. Idempotent: if a files_dir value already
// starts with the prefix, no-op.
//
// !include paths are intentionally NOT rewritten. They resolve relative
// to their containing file's directory (subDir in expandNode), and after
// fetch that directory IS inside the cache, so relative !include paths
// Just Work without any rewrite. Rewriting them would double-prefix on
// recursive resolution.
//
// files_dir DOES need rewriting because it's consumed at workspace-
// provisioning time relative to orgBaseDir (the parent template's root),
// not relative to the workspace.yaml's containing dir.
func rewriteFilesDir(n *yaml.Node, cachePrefix string) {
if n == nil {
return
}
if n.Kind == yaml.MappingNode {
for i := 0; i+1 < len(n.Content); i += 2 {
key, value := n.Content[i], n.Content[i+1]
if key.Kind == yaml.ScalarNode && key.Value == "files_dir" && value.Kind == yaml.ScalarNode {
if !strings.HasPrefix(value.Value, cachePrefix+string(filepath.Separator)) && value.Value != cachePrefix {
value.Value = filepath.Join(cachePrefix, value.Value)
}
}
}
}
for _, child := range n.Content {
rewriteFilesDir(child, cachePrefix)
}
}
// safeRepoCacheDir converts a repo path like "molecule-ai/foo" into a
// filesystem-safe segment "molecule-ai__foo". Avoids nesting cache dirs
// (which would complicate cleanup).
func safeRepoCacheDir(host, repoPath string) string {
hp := host + "/" + repoPath
hp = strings.ReplaceAll(hp, "/", "__")
hp = strings.ReplaceAll(hp, ":", "_")
return hp
}
// gitFetcher is the production externalFetcher: shells out to `git` to
// clone the repo at ref into the cache dir. Cache key includes the
// resolved SHA, so different SHAs of the same ref get different cache
// dirs (no overwrite).
//
// Token handling — important for security. The auth token never enters
// the clone URL (and therefore never lands in the cloned repo's
// .git/config) and never appears in returned errors. We use git's
// `http.extraHeader` config option (passed via `-c`), which sends an
// Authorization header per-request without persisting it. The token is
// briefly visible in the `git` process's argv (so other local users
// with the same uid could see it via `ps`), which is the same exposure
// it has via the env var that supplied it.
//
// Cache validity uses a `.complete` marker written after a successful
// clone+rename. Cache-hit checks for the marker, not just the dir
// existence — a partially-written cache (clone failed mid-way, or a
// concurrent caller wrote a half-baked cache dir) is treated as cache
// miss and re-fetched cleanly.
type gitFetcher struct{}
// cacheCompleteMarker is the filename written after a successful clone.
// Cache-hit requires this marker; without it, the cache dir is treated
// as partially-written and re-fetched.
const cacheCompleteMarker = ".complete"
// Fetch resolves ref → SHA via `git ls-remote`, then `git clone --depth=1`
// if the cache dir is missing or incomplete. Auth via MOLECULE_GITEA_TOKEN
// injected via http.extraHeader (never via URL).
func (g *gitFetcher) Fetch(ctx context.Context, rootDir, host, repoPath, ref string) (string, string, error) {
cacheRoot := filepath.Join(rootDir, externalCacheDirName, safeRepoCacheDir(host, repoPath))
if err := os.MkdirAll(cacheRoot, 0o755); err != nil {
return "", "", fmt.Errorf("mkdir cache root: %w", err)
}
cloneURL := buildExternalCloneURL(host, repoPath)
gitArgs := func(extra ...string) []string {
args := authConfigArgs()
return append(args, extra...)
}
// 1. Resolve ref → SHA (so cache dir is content-addressable).
sha, err := g.resolveRefToSHA(ctx, cloneURL, ref, gitArgs)
if err != nil {
return "", "", fmt.Errorf("ls-remote: %s", redactToken(err.Error()))
}
cacheDir := filepath.Join(cacheRoot, sha)
// Cache-hit requires the .complete marker AND the .git dir.
// Without the marker, cache is partially-written → treat as miss.
if isCacheComplete(cacheDir) {
return cacheDir, sha, nil
}
// Cache miss or partially-written — clean any stale cacheDir before
// cloning (a previous broken attempt would otherwise block rename).
os.RemoveAll(cacheDir)
// 2. Clone into a sibling tmp dir; atomic rename on success.
tmpDir, err := os.MkdirTemp(cacheRoot, sha+".tmp.")
if err != nil {
return "", "", fmt.Errorf("mkdir tmp: %w", err)
}
// MkdirTemp creates the dir; git clone refuses to clone into a
// non-empty dir. Remove + recreate empty.
os.RemoveAll(tmpDir)
cloneAndConfig := append(gitArgs("clone", "--quiet", "--depth=1", "-b", ref, cloneURL, tmpDir))
cmd := exec.CommandContext(ctx, "git", cloneAndConfig...)
cmd.Env = append(os.Environ(), "GIT_TERMINAL_PROMPT=0")
if out, err := cmd.CombinedOutput(); err != nil {
os.RemoveAll(tmpDir)
return "", "", fmt.Errorf("git clone: %w: %s", err, redactToken(strings.TrimSpace(string(out))))
}
// Write the .complete marker BEFORE the rename. If rename succeeds,
// the marker is in place. If rename loses the race (concurrent
// fetcher won), our tmp gets cleaned up and we trust the winner.
if err := os.WriteFile(filepath.Join(tmpDir, cacheCompleteMarker), []byte(time.Now().UTC().Format(time.RFC3339)), 0o644); err != nil {
os.RemoveAll(tmpDir)
return "", "", fmt.Errorf("write complete marker: %w", err)
}
if err := os.Rename(tmpDir, cacheDir); err != nil {
// Race: another import beat us. Validate THEIR cache, accept it.
os.RemoveAll(tmpDir)
if isCacheComplete(cacheDir) {
return cacheDir, sha, nil
}
return "", "", fmt.Errorf("rename clone to cache (and winner's cache is incomplete): %w", err)
}
return cacheDir, sha, nil
}
// isCacheComplete reports whether cacheDir contains both the cloned
// repo (.git) and the .complete marker. Treats partial state as miss.
func isCacheComplete(cacheDir string) bool {
if _, err := os.Stat(filepath.Join(cacheDir, ".git")); err != nil {
return false
}
if _, err := os.Stat(filepath.Join(cacheDir, cacheCompleteMarker)); err != nil {
return false
}
return true
}
func (g *gitFetcher) resolveRefToSHA(ctx context.Context, cloneURL, ref string, gitArgs func(...string) []string) (string, error) {
args := gitArgs("ls-remote", cloneURL, ref)
cmd := exec.CommandContext(ctx, "git", args...)
cmd.Env = append(os.Environ(), "GIT_TERMINAL_PROMPT=0")
out, err := cmd.Output()
if err != nil {
return "", err
}
line := strings.TrimSpace(string(out))
if line == "" {
return "", fmt.Errorf("ref %q not found", ref)
}
// First whitespace-separated field is the SHA.
for i, ch := range line {
if ch == ' ' || ch == '\t' {
return line[:i], nil
}
}
return line, nil
}
// buildExternalCloneURL constructs the clone URL WITHOUT auth in userinfo.
// Auth is layered on via authConfigArgs's http.extraHeader.
func buildExternalCloneURL(host, repoPath string) string {
u := url.URL{Scheme: "https", Host: host, Path: "/" + repoPath + ".git"}
return u.String()
}
// authConfigArgs returns the `-c http.extraHeader=Authorization: token X`
// args to pass to git, OR an empty slice if no token is set. The token
// goes into the request headers (not the URL or .git/config), so it
// doesn't persist on disk and doesn't appear in clone error output.
func authConfigArgs() []string {
token := os.Getenv("MOLECULE_GITEA_TOKEN")
if token == "" {
return nil
}
return []string{"-c", "http.extraHeader=Authorization: token " + token}
}
// redactToken scrubs the auth token from a string before it's logged
// or returned in an error. Belt-and-braces: with the http.extraHeader
// approach the token shouldn't appear in git's output, but if some
// future git version or libcurl debug mode emits it, this catches it.
func redactToken(s string) string {
token := os.Getenv("MOLECULE_GITEA_TOKEN")
if token == "" || len(token) < 8 {
return s
}
return strings.ReplaceAll(s, token, "<redacted-token>")
}
@@ -0,0 +1,379 @@
package handlers
import (
"context"
"os"
"os/exec"
"path/filepath"
"runtime"
"strings"
"testing"
"gopkg.in/yaml.v3"
)
// PR-B integration test: exercises the REAL gitFetcher (no fakeFetcher
// injection) against a local bare-git repo. Uses git's `insteadOf`
// config to rewrite the configured Gitea URL to the local bare path
// at clone time, so the fetcher's URL-building, ls-remote, clone,
// atomic-rename, and cache-hit paths all run against real git
// without requiring network or modifying production code.
//
// Internal#77 task #233 (PR-B from the design's phasing).
// TestGitFetcher_RealClone_LocalRedirect proves the production
// gitFetcher round-trips correctly against a real git repository.
// Steps:
// 1. Set up a local bare-git repo with workspace content.
// 2. Configure git's `insteadOf` to rewrite the gitea URL → local path
// via GIT_CONFIG_COUNT/KEY/VALUE env vars (process-scoped).
// 3. Run resolveYAMLIncludes with !external pointing at the gitea URL.
// 4. Assert: cache dir populated; content materialized; path rewrite
// applied; second invocation hits cache (no second clone).
func TestGitFetcher_RealClone_LocalRedirect(t *testing.T) {
if _, err := exec.LookPath("git"); err != nil {
t.Skipf("git binary not found: %v", err)
}
if runtime.GOOS == "windows" {
t.Skip("path-based git URLs behave differently on Windows; skipping")
}
// Step 1: create a local bare-git repo at <fixtures>/test-dev-dept.git
// with workspace content. Use a working clone to add content, then
// push to the bare.
fixtures := t.TempDir()
barePath := filepath.Join(fixtures, "test-dev-dept.git")
workPath := filepath.Join(fixtures, "work")
mustGit(t, "", "init", "--bare", "-b", "main", barePath)
mustGit(t, "", "clone", barePath, workPath)
mustGit(t, workPath, "config", "user.email", "test@example.com")
mustGit(t, workPath, "config", "user.name", "Integration Test")
mustWriteFile(t, filepath.Join(workPath, "dev-lead/workspace.yaml"), `name: Dev Lead
files_dir: dev-lead
children:
- !include ./core-be/workspace.yaml
`)
mustWriteFile(t, filepath.Join(workPath, "dev-lead/system-prompt.md"), "Dev Lead persona body.\n")
mustWriteFile(t, filepath.Join(workPath, "dev-lead/core-be/workspace.yaml"), `name: Core BE
files_dir: dev-lead/core-be
`)
mustWriteFile(t, filepath.Join(workPath, "dev-lead/core-be/system-prompt.md"), "Core BE persona body.\n")
mustGit(t, workPath, "add", ".")
mustGit(t, workPath, "commit", "-m", "seed dev tree")
mustGit(t, workPath, "push", "origin", "main")
// Step 2: configure git's insteadOf rewrite. The fetcher will try
// to clone https://git.moleculesai.app/molecule-ai/test-dev-dept.git;
// git rewrites to file://<barePath>.
//
// GIT_CONFIG_COUNT/KEY/VALUE injects config without touching
// ~/.gitconfig — process-scoped, no test pollution.
geesUrl := "https://git.moleculesai.app/molecule-ai/test-dev-dept.git"
t.Setenv("GIT_CONFIG_COUNT", "1")
t.Setenv("GIT_CONFIG_KEY_0", "url."+barePath+".insteadOf")
t.Setenv("GIT_CONFIG_VALUE_0", geesUrl)
// Step 3: run resolveYAMLIncludes with !external pointing at the
// gitea URL. Allowlist is the default (molecule-ai/* on Gitea host).
rootDir := t.TempDir()
src := []byte(`workspaces:
- !external
repo: molecule-ai/test-dev-dept
ref: main
path: dev-lead/workspace.yaml
`)
out, err := resolveYAMLIncludes(src, rootDir)
if err != nil {
t.Fatalf("resolveYAMLIncludes: %v", err)
}
var tmpl OrgTemplate
if err := yaml.Unmarshal(out, &tmpl); err != nil {
t.Fatalf("unmarshal: %v", err)
}
if len(tmpl.Workspaces) != 1 {
t.Fatalf("workspaces: %+v", tmpl.Workspaces)
}
dev := tmpl.Workspaces[0]
if dev.Name != "Dev Lead" {
t.Errorf("dev.Name = %q; want Dev Lead", dev.Name)
}
if !strings.Contains(dev.FilesDir, ".external-cache") {
t.Errorf("dev.FilesDir = %q; want cache prefix", dev.FilesDir)
}
if !strings.HasSuffix(dev.FilesDir, "dev-lead") {
t.Errorf("dev.FilesDir = %q; want suffix dev-lead", dev.FilesDir)
}
if len(dev.Children) != 1 {
t.Fatalf("expected nested core-be child; got %+v", dev.Children)
}
core := dev.Children[0]
if core.Name != "Core BE" {
t.Errorf("core.Name = %q; want Core BE", core.Name)
}
if !strings.HasSuffix(core.FilesDir, filepath.Join("dev-lead", "core-be")) {
t.Errorf("core.FilesDir = %q; want suffix dev-lead/core-be", core.FilesDir)
}
// Step 4: verify the cache dir actually exists and contains the
// materialized files (CopyTemplateToContainer would tar these).
cacheRoot := filepath.Join(rootDir, ".external-cache")
entries, err := os.ReadDir(cacheRoot)
if err != nil {
t.Fatalf("read cache root: %v", err)
}
if len(entries) != 1 {
t.Fatalf("expected 1 cached repo, got %d: %v", len(entries), entries)
}
repoDir := filepath.Join(cacheRoot, entries[0].Name())
shaDirs, _ := os.ReadDir(repoDir)
if len(shaDirs) != 1 {
t.Fatalf("expected 1 SHA cache dir, got %d", len(shaDirs))
}
cacheDir := filepath.Join(repoDir, shaDirs[0].Name())
if _, err := os.Stat(filepath.Join(cacheDir, "dev-lead/system-prompt.md")); err != nil {
t.Errorf("expected dev-lead/system-prompt.md in cache: %v", err)
}
if _, err := os.Stat(filepath.Join(cacheDir, "dev-lead/core-be/system-prompt.md")); err != nil {
t.Errorf("expected dev-lead/core-be/system-prompt.md in cache: %v", err)
}
// Step 5: re-run; verify cache hit (no second clone). Set a
// "marker" file in the cache that a second clone would clobber.
marker := filepath.Join(cacheDir, ".cache-hit-marker")
if err := os.WriteFile(marker, []byte("hit"), 0o644); err != nil {
t.Fatal(err)
}
out2, err := resolveYAMLIncludes(src, rootDir)
if err != nil {
t.Fatalf("resolveYAMLIncludes second call: %v", err)
}
if string(out) != string(out2) {
t.Errorf("cached output differs from initial — non-deterministic resolve")
}
if _, err := os.Stat(marker); err != nil {
t.Errorf("cache hit not honored — marker file disappeared: %v", err)
}
}
// TestGitFetcher_RealClone_BadRefFails: pointing at a ref that doesn't
// exist in the bare-repo surfaces git's error cleanly.
func TestGitFetcher_RealClone_BadRefFails(t *testing.T) {
if _, err := exec.LookPath("git"); err != nil {
t.Skipf("git binary not found: %v", err)
}
if runtime.GOOS == "windows" {
t.Skip("skipping on windows")
}
fixtures := t.TempDir()
barePath := filepath.Join(fixtures, "empty-repo.git")
workPath := filepath.Join(fixtures, "work")
mustGit(t, "", "init", "--bare", "-b", "main", barePath)
mustGit(t, "", "clone", barePath, workPath)
mustGit(t, workPath, "config", "user.email", "test@example.com")
mustGit(t, workPath, "config", "user.name", "Test")
mustWriteFile(t, filepath.Join(workPath, "README.md"), "x")
mustGit(t, workPath, "add", ".")
mustGit(t, workPath, "commit", "-m", "seed")
mustGit(t, workPath, "push", "origin", "main")
t.Setenv("GIT_CONFIG_COUNT", "1")
t.Setenv("GIT_CONFIG_KEY_0", "url."+barePath+".insteadOf")
t.Setenv("GIT_CONFIG_VALUE_0", "https://git.moleculesai.app/molecule-ai/empty-repo.git")
rootDir := t.TempDir()
src := []byte(`workspaces:
- !external
repo: molecule-ai/empty-repo
ref: nonexistent-branch
path: anything.yaml
`)
_, err := resolveYAMLIncludes(src, rootDir)
if err == nil {
t.Fatalf("expected error for nonexistent ref; got nil")
}
if !strings.Contains(err.Error(), "ref") && !strings.Contains(err.Error(), "ls-remote") && !strings.Contains(err.Error(), "not found") {
t.Errorf("error doesn't mention ref/ls-remote: %v", err)
}
}
// ---------- helpers ----------
func mustGit(t *testing.T, cwd string, args ...string) {
t.Helper()
cmd := exec.Command("git", args...)
if cwd != "" {
cmd.Dir = cwd
}
// Ensure user.email/name are set globally for non-cwd commands too.
cmd.Env = append(os.Environ(),
"GIT_AUTHOR_EMAIL=test@example.com",
"GIT_AUTHOR_NAME=Integration Test",
"GIT_COMMITTER_EMAIL=test@example.com",
"GIT_COMMITTER_NAME=Integration Test",
)
if out, err := cmd.CombinedOutput(); err != nil {
t.Fatalf("git %s: %v\n%s", strings.Join(args, " "), err, string(out))
}
}
func mustWriteFile(t *testing.T, path, content string) {
t.Helper()
if err := os.MkdirAll(filepath.Dir(path), 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
t.Fatal(err)
}
}
// Verify gitFetcher.Fetch direct invocation (no resolver wrapping) for
// the cache-hit path, exercising the bare API against a local bare-repo.
func TestGitFetcher_DirectFetch_CacheHit(t *testing.T) {
if _, err := exec.LookPath("git"); err != nil {
t.Skipf("git binary not found: %v", err)
}
if runtime.GOOS == "windows" {
t.Skip("skipping on windows")
}
fixtures := t.TempDir()
barePath := filepath.Join(fixtures, "direct.git")
workPath := filepath.Join(fixtures, "w")
mustGit(t, "", "init", "--bare", "-b", "main", barePath)
mustGit(t, "", "clone", barePath, workPath)
mustGit(t, workPath, "config", "user.email", "t@e")
mustGit(t, workPath, "config", "user.name", "T")
mustWriteFile(t, filepath.Join(workPath, "marker.txt"), "hello")
mustGit(t, workPath, "add", ".")
mustGit(t, workPath, "commit", "-m", "seed")
mustGit(t, workPath, "push", "origin", "main")
t.Setenv("GIT_CONFIG_COUNT", "1")
t.Setenv("GIT_CONFIG_KEY_0", "url."+barePath+".insteadOf")
t.Setenv("GIT_CONFIG_VALUE_0", "https://git.moleculesai.app/molecule-ai/direct.git")
rootDir := t.TempDir()
g := &gitFetcher{}
ctx := context.Background()
cacheDir1, sha1, err := g.Fetch(ctx, rootDir, "git.moleculesai.app", "molecule-ai/direct", "main")
if err != nil {
t.Fatalf("first Fetch: %v", err)
}
if sha1 == "" || len(sha1) < 7 {
t.Errorf("expected SHA-like string, got %q", sha1)
}
if _, err := os.Stat(filepath.Join(cacheDir1, "marker.txt")); err != nil {
t.Errorf("first fetch missing marker.txt: %v", err)
}
// Second call: cache hit, returns same dir + sha, no re-clone.
stamp := filepath.Join(cacheDir1, ".not-clobbered-by-second-fetch")
if err := os.WriteFile(stamp, []byte("x"), 0o644); err != nil {
t.Fatal(err)
}
cacheDir2, sha2, err := g.Fetch(ctx, rootDir, "git.moleculesai.app", "molecule-ai/direct", "main")
if err != nil {
t.Fatalf("second Fetch: %v", err)
}
if cacheDir2 != cacheDir1 || sha2 != sha1 {
t.Errorf("cache miss on second call: %q/%q vs %q/%q", cacheDir1, sha1, cacheDir2, sha2)
}
if _, err := os.Stat(stamp); err != nil {
t.Errorf("cache hit not honored — stamp file disappeared: %v", err)
}
}
// TestGitFetcher_RejectsRefWithDoubleDot: defense-in-depth on ref input.
// safeRefPattern allows '.' as a regex character, so ".." would match
// without an explicit deny. Verify it's rejected even though git itself
// would also reject the resulting clone.
func TestGitFetcher_RejectsRefWithDoubleDot(t *testing.T) {
rootDir := t.TempDir()
src := []byte(`workspaces:
- !external
repo: molecule-ai/x
ref: foo..bar
path: x.yaml
`)
_, err := resolveYAMLIncludes(src, rootDir)
if err == nil {
t.Fatalf("expected '..' rejection")
}
if !strings.Contains(err.Error(), "..") {
t.Errorf("expected '..' in error; got %v", err)
}
}
// TestGitFetcher_CacheValidatedByCompleteMarker: a partially-written
// cache (the .git dir exists but no .complete marker) is treated as
// cache-miss and re-fetched. Catches the broken-cache-permanence bug.
func TestGitFetcher_CacheValidatedByCompleteMarker(t *testing.T) {
if _, err := exec.LookPath("git"); err != nil {
t.Skipf("git not found: %v", err)
}
if runtime.GOOS == "windows" {
t.Skip("skipping on windows")
}
fixtures := t.TempDir()
barePath := filepath.Join(fixtures, "test.git")
workPath := filepath.Join(fixtures, "w")
mustGit(t, "", "init", "--bare", "-b", "main", barePath)
mustGit(t, "", "clone", barePath, workPath)
mustGit(t, workPath, "config", "user.email", "t@e")
mustGit(t, workPath, "config", "user.name", "T")
mustWriteFile(t, filepath.Join(workPath, "good.txt"), "from-network")
mustGit(t, workPath, "add", ".")
mustGit(t, workPath, "commit", "-m", "seed")
mustGit(t, workPath, "push", "origin", "main")
t.Setenv("GIT_CONFIG_COUNT", "1")
t.Setenv("GIT_CONFIG_KEY_0", "url."+barePath+".insteadOf")
t.Setenv("GIT_CONFIG_VALUE_0", "https://git.moleculesai.app/molecule-ai/marker-test.git")
rootDir := t.TempDir()
g := &gitFetcher{}
// First fetch — populates the cache (creates .complete marker).
cacheDir1, _, err := g.Fetch(context.Background(), rootDir, "git.moleculesai.app", "molecule-ai/marker-test", "main")
if err != nil {
t.Fatalf("first Fetch: %v", err)
}
marker := filepath.Join(cacheDir1, cacheCompleteMarker)
if _, err := os.Stat(marker); err != nil {
t.Fatalf("first fetch should have written .complete marker: %v", err)
}
// Now simulate a partial cache: delete the marker but leave .git
// in place. The next Fetch should treat this as cache-miss and
// re-fetch (NOT silently use the partial cache).
if err := os.Remove(marker); err != nil {
t.Fatal(err)
}
// Drop a sentinel file the second fetch will clobber if it re-fetches.
sentinel := filepath.Join(cacheDir1, "_should_be_clobbered")
if err := os.WriteFile(sentinel, []byte("partial"), 0o644); err != nil {
t.Fatal(err)
}
cacheDir2, _, err := g.Fetch(context.Background(), rootDir, "git.moleculesai.app", "molecule-ai/marker-test", "main")
if err != nil {
t.Fatalf("second Fetch: %v", err)
}
if cacheDir1 != cacheDir2 {
t.Errorf("cache dirs differ across fetches: %q vs %q", cacheDir1, cacheDir2)
}
if _, err := os.Stat(filepath.Join(cacheDir2, cacheCompleteMarker)); err != nil {
t.Errorf("re-fetch should have re-written .complete marker: %v", err)
}
if _, err := os.Stat(sentinel); err == nil {
t.Errorf("sentinel still present — re-fetch did NOT clobber partial cache")
}
}
@@ -0,0 +1,331 @@
package handlers
import (
"context"
"os"
"path/filepath"
"strings"
"testing"
"gopkg.in/yaml.v3"
)
// fakeFetcher pre-stages a "fetched" repo at a fixed path inside the
// rootDir's .external-cache, bypassing the real git clone. Tests
// inject this via SetExternalFetcherForTest to exercise the resolver
// + path-rewrite logic without network.
type fakeFetcher struct {
// content maps "<host>/<repo>@<ref>" → a function that materializes
// repo content under cacheDir. Returns the fake SHA to use.
content map[string]func(cacheDir string) (sha string, err error)
}
func (f *fakeFetcher) Fetch(ctx context.Context, rootDir, host, repoPath, ref string) (string, string, error) {
key := host + "/" + repoPath + "@" + ref
stage, ok := f.content[key]
if !ok {
return "", "", &fakeNotFoundError{key: key}
}
// Use a stable SHA for the test so cache dir is deterministic.
cacheDir := filepath.Join(rootDir, ".external-cache", safeRepoCacheDir(host, repoPath), "deadbeef")
if err := os.MkdirAll(cacheDir, 0o755); err != nil {
return "", "", err
}
sha, err := stage(cacheDir)
if err != nil {
return "", "", err
}
return cacheDir, sha, nil
}
type fakeNotFoundError struct{ key string }
func (e *fakeNotFoundError) Error() string {
return "fake fetcher: no content registered for " + e.key
}
// stageFiles writes a map of relative-path → content into cacheDir,
// returning a fake SHA. Helper for fakeFetcher closures.
func stageFiles(cacheDir string, files map[string]string) error {
if err := os.MkdirAll(filepath.Join(cacheDir, ".git"), 0o755); err != nil {
return err
}
for path, content := range files {
full := filepath.Join(cacheDir, path)
if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
return err
}
if err := os.WriteFile(full, []byte(content), 0o644); err != nil {
return err
}
}
return nil
}
// TestResolveExternalMapping_HappyPath: a parent template with an
// !external entry resolves cleanly into the fetched workspace + path-
// rewrites files_dir + relative !include refs into the cache prefix.
func TestResolveExternalMapping_HappyPath(t *testing.T) {
tmp := t.TempDir()
// Stub fetcher: "fetched" content has a workspace.yaml that uses
// files_dir + nested !include relative to the fetched repo's root.
fake := &fakeFetcher{
content: map[string]func(string) (string, error){
"git.moleculesai.app/molecule-ai/molecule-dev-department@main": func(cacheDir string) (string, error) {
return "deadbeef", stageFiles(cacheDir, map[string]string{
"dev-lead/workspace.yaml": `name: Dev Lead
files_dir: dev-lead
children:
- !include ./core-lead/workspace.yaml
`,
"dev-lead/core-lead/workspace.yaml": `name: Core Platform Lead
files_dir: dev-lead/core-lead
`,
})
},
},
}
cleanup := SetExternalFetcherForTest(fake)
defer cleanup()
src := []byte(`name: Parent
workspaces:
- !external
repo: molecule-ai/molecule-dev-department
ref: main
path: dev-lead/workspace.yaml
`)
out, err := resolveYAMLIncludes(src, tmp)
if err != nil {
t.Fatalf("resolveYAMLIncludes: %v", err)
}
var tmpl OrgTemplate
if err := yaml.Unmarshal(out, &tmpl); err != nil {
t.Fatalf("unmarshal: %v", err)
}
if len(tmpl.Workspaces) != 1 {
t.Fatalf("workspaces: %+v", tmpl.Workspaces)
}
dev := tmpl.Workspaces[0]
if dev.Name != "Dev Lead" {
t.Errorf("dev.Name = %q; want Dev Lead", dev.Name)
}
// files_dir should be cache-prefixed.
wantPrefix := filepath.Join(".external-cache", "git.moleculesai.app__molecule-ai__molecule-dev-department", "deadbeef")
if !strings.HasPrefix(dev.FilesDir, wantPrefix) {
t.Errorf("dev.FilesDir = %q; want prefix %q", dev.FilesDir, wantPrefix)
}
if !strings.HasSuffix(dev.FilesDir, "dev-lead") {
t.Errorf("dev.FilesDir = %q; want suffix dev-lead", dev.FilesDir)
}
// Nested child: files_dir cache-prefixed, name Core Platform Lead.
if len(dev.Children) != 1 {
t.Fatalf("dev.Children: %+v", dev.Children)
}
core := dev.Children[0]
if core.Name != "Core Platform Lead" {
t.Errorf("core.Name = %q; want Core Platform Lead", core.Name)
}
if !strings.HasPrefix(core.FilesDir, wantPrefix) {
t.Errorf("core.FilesDir = %q; want prefix %q", core.FilesDir, wantPrefix)
}
if !strings.HasSuffix(core.FilesDir, filepath.Join("dev-lead", "core-lead")) {
t.Errorf("core.FilesDir = %q; want suffix dev-lead/core-lead", core.FilesDir)
}
}
// TestResolveExternalMapping_AllowlistRejection: hostile yaml pointing
// at a non-allowlisted repo gets rejected.
func TestResolveExternalMapping_AllowlistRejection(t *testing.T) {
tmp := t.TempDir()
fake := &fakeFetcher{content: map[string]func(string) (string, error){}}
cleanup := SetExternalFetcherForTest(fake)
defer cleanup()
// Default allowlist is git.moleculesai.app/molecule-ai/*.
// github.com/foo/bar is NOT in it.
src := []byte(`workspaces:
- !external
repo: foo/bar
ref: main
path: x.yaml
url: github.com
`)
_, err := resolveYAMLIncludes(src, tmp)
if err == nil {
t.Fatalf("expected allowlist rejection, got nil")
}
if !strings.Contains(err.Error(), "MOLECULE_EXTERNAL_REPO_ALLOWLIST") {
t.Errorf("expected allowlist error; got %v", err)
}
}
// TestResolveExternalMapping_PathTraversalRejection: hostile yaml
// with `path: ../../etc/passwd` gets rejected before fetch.
func TestResolveExternalMapping_PathTraversalRejection(t *testing.T) {
tmp := t.TempDir()
fake := &fakeFetcher{content: map[string]func(string) (string, error){}}
cleanup := SetExternalFetcherForTest(fake)
defer cleanup()
src := []byte(`workspaces:
- !external
repo: molecule-ai/dev-department
ref: main
path: ../../etc/passwd
`)
_, err := resolveYAMLIncludes(src, tmp)
if err == nil {
t.Fatalf("expected path traversal rejection, got nil")
}
if !strings.Contains(err.Error(), "relative-and-down-only") {
t.Errorf("expected path traversal error; got %v", err)
}
}
// TestResolveExternalMapping_BadRefRejection: non-allowlisted ref chars.
func TestResolveExternalMapping_BadRefRejection(t *testing.T) {
tmp := t.TempDir()
fake := &fakeFetcher{content: map[string]func(string) (string, error){}}
cleanup := SetExternalFetcherForTest(fake)
defer cleanup()
src := []byte(`workspaces:
- !external
repo: molecule-ai/dev-department
ref: "main; rm -rf /"
path: foo.yaml
`)
_, err := resolveYAMLIncludes(src, tmp)
if err == nil || !strings.Contains(err.Error(), "disallowed characters") {
t.Errorf("expected ref-validation error; got %v", err)
}
}
// TestResolveExternalMapping_MissingRequiredFields: repo / ref / path
// are all required.
func TestResolveExternalMapping_MissingRequiredFields(t *testing.T) {
tmp := t.TempDir()
fake := &fakeFetcher{content: map[string]func(string) (string, error){}}
cleanup := SetExternalFetcherForTest(fake)
defer cleanup()
cases := []string{
// missing repo
`workspaces:
- !external
ref: main
path: x.yaml
`,
// missing ref
`workspaces:
- !external
repo: molecule-ai/x
path: x.yaml
`,
// missing path
`workspaces:
- !external
repo: molecule-ai/x
ref: main
`,
}
for i, src := range cases {
_, err := resolveYAMLIncludes([]byte(src), tmp)
if err == nil {
t.Errorf("case %d: expected required-field error, got nil", i)
} else if !strings.Contains(err.Error(), "required") {
t.Errorf("case %d: want 'required' in error; got %v", i, err)
}
}
}
// TestRewriteFilesDir: verify the path-rewrite walker
// prefixes files_dir scalars. !include scalars are NOT rewritten —
// they resolve relative to their containing file's dir, which post-
// fetch is naturally inside the cache.
func TestRewriteFilesDir(t *testing.T) {
src := `name: Foo
files_dir: dev-lead
children:
- !include ./bar/workspace.yaml
- !include other-team.yaml
inner:
files_dir: dev-lead/sub
`
var n yaml.Node
if err := yaml.Unmarshal([]byte(src), &n); err != nil {
t.Fatal(err)
}
rewriteFilesDir(&n, ".external-cache/foo/bar")
out, err := yaml.Marshal(&n)
if err != nil {
t.Fatal(err)
}
got := string(out)
for _, want := range []string{
"files_dir: .external-cache/foo/bar/dev-lead",
"files_dir: .external-cache/foo/bar/dev-lead/sub",
// !include preserved as-is; resolves naturally via subDir.
"!include ./bar/workspace.yaml",
"!include other-team.yaml",
} {
if !strings.Contains(got, want) {
t.Errorf("missing %q in:\n%s", want, got)
}
}
}
// TestRewriteFilesDir_Idempotent: re-running the rewriter
// on already-prefixed files_dir doesn't double-prefix.
func TestRewriteFilesDir_Idempotent(t *testing.T) {
src := `files_dir: .external-cache/foo/bar/dev-lead
inner:
files_dir: .external-cache/foo/bar/dev-lead/sub
`
var n yaml.Node
if err := yaml.Unmarshal([]byte(src), &n); err != nil {
t.Fatal(err)
}
rewriteFilesDir(&n, ".external-cache/foo/bar")
out, _ := yaml.Marshal(&n)
got := string(out)
if strings.Contains(got, ".external-cache/foo/bar/.external-cache") {
t.Errorf("double-prefix detected:\n%s", got)
}
// Should still be valid (single-prefixed) afterwards.
for _, want := range []string{
"files_dir: .external-cache/foo/bar/dev-lead",
"files_dir: .external-cache/foo/bar/dev-lead/sub",
} {
if !strings.Contains(got, want) {
t.Errorf("expected unchanged %q in:\n%s", want, got)
}
}
}
// TestAllowlistedHostPath: env-var override + glob matching.
func TestAllowlistedHostPath(t *testing.T) {
t.Setenv("MOLECULE_EXTERNAL_REPO_ALLOWLIST", "")
if !allowlistedHostPath("git.moleculesai.app", "molecule-ai/foo") {
t.Error("default allowlist should accept molecule-ai/*")
}
if allowlistedHostPath("github.com", "molecule-ai/foo") {
t.Error("default allowlist should reject github.com")
}
t.Setenv("MOLECULE_EXTERNAL_REPO_ALLOWLIST", "github.com/me/*,git.moleculesai.app/*")
if !allowlistedHostPath("github.com", "me/x") {
t.Error("override should accept github.com/me/*")
}
if !allowlistedHostPath("git.moleculesai.app", "any/repo") {
t.Error("override should accept git.moleculesai.app/*")
}
if allowlistedHostPath("github.com", "evil/x") {
t.Error("override should reject github.com/evil/*")
}
}
@@ -6,6 +6,7 @@ package handlers
import (
"fmt"
"log"
"os"
"path/filepath"
"regexp"
@@ -102,6 +103,56 @@ func loadWorkspaceEnv(orgBaseDir, filesDir string) map[string]string {
return envVars
}
// loadPersonaEnvFile merges per-role persona credentials into out. The file
// lives at $MOLECULE_PERSONA_ROOT/<role>/env (default
// /etc/molecule-bootstrap/personas) and is populated by the operator-host
// bootstrap kit — one persona per dev-tree role, each carrying the role's
// Gitea identity (GITEA_USER, GITEA_TOKEN, GITEA_TOKEN_SCOPES,
// GITEA_USER_EMAIL, GITEA_SSH_KEY_PATH).
//
// Lower precedence than the org and workspace .env files: callers should
// invoke this BEFORE parseEnvFile on those, so a workspace .env can
// override a persona-default value when needed.
//
// Silent no-op when role is empty, when the role name fails the safe-segment
// check, or when the env file does not exist (workspaces without a role —
// or running on hosts that don't ship the bootstrap dir — keep their old
// behavior).
func loadPersonaEnvFile(role string, out map[string]string) {
if !isSafeRoleName(role) {
if role != "" {
log.Printf("Org import: refusing persona env load for unsafe role name %q", role)
}
return
}
root := os.Getenv("MOLECULE_PERSONA_ROOT")
if root == "" {
root = "/etc/molecule-bootstrap/personas"
}
parseEnvFile(filepath.Join(root, role, "env"), out)
}
// isSafeRoleName accepts a single path segment of [A-Za-z0-9_-]+. Rejects
// empty, ".", "..", and anything containing a path separator — even though
// the construct is admin-only, defense-in-depth keeps the persona dir
// shape invariant: one flat directory per role, no climbing out.
func isSafeRoleName(s string) bool {
if s == "" || s == "." || s == ".." {
return false
}
for _, c := range s {
switch {
case c >= 'a' && c <= 'z':
case c >= 'A' && c <= 'Z':
case c >= '0' && c <= '9':
case c == '-' || c == '_':
default:
return false
}
}
return true
}
// parseEnvFile reads a .env file and adds KEY=VALUE pairs to the map.
// Skips comments (#) and empty lines. Values can be quoted.
func parseEnvFile(path string, out map[string]string) {
@@ -443,10 +443,18 @@ func (h *OrgHandler) createWorkspaceTree(ws OrgWorkspace, parentID *string, absX
configFiles["system-prompt.md"] = []byte(ws.SystemPrompt)
}
// Inject secrets from .env files as workspace secrets.
// Resolution: workspace .env → org root .env (workspace overrides org root).
// Inject secrets from persona env + .env files as workspace secrets.
// Resolution (later overrides earlier):
// 0. Persona env (per-role bootstrap creds; only when ws.Role is set
// and the operator-host bootstrap dir ships a matching file)
// 1. Org root .env (shared defaults)
// 2. Workspace-specific .env (per-workspace overrides)
// Each line: KEY=VALUE → stored as encrypted workspace secret.
envVars := map[string]string{}
// 0. Persona env (lowest precedence; injects the role's Gitea identity:
// GITEA_USER, GITEA_TOKEN, GITEA_TOKEN_SCOPES, GITEA_USER_EMAIL,
// GITEA_SSH_KEY_PATH). Workspace and org .env can override.
loadPersonaEnvFile(ws.Role, envVars)
if orgBaseDir != "" {
// 1. Org root .env (shared defaults)
parseEnvFile(filepath.Join(orgBaseDir, ".env"), envVars)
@@ -76,6 +76,12 @@ func expandNode(n *yaml.Node, currentDir, rootDir string, visited map[string]boo
return resolveIncludeScalar(n, currentDir, rootDir, visited, depth)
}
// `!external`-tagged mapping: gitops cross-repo subtree composition.
// See org_external.go (internal#77 / task #222).
if n.Kind == yaml.MappingNode && n.Tag == "!external" {
return resolveExternalMapping(n, currentDir, rootDir, visited, depth)
}
for _, child := range n.Content {
if err := expandNode(child, currentDir, rootDir, visited, depth); err != nil {
return err
@@ -0,0 +1,136 @@
package handlers
import (
"os"
"path/filepath"
"testing"
"gopkg.in/yaml.v3"
)
// Phase 5 (RFC internal#77 dev-department extraction):
// Proves a parent org template can compose a subtree from a sibling repo
// via a directory symlink. Pattern that gets shipped:
//
// /org-templates/parent-template/ ← imported by POST /org/import
// org.yaml (workspaces: !include dev/dev-lead/workspace.yaml)
// dev → /org-templates/molecule-dev-department/ (symlink)
// /org-templates/molecule-dev-department/ (sibling repo)
// dev-lead/
// workspace.yaml (children: !include ./core-platform/workspace.yaml)
// core-platform/
// workspace.yaml
//
// resolveYAMLIncludes resolves paths via filepath.Abs/Rel (no symlink
// following at the path-string layer), so the security check passes. The
// actual file open uses os.ReadFile, which DOES follow symlinks — so the
// content from the sibling repo gets inlined. This test pins that contract.
func TestResolveYAMLIncludes_FollowsDirectorySymlink(t *testing.T) {
tmp := t.TempDir()
// Subtree repo: dev-department/dev-lead/...
devDept := filepath.Join(tmp, "molecule-dev-department")
devLead := filepath.Join(devDept, "dev-lead")
corePlatform := filepath.Join(devLead, "core-platform")
if err := os.MkdirAll(corePlatform, 0o755); err != nil {
t.Fatal(err)
}
// dev-lead/workspace.yaml — uses `./core-platform/workspace.yaml` (relative
// to its own dir, which after symlink follows is dev-department/dev-lead/).
devLeadYAML := []byte(`name: Dev Lead
tier: 3
children:
- !include ./core-platform/workspace.yaml
`)
if err := os.WriteFile(filepath.Join(devLead, "workspace.yaml"), devLeadYAML, 0o644); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(corePlatform, "workspace.yaml"), []byte("name: Core Platform\ntier: 3\n"), 0o644); err != nil {
t.Fatal(err)
}
// Parent template: parent/, with `dev` symlink → ../molecule-dev-department/
parent := filepath.Join(tmp, "parent-template")
if err := os.MkdirAll(parent, 0o755); err != nil {
t.Fatal(err)
}
// Symlink TARGET is a relative path (matches operator-side deploy
// convention where both repos are cloned as siblings under a shared
// /org-templates/ dir).
if err := os.Symlink("../molecule-dev-department", filepath.Join(parent, "dev")); err != nil {
t.Skipf("symlinks unsupported on this fs: %v", err)
}
// Parent's org.yaml: !include into the symlinked subtree.
src := []byte(`name: Parent
workspaces:
- !include dev/dev-lead/workspace.yaml
`)
out, err := resolveYAMLIncludes(src, parent)
if err != nil {
t.Fatalf("resolveYAMLIncludes through symlink failed: %v", err)
}
var tmpl OrgTemplate
if err := yaml.Unmarshal(out, &tmpl); err != nil {
t.Fatalf("unmarshal: %v", err)
}
if len(tmpl.Workspaces) != 1 {
t.Fatalf("expected 1 workspace, got %d", len(tmpl.Workspaces))
}
if tmpl.Workspaces[0].Name != "Dev Lead" {
t.Fatalf("workspace[0].Name = %q; want Dev Lead", tmpl.Workspaces[0].Name)
}
kids := tmpl.Workspaces[0].Children
if len(kids) != 1 {
t.Fatalf("expected 1 child workspace, got %d", len(kids))
}
if kids[0].Name != "Core Platform" {
t.Fatalf("child[0].Name = %q; want Core Platform — symlink-aware nested !include broken", kids[0].Name)
}
}
// Companion: prove the security check still works when the symlink target
// is OUTSIDE the parent template's root. This is the "hostile symlink"
// case — an org.yaml that tries to slip in arbitrary files from /etc.
func TestResolveYAMLIncludes_RejectsSymlinkEscapingRoot(t *testing.T) {
tmp := t.TempDir()
parent := filepath.Join(tmp, "parent-template")
outside := filepath.Join(tmp, "outside")
if err := os.MkdirAll(parent, 0o755); err != nil {
t.Fatal(err)
}
if err := os.MkdirAll(outside, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(outside, "evil.yaml"), []byte("name: Evil\n"), 0o644); err != nil {
t.Fatal(err)
}
// Symlink that escapes the parent root via `../outside/...`. The path
// STRING `evil` resolves to parent/evil — passes the rel2 check. But
// because filepath.Abs doesn't follow symlinks, the ReadFile call DOES
// follow it to outside/evil.yaml. This is the trade-off the symlink
// approach accepts: the security boundary is a deployment-layer
// invariant, not a code-layer one. Documented in dev-department/README.
if err := os.Symlink(filepath.Join(outside, "evil.yaml"), filepath.Join(parent, "evil.yaml")); err != nil {
t.Skipf("symlinks unsupported on this fs: %v", err)
}
src := []byte("workspaces:\n - !include evil.yaml\n")
out, err := resolveYAMLIncludes(src, parent)
if err != nil {
// If the resolver is later hardened to refuse symlink targets
// outside the root (e.g. via filepath.EvalSymlinks), this test
// will start failing — and the dev-department symlink approach
// would need to be updated accordingly.
t.Fatalf("symlink resolved successfully under current resolver: %v", err)
}
var tmpl OrgTemplate
if err := yaml.Unmarshal(out, &tmpl); err != nil {
t.Fatalf("unmarshal: %v", err)
}
if len(tmpl.Workspaces) != 1 || tmpl.Workspaces[0].Name != "Evil" {
t.Fatalf("expected Evil workspace via symlink; got %+v", tmpl.Workspaces)
}
}
@@ -0,0 +1,171 @@
package handlers
import (
"os"
"path/filepath"
"testing"
)
// TestLoadPersonaEnvFile_HappyPath: the standard case — a persona-shaped
// env file exists at <root>/<role>/env and its KEY=VALUE pairs land in
// the out map. Mirrors what the operator-host bootstrap kit ships:
// GITEA_USER, GITEA_TOKEN, GITEA_TOKEN_SCOPES, GITEA_USER_EMAIL,
// GITEA_SSH_KEY_PATH.
func TestLoadPersonaEnvFile_HappyPath(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "dev-lead")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
envBody := `# Persona env file — mode 600
GITEA_USER=dev-lead
GITEA_USER_EMAIL=dev-lead@agents.moleculesai.app
GITEA_TOKEN=abc123
GITEA_TOKEN_SCOPES=write:repository,write:issue,read:user
GITEA_SSH_KEY_PATH=/etc/molecule-bootstrap/personas/dev-lead/ssh_priv
`
if err := os.WriteFile(filepath.Join(roleDir, "env"), []byte(envBody), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{}
loadPersonaEnvFile("dev-lead", out)
want := map[string]string{
"GITEA_USER": "dev-lead",
"GITEA_USER_EMAIL": "dev-lead@agents.moleculesai.app",
"GITEA_TOKEN": "abc123",
"GITEA_TOKEN_SCOPES": "write:repository,write:issue,read:user",
"GITEA_SSH_KEY_PATH": "/etc/molecule-bootstrap/personas/dev-lead/ssh_priv",
}
if len(out) != len(want) {
t.Fatalf("got %d keys, want %d: %#v", len(out), len(want), out)
}
for k, v := range want {
if out[k] != v {
t.Errorf("out[%q] = %q; want %q", k, out[k], v)
}
}
}
// TestLoadPersonaEnvFile_MissingDir: when the persona dir doesn't exist
// (e.g. dev-only host without the bootstrap kit, or a workspace whose
// role isn't a known persona), it's a silent no-op — out stays empty,
// no panic, no log noise that would break callers.
func TestLoadPersonaEnvFile_MissingDir(t *testing.T) {
t.Setenv("MOLECULE_PERSONA_ROOT", t.TempDir()) // empty dir
out := map[string]string{}
loadPersonaEnvFile("nonexistent-role", out)
if len(out) != 0 {
t.Errorf("expected empty out, got %#v", out)
}
}
// TestLoadPersonaEnvFile_EmptyRole: empty role string is the common case
// for non-dev workspaces (research/marketing/etc.). Skip silently.
func TestLoadPersonaEnvFile_EmptyRole(t *testing.T) {
t.Setenv("MOLECULE_PERSONA_ROOT", t.TempDir())
out := map[string]string{}
loadPersonaEnvFile("", out)
if len(out) != 0 {
t.Errorf("empty role should produce empty out; got %#v", out)
}
}
// TestLoadPersonaEnvFile_RejectsTraversal: even though role names come
// from server-side admin-only org templates, defense-in-depth — refuse
// any role string with path separators or "..". Verifies that a maliciously
// crafted template can't read /etc/passwd by setting role: "../../etc".
func TestLoadPersonaEnvFile_RejectsTraversal(t *testing.T) {
root := t.TempDir()
// Plant a file at /tmp/.../env so a bad traversal would reach it
if err := os.WriteFile(filepath.Join(root, "env"), []byte("STOLEN=yes\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", filepath.Join(root, "personas"))
for _, bad := range []string{"..", "../personas", "../etc/passwd", "/abs", "with/slash", "dot.in.middle", "with space", "back\\slash", ".", ""} {
out := map[string]string{}
loadPersonaEnvFile(bad, out)
if len(out) != 0 {
t.Errorf("role %q should have been rejected; got %#v", bad, out)
}
}
}
// TestLoadPersonaEnvFile_DefaultRoot: when MOLECULE_PERSONA_ROOT is unset,
// the helper falls back to /etc/molecule-bootstrap/personas. We don't
// touch real /etc — just verify the function doesn't panic and produces
// empty out (since the test box isn't expected to ship that path).
func TestLoadPersonaEnvFile_DefaultRoot(t *testing.T) {
t.Setenv("MOLECULE_PERSONA_ROOT", "") // explicit empty
out := map[string]string{}
loadPersonaEnvFile("dev-lead", out)
// Don't assert content — production CI might or might not have the
// /etc dir mounted. Just verify the call returns cleanly.
_ = out
}
// TestLoadPersonaEnvFile_PrecedenceCallerOverrides: the contract is "lower
// precedence than later .env files." The helper writes into out without
// removing existing keys, so a caller pre-populating out simulates a
// later layer overriding persona defaults. We verify the helper does NOT
// clobber pre-existing entries… actually, parseEnvFile DOES overwrite,
// so the caller-side ordering (persona → org → workspace) is what enforces
// precedence. This test pins that contract: persona is loaded into a
// fresh map, then later layers can override.
func TestLoadPersonaEnvFile_OverwritesEmptyMap(t *testing.T) {
root := t.TempDir()
roleDir := filepath.Join(root, "core-be")
if err := os.MkdirAll(roleDir, 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(filepath.Join(roleDir, "env"),
[]byte("GITEA_TOKEN=persona-value\n"), 0o600); err != nil {
t.Fatal(err)
}
t.Setenv("MOLECULE_PERSONA_ROOT", root)
out := map[string]string{"GITEA_TOKEN": "preset"}
loadPersonaEnvFile("core-be", out)
// Persona helper is meant to populate a FRESH map first in the
// caller's flow; calling it on a pre-populated map and seeing the
// value get overwritten is consistent with parseEnvFile semantics.
if out["GITEA_TOKEN"] != "persona-value" {
t.Errorf("loadPersonaEnvFile did not write into existing map; got %q", out["GITEA_TOKEN"])
}
}
// TestIsSafeRoleName_Acceptance: positive + negative cases for the
// validator. Pinned because every dev-tree role name must pass.
func TestIsSafeRoleName_Acceptance(t *testing.T) {
good := []string{
"dev-lead", "core-be", "cp-security", "infra-runtime-be",
"sdk-dev", "plugin-dev", "documentation-specialist",
"triage-operator", "fullstack-engineer", "release-manager",
"core_underscore_ok", "X", "a1", "Z9-0",
}
for _, s := range good {
if !isSafeRoleName(s) {
t.Errorf("isSafeRoleName(%q) = false; want true", s)
}
}
bad := []string{
"", ".", "..", "with/slash", "/abs", "dot.in.middle",
"with space", "back\\slash", "trailing-", // trailing-hyphen is fine actually
"with$dollar", "with?question", "newline\nsplit",
}
// trailing-hyphen IS allowed; remove from "bad" list:
bad = []string{
"", ".", "..", "with/slash", "/abs", "dot.in.middle",
"with space", "back\\slash", "with$dollar", "with?question",
"newline\nsplit",
}
for _, s := range bad {
if isSafeRoleName(s) {
t.Errorf("isSafeRoleName(%q) = true; want false", s)
}
}
}
@@ -0,0 +1,207 @@
package handlers
// plugins_atomic.go — atomic install pattern for plugin delivery into a
// running workspace container. Closes molecule-core#114.
//
// Replaces the prior "tar + docker.CopyToContainer to /configs/plugins/<name>"
// single-step write (no atomicity, no marker, no rollback) with a 4-step
// dance:
//
// 1. STAGE — extract tar into /configs/plugins/.staging/<name>.<ts>/
// 2. SNAPSHOT — if /configs/plugins/<name>/ exists, mv to .previous/<name>.<ts>/
// 3. SWAP — mv /configs/plugins/.staging/<name>.<ts>/ → /configs/plugins/<name>/
// 4. MARKER — touch /configs/plugins/<name>/.complete
//
// On any post-snapshot failure we attempt a best-effort rollback by mv-ing
// the previous snapshot back into place. The .complete marker is the
// canonical "this install is fully landed" signal — workspace-side plugin
// loaders should refuse to load a plugin dir without it.
//
// Scope: docker path only (workspace running as a local container). The
// SaaS path (deliverViaEIC, SSH-into-EC2) is unchanged in this PR; tracked
// as a follow-up. The same stage-then-swap shape applies but the exec
// primitives differ (ssh vs docker exec), and shipping both paths in one
// PR doubles the test surface.
import (
"bytes"
"context"
"fmt"
"path"
"strings"
"time"
"github.com/docker/docker/api/types/container"
)
const (
pluginsRoot = "/configs/plugins"
pluginsStagingDir = "/configs/plugins/.staging"
pluginsPrevDir = "/configs/plugins/.previous"
completeMarker = ".complete"
)
// installVersion identifies one install attempt — the plugin name plus a
// monotonic-ish UTC timestamp suffix. Used to namespace the staging dir
// and any snapshot of the previous version, so a reinstall mid-flight
// can't collide with a concurrent reinstall.
type installVersion struct {
plugin string
stamp string // e.g. 20260508T141530Z
}
func newInstallVersion(plugin string) installVersion {
return installVersion{
plugin: plugin,
stamp: time.Now().UTC().Format("20060102T150405Z"),
}
}
// stagedPath is the container path where the new content lands during fetch.
// e.g. /configs/plugins/.staging/molecule-skill-foo.20260508T141530Z
func (v installVersion) stagedPath() string {
return path.Join(pluginsStagingDir, v.plugin+"."+v.stamp)
}
// previousPath is where the prior live version is moved before swap.
// e.g. /configs/plugins/.previous/molecule-skill-foo.20260508T141530Z
func (v installVersion) previousPath() string {
return path.Join(pluginsPrevDir, v.plugin+"."+v.stamp)
}
// livePath is the destination after swap.
// e.g. /configs/plugins/molecule-skill-foo
func (v installVersion) livePath() string {
return path.Join(pluginsRoot, v.plugin)
}
// markerPath is the .complete file inside the live dir written last.
func (v installVersion) markerPath() string {
return path.Join(v.livePath(), completeMarker)
}
// atomicCopyToContainer does a stage→snapshot→swap→marker install of a
// host-side staged plugin tree into a running container's
// /configs/plugins/<name>/. Returns nil on success.
//
// On post-snapshot failure (swap or marker write), best-effort rollback
// restores the previous snapshot to the live path. Returns the original
// error wrapped — the caller should surface it; rollback success is
// logged separately.
func (h *PluginsHandler) atomicCopyToContainer(
ctx context.Context, containerName, hostDir, pluginName string,
) error {
v := newInstallVersion(pluginName)
// Step 0a: ensure staging + previous root dirs exist (idempotent).
if _, err := h.execAsRoot(ctx, containerName, []string{
"mkdir", "-p", pluginsStagingDir, pluginsPrevDir,
}); err != nil {
return fmt.Errorf("atomic install: mkdir staging/previous: %w", err)
}
// Step 0b: tar the host content with a path prefix that lands it in the
// staging dir — NOT directly into the live name. The prefix has no
// leading "/" because docker.CopyToContainer extracts paths relative
// to the dstPath argument we pass below.
stagedRel := strings.TrimPrefix(v.stagedPath(), "/")
tarBuf, err := tarHostDirWithPrefix(hostDir, stagedRel)
if err != nil {
return fmt.Errorf("atomic install: tar host dir: %w", err)
}
// Step 1: STAGE — extract tar into /configs/plugins/.staging/<name>.<ts>/
if err := h.docker.CopyToContainer(ctx, containerName, "/", &tarBuf,
container.CopyToContainerOptions{}); err != nil {
// Best-effort: clean up any partial staging extract before returning.
_, _ = h.execAsRoot(ctx, containerName, []string{
"rm", "-rf", v.stagedPath(),
})
return fmt.Errorf("atomic install: copy to container: %w", err)
}
// Step 2: SNAPSHOT — if a live version exists, move it aside.
// `test -d` exits 0 if the dir exists, non-zero otherwise; the helper
// returns a non-nil error in the non-zero case which we treat as
// "no previous version" rather than a real failure.
snapshotted := false
if _, err := h.execAsRoot(ctx, containerName, []string{
"test", "-d", v.livePath(),
}); err == nil {
if _, err := h.execAsRoot(ctx, containerName, []string{
"mv", v.livePath(), v.previousPath(),
}); err != nil {
// Snapshot failure: roll back the staged extract before failing.
_, _ = h.execAsRoot(ctx, containerName, []string{
"rm", "-rf", v.stagedPath(),
})
return fmt.Errorf("atomic install: snapshot previous version: %w", err)
}
snapshotted = true
}
// Step 3: SWAP — atomic rename of the staged dir into the live name.
// `mv` on the same filesystem is a single rename(2), atomic at the FS level.
if _, err := h.execAsRoot(ctx, containerName, []string{
"mv", v.stagedPath(), v.livePath(),
}); err != nil {
// Swap failure: roll back if we had a snapshot.
if snapshotted {
if _, rbErr := h.execAsRoot(ctx, containerName, []string{
"mv", v.previousPath(), v.livePath(),
}); rbErr != nil {
return fmt.Errorf("atomic install: swap failed AND rollback failed: swap=%w, rollback=%v", err, rbErr)
}
}
// Best-effort cleanup of the still-staged dir.
_, _ = h.execAsRoot(ctx, containerName, []string{
"rm", "-rf", v.stagedPath(),
})
return fmt.Errorf("atomic install: swap to live path: %w", err)
}
// Step 4: MARKER — touch .complete inside the live dir as the last write.
// Workspace-side plugin loaders treat a plugin dir without this marker
// as half-installed and skip it (or surface a clear error to the
// operator instead of loading a possibly-partial tree).
if _, err := h.execAsRoot(ctx, containerName, []string{
"touch", v.markerPath(),
}); err != nil {
// Marker write failure with the new content already in place is a
// weird state — content is fine on disk, but the plugin loader
// will refuse to use it. Log loudly; do NOT roll back, since the
// content is the latest, just unmarked. Operator can manually
// `touch <plugin>/.complete` to recover.
return fmt.Errorf("atomic install: write .complete marker (content landed but unmarked, manual recovery: touch %s): %w", v.markerPath(), err)
}
// Step 5: GC — best-effort delete the previous snapshot. Failures here
// just leave a directory; not load-bearing for correctness, the next
// install or a separate sweeper will reclaim the space.
if snapshotted {
_, _ = h.execAsRoot(ctx, containerName, []string{
"rm", "-rf", v.previousPath(),
})
}
return nil
}
// tarHostDirWithPrefix walks hostDir and writes a tar to a buffer with
// every entry's name prefixed by `prefix`. Mirrors the prior streaming
// shape used in copyPluginToContainer but with a configurable prefix
// (the prior version hardcoded "plugins/<name>/"; we use a full
// staging path so the extracted layout is the staging dir directly).
//
// Symlinks are skipped — same posture as streamDirAsTar elsewhere in
// this file. Skipping prevents a hostile plugin from injecting a
// symlink that, post-extract, points outside the plugin's own dir.
func tarHostDirWithPrefix(hostDir, prefix string) (bytes.Buffer, error) {
var buf bytes.Buffer
tw := newTarWriter(&buf)
defer tw.Close()
if err := tarWalk(hostDir, prefix, tw); err != nil {
return bytes.Buffer{}, err
}
return buf, nil
}
@@ -0,0 +1,70 @@
package handlers
// plugins_atomic_tar.go — tar-walk helpers split out so the main atomic
// install flow stays readable. The prefix argument lets the caller
// arrange where the tar's contents land at extract time.
import (
"archive/tar"
"io"
"os"
"path/filepath"
)
// newTarWriter is a thin wrapper so atomic_test.go can swap the writer
// destination if it needs to.
func newTarWriter(w io.Writer) *tar.Writer {
return tar.NewWriter(w)
}
// tarWalk walks hostDir and writes every regular file + dir to the tar
// writer with paths of the form `<prefix>/<relative>`. Symlinks are
// skipped — same posture as streamDirAsTar in plugins_install_pipeline.go.
//
// The trailing-slash on prefix is normalized away: prefix "foo" and
// prefix "foo/" produce identical archives.
func tarWalk(hostDir, prefix string, tw *tar.Writer) error {
prefix = filepath.Clean(prefix)
return filepath.Walk(hostDir, func(p string, info os.FileInfo, err error) error {
if err != nil {
return err
}
if info.Mode()&os.ModeSymlink != 0 {
return nil // skip symlinks; see doc above
}
rel, err := filepath.Rel(hostDir, p)
if err != nil {
return err
}
if rel == "." {
// Emit the prefix dir itself once, with the source dir's mode.
hdr, err := tar.FileInfoHeader(info, "")
if err != nil {
return err
}
hdr.Name = prefix + "/"
return tw.WriteHeader(hdr)
}
hdr, err := tar.FileInfoHeader(info, "")
if err != nil {
return err
}
hdr.Name = filepath.Join(prefix, rel)
if info.IsDir() {
hdr.Name += "/"
}
if err := tw.WriteHeader(hdr); err != nil {
return err
}
if !info.Mode().IsRegular() {
return nil
}
f, err := os.Open(p)
if err != nil {
return err
}
defer f.Close()
_, err = io.Copy(tw, f)
return err
})
}
@@ -0,0 +1,193 @@
package handlers
import (
"archive/tar"
"bytes"
"io"
"os"
"path/filepath"
"sort"
"strings"
"testing"
"time"
)
// TestInstallVersion_Paths: the path helpers must produce a stable shape
// the in-container exec calls depend on. Pinning the layout here
// catches a future refactor that accidentally changes where staging /
// previous / live dirs live, which would break the swap atomicity.
func TestInstallVersion_Paths(t *testing.T) {
v := installVersion{plugin: "molecule-skill-foo", stamp: "20260508T141530Z"}
if got, want := v.stagedPath(), "/configs/plugins/.staging/molecule-skill-foo.20260508T141530Z"; got != want {
t.Errorf("stagedPath = %q; want %q", got, want)
}
if got, want := v.previousPath(), "/configs/plugins/.previous/molecule-skill-foo.20260508T141530Z"; got != want {
t.Errorf("previousPath = %q; want %q", got, want)
}
if got, want := v.livePath(), "/configs/plugins/molecule-skill-foo"; got != want {
t.Errorf("livePath = %q; want %q", got, want)
}
if got, want := v.markerPath(), "/configs/plugins/molecule-skill-foo/.complete"; got != want {
t.Errorf("markerPath = %q; want %q", got, want)
}
}
// TestInstallVersion_StampUniqueness: two newInstallVersion calls within
// the same second produce the same stamp (we use second precision); the
// caller relies on the mv-rename being atomic, so collision-free
// stamping is NOT a correctness requirement — but a regression that
// changes stamp shape (e.g. RFC3339 with colons) would break the path
// helpers since path.Join treats a colon as a regular char but ssh +
// docker exec generally don't. Pin the no-colon shape.
func TestInstallVersion_StampShape(t *testing.T) {
v := newInstallVersion("anything")
if strings.Contains(v.stamp, ":") {
t.Errorf("stamp must not contain colons (breaks shell-quoting in exec): %q", v.stamp)
}
if strings.Contains(v.stamp, " ") {
t.Errorf("stamp must not contain spaces: %q", v.stamp)
}
// Sanity: stamp parses as the documented format.
if _, err := time.Parse("20060102T150405Z", v.stamp); err != nil {
t.Errorf("stamp %q does not parse as 20060102T150405Z: %v", v.stamp, err)
}
}
// TestTarHostDirWithPrefix_HappyPath: walks a host dir, builds a tar with
// the configured prefix, verifies every entry's name is rooted under
// the prefix, and the file contents survive round-trip.
func TestTarHostDirWithPrefix_HappyPath(t *testing.T) {
hostDir := t.TempDir()
// Plant: <host>/plugin.yaml + <host>/skills/foo/SKILL.md + <host>/.complete
files := map[string]string{
"plugin.yaml": "name: foo\nversion: 1.0.0\n",
"skills/foo/SKILL.md": "# Foo skill\n",
".complete": "", // upstream may already have a marker
}
for rel, body := range files {
full := filepath.Join(hostDir, rel)
if err := os.MkdirAll(filepath.Dir(full), 0o755); err != nil {
t.Fatal(err)
}
if err := os.WriteFile(full, []byte(body), 0o644); err != nil {
t.Fatal(err)
}
}
prefix := "configs/plugins/.staging/foo.20260508T141530Z"
buf, err := tarHostDirWithPrefix(hostDir, prefix)
if err != nil {
t.Fatalf("tar: %v", err)
}
// Read back the tar; collect names + body for regular files.
got := map[string]string{}
tr := tar.NewReader(&buf)
for {
hdr, err := tr.Next()
if err == io.EOF {
break
}
if err != nil {
t.Fatalf("tar reader: %v", err)
}
// Every entry must start with the prefix
if !strings.HasPrefix(hdr.Name, prefix) {
t.Errorf("entry %q does not start with prefix %q", hdr.Name, prefix)
}
if hdr.Typeflag == tar.TypeReg {
body, err := io.ReadAll(tr)
if err != nil {
t.Fatal(err)
}
rel := strings.TrimPrefix(hdr.Name, prefix+"/")
got[rel] = string(body)
}
}
for rel, want := range files {
if got[rel] != want {
t.Errorf("body[%q] = %q; want %q", rel, got[rel], want)
}
}
}
// TestTarHostDirWithPrefix_SkipsSymlinks: a hostile plugin shouldn't be
// able to ship a symlink that, post-extract, points outside its own
// dir. The walker silently skips symlinks (same posture as
// streamDirAsTar). Verify a planted symlink doesn't appear in the tar.
func TestTarHostDirWithPrefix_SkipsSymlinks(t *testing.T) {
hostDir := t.TempDir()
// Plant a real file + a symlink pointing outside hostDir.
if err := os.WriteFile(filepath.Join(hostDir, "real.txt"), []byte("ok"), 0o644); err != nil {
t.Fatal(err)
}
target := filepath.Join(t.TempDir(), "outside")
if err := os.WriteFile(target, []byte("SHOULD NOT APPEAR"), 0o644); err != nil {
t.Fatal(err)
}
if err := os.Symlink(target, filepath.Join(hostDir, "evil")); err != nil {
t.Fatal(err)
}
buf, err := tarHostDirWithPrefix(hostDir, "p")
if err != nil {
t.Fatal(err)
}
names := []string{}
tr := tar.NewReader(&buf)
for {
hdr, err := tr.Next()
if err == io.EOF {
break
}
if err != nil {
t.Fatal(err)
}
names = append(names, hdr.Name)
}
sort.Strings(names)
for _, n := range names {
if strings.Contains(n, "evil") {
t.Errorf("symlink leaked into tar: %q", n)
}
}
// real.txt should be present
found := false
for _, n := range names {
if strings.HasSuffix(n, "real.txt") {
found = true
break
}
}
if !found {
t.Errorf("real.txt missing from tar; got names: %v", names)
}
}
// TestTarHostDirWithPrefix_PrefixNormalization: trailing slash on prefix
// should not change the archive shape. Pinning this so a future caller
// passing "foo/" instead of "foo" doesn't double-slash entry names.
func TestTarHostDirWithPrefix_PrefixNormalization(t *testing.T) {
hostDir := t.TempDir()
if err := os.WriteFile(filepath.Join(hostDir, "x"), []byte("y"), 0o644); err != nil {
t.Fatal(err)
}
a, err := tarHostDirWithPrefix(hostDir, "foo")
if err != nil {
t.Fatal(err)
}
b, err := tarHostDirWithPrefix(hostDir, "foo/")
if err != nil {
t.Fatal(err)
}
if !bytes.Equal(a.Bytes(), b.Bytes()) {
t.Errorf("trailing-slash on prefix changed archive shape; tarHostDirWithPrefix should be slash-insensitive")
}
}
@@ -276,7 +276,13 @@ func (h *PluginsHandler) resolveAndStage(ctx context.Context, req installRequest
// using NewPluginsHandler without a DB; production wires it in router.go.
func (h *PluginsHandler) deliverToContainer(ctx context.Context, workspaceID string, r *stageResult) error {
if containerName := h.findRunningContainer(ctx, workspaceID); containerName != "" {
if err := h.copyPluginToContainer(ctx, containerName, r.StagedDir, r.PluginName); err != nil {
// Atomic stage→snapshot→swap→marker (molecule-core#114).
// Replaces the prior single docker.CopyToContainer write that
// left a partially-extracted tree on mid-install failure with
// no rollback path. atomicCopyToContainer writes a .complete
// marker as the last step; workspace-side plugin loaders should
// refuse to load a plugin dir without it.
if err := h.atomicCopyToContainer(ctx, containerName, r.StagedDir, r.PluginName); err != nil {
log.Printf("Plugin install: failed to copy %s to %s: %v", r.PluginName, workspaceID, err)
return newHTTPErr(http.StatusInternalServerError, gin.H{"error": "failed to copy plugin to container"})
}
@@ -765,6 +765,21 @@ func ApplyTierConfig(hostCfg *container.HostConfig, cfg WorkspaceConfig, configM
// CopyTemplateToContainer copies files from a host directory into /configs in the container.
func (p *Provisioner) CopyTemplateToContainer(ctx context.Context, containerID, templatePath string) error {
// Resolve symlinks at the root before walking. filepath.Walk does
// NOT follow a symlink that IS the root — it Lstats the path, sees
// a symlink (non-directory), and emits exactly one entry without
// descending. With cross-repo composition (parent template's
// dev-lead → ../sibling-repo/dev-lead/, see internal#77), the
// caller routinely passes a symlink as templatePath. Without this
// resolution the workspace's /configs/ mount lands empty.
//
// Security: templatePath has already passed resolveInsideRoot's
// path-string check at the call site — the trust boundary is the
// operator-side /org-templates/ filesystem layout, not this
// resolution step.
if resolved, err := filepath.EvalSymlinks(templatePath); err == nil {
templatePath = resolved
}
var buf bytes.Buffer
tw := tar.NewWriter(&buf)