Compare commits

...

47 Commits

Author SHA1 Message Date
c2f5d68830 Merge pull request 'feat(actions): add audit-force-merge composite action' (#5) from feat/audit-force-merge-composite-action into main 2026-05-09 03:30:02 +00:00
120b71c564 feat(actions): add audit-force-merge composite action
§SOP-6 force-merge detector, hosted as a Gitea Actions composite
action so it can be vendored into every org repo via a single
`uses:` line instead of copy-pasting the bash. Source of truth
for the audit script logic.

Why composite vs reusable workflow: Gitea 1.22.6 doesn't support
cross-repo `uses: org/repo/.gitea/workflows/X.yml@ref`. Cross-repo
reusable workflows landed in go-gitea/gitea#32562 (1.26.0, Oct 2025)
and have not been backported. Composite actions resolve via the
actions-fetch path which works cross-repo against a public callee.
Re-evaluate when operator host runs Gitea ≥ 1.26.

Consumer workflow shape:

    on:
      pull_request_target:
        types: [closed]
    jobs:
      audit:
        if: github.event.pull_request.merged == true
        runs-on: ubuntu-latest
        steps:
          - uses: molecule-ai/molecule-ci/.gitea/actions/audit-force-merge@main
            with:
              gitea-token: ${{ secrets.SOP_TIER_CHECK_TOKEN }}
              repo: ${{ github.repository }}
              pr-number: ${{ github.event.pull_request.number }}
              required-checks: |
                sop-tier-check / tier-check (pull_request)

No actions/checkout step needed in the consumer — the audit script
does pure API calls, never reads working tree. Removing checkout is
also a small security win (PR head code never loaded).

Verified end-to-end on internal#123 + molecule-core#150 with the
inline copies (which this PR will replace via consumer-side stub
PRs once merged). Tier: low.
2026-05-08 20:29:40 -07:00
9f76a0faab Merge pull request 'fix(validate): recognize !external + !include as opaque refs (skip, not error)' (#4) from fix/validator-external-include-tags into main 2026-05-08 15:52:57 +00:00
dev-lead
d47c15d526 fix(validate): recognize !external + !include as opaque refs (skip, not error)
molecule-ai-org-template-molecule-dev's CI has been red since the
"pin: dev-department v1.0.0" merge. Symptom:

  ::error::Workspace at <unnamed>: missing 'name'
  ::error::Workspace at <unnamed>: missing 'name'

Root cause: org.yaml uses `!external` for the dev-department subtree
fetch (introduced internal#77 / molecule-core#105). The PermissiveLoader
formerly handed every unknown tag to a single multi-constructor that
flattens the parsed value to a plain dict. The validator's
validate_workspace() then saw a dict with no `name` key and tripped
the "missing name" error — but the dict was a `!external` directive,
not a malformed workspace.

The fix wraps both supported tags in distinct sentinel types:

  - !include  → IncludeRef (str subclass)
  - !external → ExternalRef (dict subclass)

validate_workspace() and count_ws() now skip these instead of treating
them as workspace shape. Real workspace dicts (with names) still get
the full structural check. Unknown tags fall through to the
multi-constructor exactly as before, preserving back-compat.
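
A minimal sketch of the sentinel wiring, assuming PyYAML; the class
and loader names follow the commit message, the constructor plumbing
is illustrative:

    import yaml

    class IncludeRef(str):
        """Opaque !include reference; skipped by workspace shape checks."""

    class ExternalRef(dict):
        """Opaque !external directive; skipped by workspace shape checks."""

    class PermissiveLoader(yaml.SafeLoader):
        pass

    PermissiveLoader.add_constructor(
        '!include', lambda loader, node: IncludeRef(loader.construct_scalar(node)))
    PermissiveLoader.add_constructor(
        '!external', lambda loader, node: ExternalRef(
            loader.construct_mapping(node, deep=True)))

    # Unknown tags still flatten to plain values via the multi-constructor,
    # preserving back-compat for tags the validator has never seen.
    def _flatten_unknown(loader, tag_suffix, node):
        if isinstance(node, yaml.MappingNode):
            return loader.construct_mapping(node, deep=True)
        if isinstance(node, yaml.SequenceNode):
            return loader.construct_sequence(node, deep=True)
        return loader.construct_scalar(node)

    PermissiveLoader.add_multi_constructor('', _flatten_unknown)

    def validate_workspace(ws, path='<root>'):
        if isinstance(ws, (IncludeRef, ExternalRef)):
            return []   # opaque ref: skip instead of "missing 'name'"
        return [] if 'name' in ws else [f"Workspace at {path}: missing 'name'"]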

Verified on the live failing org.yaml:
  ✓ org.yaml valid: Molecule AI Dev Team (0 direct workspaces;
    external refs not counted)

And on a synthetic case with one real bug (missing-name workspace
nested under children):
  ::error::Workspace at <unnamed>: missing 'name'
  ::error::Workspace at <unnamed>/<unnamed>: missing 'name'
  exit 1

So the validator still catches real shape bugs; it just doesn't
false-positive on the new !external pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 08:52:32 -07:00
785251f9ab Merge pull request 'fix(ci): replace cross-repo actions/checkout with direct git clone' (#3) from fix/git-clone-instead-of-actions-checkout into main 2026-05-07 08:40:43 +00:00
security-auditor
3eb62072a2 fix(ci): replace cross-repo actions/checkout with direct git clone
molecule-ci#2 attempted token: '' to force anonymous on the cross-repo
checkout. CI on plugin-molecule-careful-bash@663bf72 (post-merge of #2)
revealed actions/checkout@v4 errors with:

  ::error::Input required and not supplied: token

Even though token's input definition is required:false with a default,
the action's runtime auth-helper calls getInput('token', {required: true})
internally — empty string fails that check.

Fix: replace the cross-repo actions/checkout with a direct git clone
shell step. molecule-ci is public; anonymous git clone has neither the
auth-trips-Gitea-404 problem (#2's target) nor the empty-token-input-
required problem (#2's actual failure shape).

3 files updated, 4 sites total:
  * validate-plugin.yml (1 site)
  * validate-workspace-template.yml (2 sites)
  * validate-org-template.yml (1 site)

Refs: internal#46. Closes the third root cause uncovered by the
verification cycle on plugin-molecule-careful-bash.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:37:34 -07:00
d2bb7cf255 Merge pull request 'fix(ci): force anon checkout of public molecule-ci to bypass Gitea cross-repo 404' (#2) from fix/anon-cross-repo-checkout into main 2026-05-07 08:34:55 +00:00
security-auditor
7e2bde9b77 fix(ci): force anon checkout of public molecule-ci to bypass Gitea cross-repo 404
After lowercasing the slug (molecule-ci#1) and flipping molecule-ci public,
plugin/template/org-template CI still failed at the SECOND actions/checkout
step (the one that fetches molecule-ci itself for canonical validator scripts).

Failure mode in act_runner log:
  Run actions/checkout@v4
    repository: molecule-ai/molecule-ci
    path: .molecule-ci-canonical
  Syncing repository: molecule-ai/molecule-ci
  [git config http.https://git.moleculesai.app/.extraheader AUTHORIZATION: basic ***]
  ::error::The target couldn't be found.
   Failure - Main actions/checkout@v4

Root cause: actions/checkout@v4 sends `Authorization: basic <github.token>` —
the per-job Gitea-issued token, scoped to the calling plugin/template repo
only. On Gitea, an authenticated request that lacks repo-permission 404s
instead of falling back to anonymous-public-read (a Gitea-vs-GitHub
behaviour difference). Anonymous git clone of molecule-ci succeeds; the auth
header is what trips the 404.

Fix: pass `token: ''` to force anonymous fetch on the cross-repo checkouts.
molecule-ci is public; no auth is needed for read.

3 sites updated:
  * validate-plugin.yml (1 site)
  * validate-workspace-template.yml (2 sites — both jobs in the file)
  * validate-org-template.yml (1 site)

Verification plan: re-trigger plugin-molecule-careful-bash#2 after this
lands; it should go GREEN end-to-end. The 33 downstream lowercase-slug
PRs are NOT mass-merged until that verification passes.

Refs: internal#46

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:23:37 -07:00
226975d377 Merge pull request 'fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs' (#1) from fix/lowercase-org-slug into main 2026-05-07 08:07:02 +00:00
security-auditor
2bcd52b444 fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs
Gitea is case-sensitive on owner slugs; canonical is lowercase
`molecule-ai/...`. Mixed-case `Molecule-AI/...` refs fail-at-0s
when the runner tries to resolve the cross-repo workflow / checkout.

Same fix as molecule-controlplane#12. Mechanical case-correction;
no behavior change beyond making CI resolve again.

Refs: internal#46

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 00:58:55 -07:00
Hongming Wang
b31b722899
Merge pull request #33 from Molecule-AI/feat/turn-smoke-publish-image-gate
ci(publish): bump boot-smoke timeout to 90s/120s for SDK-init-wedge coverage
2026-05-01 17:52:28 -07:00
Hongming Wang
50e84f89e9 ci(publish): bump boot-smoke timeout to 90s/120s for SDK-init-wedge coverage
Pairs with molecule-core PR #2473 (run_executor_smoke now consults
runtime_wedge.is_wedged() at the end of every result path).

10s smoke timeout was shorter than claude-agent-sdk's 60s
initialize() handshake — when a malformed CLI argv made the SDK
spin on init (PR #25 in claude-code template), the outer wait_for
fired first, run_executor_smoke saw "execution proceeding past
imports → timeout → PASS" and shipped the broken image to GHCR.

Bumping to 90s lets the SDK time itself out, the executor's wedge
catch arm runs, and runtime_wedge.mark_wedged() flips the flag
that smoke_mode now reads. Outer `timeout` bumped to 120s — the
runner-level safety net stays slightly longer than the inner cap
so a smoke_mode regression that doesn't terminate surfaces as exit
124 with a clear error, not just exit 1.

Step comment names this calibration explicitly so a future
contributor doesn't shrink it back without injecting a wedge in
the smoke_mode unit tests first. Error message references
runtime_wedge so a failure-mode reader knows where to look.
2026-05-01 17:48:51 -07:00
Hongming Wang
a79ef8e9fa
Merge pull request #32 from Molecule-AI/feat/template-validation-aggregator
ci: add Template validation aggregator (restore historical check name)
2026-04-30 23:01:43 -07:00
Hongming Wang
375bcc4376 ci(validate-workspace-template): add Template validation aggregator
The workflow was refactored from one `validate` job (display name
"Template validation") into matrix-named validate-static +
validate-runtime jobs ("(static)" / "(runtime)" suffixes) for
fork-PR security. The new check names — `validate / Template
validation (static)` and `validate / Template validation
(runtime)` — never match the original `validate / Template
validation` that template-repo branch protection requires. Result:
auto-merge silently hangs in BLOCKED forever on every template
repo because the required check never reports.

Add a third aggregator job `template-validation` (display name
"Template validation") that depends on both real jobs and emits
the original check name. `if: always()` so it reports out even
when validate-static fails — without that GitHub marks the
aggregator SKIPPED and branch protection still blocks because the
required check never reaches a final state.

Treats `skipped` as pass for validate-runtime so fork PRs (where
runtime is intentionally skipped on the security gate) don't
become un-mergeable.
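
A minimal sketch of the aggregation predicate the step implements,
assuming the result strings (`success`, `skipped`) that the Actions
`needs.*.result` context exposes:

    def aggregate(static_result: str, runtime_result: str) -> bool:
        # validate-static must genuinely pass; validate-runtime may be
        # 'skipped' (fork PRs skip runtime on the security gate).
        return (static_result == 'success'
                and runtime_result in ('success', 'skipped'))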

Caught while shipping the boot-smoke fixes for openclaw#11 and
hermes#29 — both PRs sat BLOCKED with all real checks green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 23:01:01 -07:00
Hongming Wang
2bbc6e0e80
Merge pull request #31 from Molecule-AI/fix/publish-template-smoke-cleanup
fix(publish-template-image): tolerate host-side uid 1000 ownership in smoke cleanup
2026-04-30 21:56:53 -07:00
Hongming Wang
da6407e58a fix(publish-template-image): make smoke-cleanup tolerate host-side uid 1000 ownership
Third hot-fix for #2275 Phase 2 — claude-code re-run #3 showed the
boot smoke ITSELF passing (`[smoke-mode] PASS: timed out past import-
tree (imports healthy)`), but the workflow step still exited 1 because
the post-smoke cleanup `rm -rf "${SMOKE_CONFIG_DIR}"` failed with
`Permission denied`.

Root cause: the image entrypoint (entrypoint.sh) does
`chown -R agent:agent /configs` before exec'ing molecule-runtime as
uid 1000. Because /configs is a bind-mount of the host's mktemp dir,
the chown propagates to the host — the runner user (the GHA `runner`
account, NOT root) can no longer delete the files inside it. With
`set -e` in effect, that rm exit propagates and we report failure
even though the gate itself passed.

Fix: best-effort rm with sudo fallback and final `|| true`. The
runner is ephemeral; /tmp gets cleaned automatically at job teardown.

Verified against run 25202859503 which showed every other step green
+ the smoke itself passing — only this rm was the blocker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:56:36 -07:00
Hongming Wang
86092315a7
Merge pull request #30 from Molecule-AI/fix/publish-template-smoke-pythonpath
fix(publish-template-image): inject PYTHONPATH=/app for boot smoke
2026-04-30 21:54:18 -07:00
Hongming Wang
a9df950801 fix(publish-template-image): inject PYTHONPATH=/app to match production provisioner
Second hot-fix for #2275 Phase 2 — boot smoke kept failing with
`ModuleNotFoundError: No module named 'adapter'` even after the
permissions fix landed.

Root cause: the production platform's provisioner sets PYTHONPATH=/app
on every workspace container (provisioner.go:563) so molecule-runtime —
a pip console_scripts entry point whose sys.path[0] is /usr/local/bin,
NOT /app — can resolve `importlib.import_module('adapter')`. The
existing static import smoke didn't hit this because `python3 -c "import
$mod"` adds cwd to sys.path; only the entry-point invocation needs
PYTHONPATH.

Mirrors prod by passing `-e PYTHONPATH=/app` in the docker run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:54:02 -07:00
Hongming Wang
b4e17014fa
Merge pull request #29 from Molecule-AI/fix/publish-template-smoke-perms
fix(publish-template-image): chmod a+rX + drop :ro so agent can read /configs
2026-04-30 21:49:46 -07:00
Hongming Wang
a5212a349b fix(publish-template-image): chmod a+rX + drop :ro so agent can read /configs
Hot-fix for #2275 Phase 2 — the boot smoke step in v1@3c8f8fe failed
on every template publish with `PermissionError: [Errno 13] Permission
denied: '/configs/config.yaml'` because `mktemp -d` creates the dir
with mode 700 and `chmod -R go+r` adds 'r' to files but doesn't add
'x' to directories. Inside the image the entrypoint drops priv to
uid 1000 (agent), which then cannot traverse /configs to even reach
config.yaml — main.py exits before any executor code runs.

Two changes:
1. `chmod -R a+rX` (capital X) adds 'x' to directories AND already-
   executable files, so the temp dir becomes traversable for agent
   while config.yaml stays a regular world-readable file.
2. Drop `:ro` on the mount so the entrypoint's `chown -R agent
   /configs` succeeds. The container is ephemeral; modifications to
   the host mktemp dir don't matter and the dir gets nuked right
   after the smoke run.

Reproduced + diagnosed against claude-code publish run 25202651546
which failed within a few seconds on Path('/configs/config.yaml').exists()
in molecule_runtime/config.py:298.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:49:26 -07:00
Hongming Wang
3c8f8fe48b
Merge pull request #28 from Molecule-AI/feat/publish-template-image-boot-smoke
feat(publish-template-image): add execute()-against-stub-deps boot smoke (#2275)
2026-04-30 21:44:30 -07:00
Hongming Wang
434d1782e6 feat(publish-template-image): add execute()-against-stub-deps boot smoke (#2275)
Adds a step between the existing import smoke and the GHCR push that
boots the just-built image with MOLECULE_SMOKE_MODE=1, which routes
molecule-runtime through the new smoke_mode.run_executor_smoke() —
invokes executor.execute(stub_ctx, stub_queue) once with a 10s timeout.

Healthy import tree → execution proceeds far enough to hit a network
boundary and times out (exit 0). Broken lazy import inside an
`async def execute(...)` body → ImportError/ModuleNotFoundError
(exit 1). The 2026-04-2x v0→v1 a2a-sdk migration shipped 5 such
regressions in templates that the existing static import smoke missed.
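
A minimal sketch of the gate's decision logic, assuming an async
execute() contract; the real smoke_mode.run_executor_smoke ships in
molecule-core and the stub arguments here are hypothetical stand-ins:

    import asyncio
    import sys

    async def run_executor_smoke(executor, stub_ctx, stub_queue,
                                 timeout: float = 10.0) -> int:
        try:
            # One real call into execute(); the stubs are inert stand-ins.
            await asyncio.wait_for(executor.execute(stub_ctx, stub_queue), timeout)
        except asyncio.TimeoutError:
            # Imports resolved and execution reached a network boundary.
            print('[smoke-mode] PASS: timed out past import-tree (imports healthy)')
            return 0
        except (ImportError, ModuleNotFoundError) as exc:
            print(f'[smoke-mode] FAIL: broken lazy import: {exc!r}', file=sys.stderr)
            return 1
        return 0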

Skip path: when the installed runtime predates 0.1.60 (pre-smoke_mode),
the step prints a warning + exits 0. Templates pinned to older runtimes
keep publishing without this gate flipping red; cascade-triggered
builds (which forward the just-published version as RUNTIME_VERSION)
get the gate automatically.

Belt-and-suspenders `timeout 60` wrapper so smoke_mode itself can't
wedge the runner past one minute per template.

After merge, bump v1 tag to point at the new main SHA (caller repos
pin to @v1; the change has no effect until the moving tag advances).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:41:35 -07:00
Hongming Wang
53f01d5b44
Merge pull request #27 from Molecule-AI/auto/p135-fork-pr-lockdown
ci: lock down validate-workspace-template against fork-PR untrusted code (P135)
2026-04-30 01:08:40 -07:00
Hongming Wang
d420b4a24f ci: lock down validate-workspace-template against fork-PR untrusted code (P135)
Splits the reusable validator into two jobs to keep external fork
PRs from running arbitrary template code on the runner.

Background

The reusable workflow runs three primitives that execute
template-supplied code:
  - pip install -r requirements.txt  (setup.py + post-install hooks)
  - importlib.exec_module(adapter)   (top-level Python in adapter.py)
  - docker build                     (RUN steps in Dockerfile)

Token scope is already minimal (contents: read), GitHub forced
fork-PR tokens read-only in 2021, and the workflow_call interface
doesn't accept secrets. So the actual exploit surface is "what can
a malicious actor do with arbitrary code execution on a GitHub-
hosted runner that has no useful credentials?" — answer: crypto-
mine, DNS-exfiltrate runner metadata, attempt lateral movement
within the runner's network. Annoying, not catastrophic, but a
real attack surface that this PR closes.

The fix

Two-job split:

  validate-static    Always runs, including external fork PRs.
                     File-content checks (secret scan, YAML parse,
                     AST inspection of adapter.py without import),
                     pip install only the validator's pyyaml dep
                     (not the template's requirements.txt). NO
                     third-party code execution.

  validate-runtime   Skipped when github.event.pull_request.head.
                     repo.fork == true. pip install requirements.txt
                     + adapter import + docker build. Internal PRs
                     and push events to internal branches still get
                     the full coverage.

The validator script gains a --static-only flag that skips
check_adapter_runtime_load() (the function that calls
exec_module). The validate-static job uses it; validate-runtime
uses the existing full mode.
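
A minimal sketch of an import-free static check, assuming a
hypothetical helper name; the textual BaseAdapter match is
deliberately weaker than the runtime-load check it complements:

    import ast
    from pathlib import Path

    def check_adapter_static(path: Path) -> list[str]:
        """Inspect adapter.py without importing it; no template code runs."""
        try:
            tree = ast.parse(path.read_text())
        except SyntaxError as exc:
            return [f'{path}: syntax error: {exc}']

        def names_baseadapter(base: ast.expr) -> bool:
            return ((isinstance(base, ast.Name) and base.id == 'BaseAdapter')
                    or (isinstance(base, ast.Attribute)
                        and base.attr == 'BaseAdapter'))

        if any(isinstance(n, ast.ClassDef) and any(map(names_baseadapter, n.bases))
               for n in ast.walk(tree)):
            return []
        return [f'{path}: no class textually inheriting from BaseAdapter']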

Trade-off

External contributors get static feedback only on their PR. If
their template metadata passes static checks but breaks runtime
loading, branch protection on staging/main blocks the merge once
runtime validation runs (post-merge or after an internal
contributor reposts). Fewer false-positive CI failures for honest
external contributors; same coverage at the merge-protected
boundary.

What this does NOT close

- Maintainer-approved external PRs that consciously execute
  third-party code. The maintainer must approve a workflow run
  via GitHub's first-time-contributor gate; that's a human
  decision, not a workflow-level gate.
- requirements.txt that pulls a malicious transitive dep from
  PyPI even on internal PRs. Mitigated by branch-protection +
  human review of PRs that touch requirements.txt.

Closes task #135.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:07:58 -07:00
Hongming Wang
fd60655089
Merge pull request #26 from Molecule-AI/auto/p133-readme-v1-pin
docs: pin reusable-workflow examples from @main to @v1 (P133)
2026-04-30 01:04:32 -07:00
Hongming Wang
8f041a9485 docs: pin reusable-workflow examples from @main to @v1 (P133)
The v1 tag exists in this repo but README + docs still showed
@main in the caller-pattern examples. Followers of the docs were
copy-pasting unstable @main pins. Fix: update all 6 example
references to @v1 across:

- README.md (4 examples)
- docs/template-contract.md (1 example)
- .github/workflows/auto-promote-staging-pr.yml header comment
  (1 example, just shipped in PR #25)

Operational note: v1 is meant to track the latest stable patch
within the v1 major. Cutting a new v1.X.Y means moving the v1 tag
forward; a breaking change gets a new v2 tag instead. Same
convention as actions/checkout@v4 etc.

Doesn't migrate any consumer repo. Consumer migration from @main
to @v1 is a per-repo follow-up; this PR ships the docs that
guide that migration.

Closes task #133.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:04:06 -07:00
Hongming Wang
6afeb47e5e
Merge pull request #25 from Molecule-AI/auto/p9-reusable-auto-promote
ci: extract PR-based auto-promote-staging into reusable workflow (P9)
2026-04-30 01:02:34 -07:00
Hongming Wang
e7c6798fba ci: extract PR-based auto-promote-staging into reusable workflow (P9)
Moves the canonical PR-based staging→main auto-promote flow into a
reusable workflow that protected-branch repos can call instead of
duplicating ~240 lines of YAML each.

Why two reusable variants in this repo:

  auto-promote-staging.yml           (existing — ff-only, direct push)
    For repos WITHOUT required-status-checks branch protection.
    Already used for molecule-ci, molecule-app, molecule-docs,
    molecule-monorepo. Cannot satisfy protected-branch rules
    requiring status checks "set by expected GitHub apps".

  auto-promote-staging-pr.yml        (THIS PR — PR-based)
    For repos WITH required-status-checks. Opens (or reuses) a
    staging→main PR, enables auto-merge, lets the merge queue land
    it. Required path for molecule-core + molecule-controlplane
    (per the 2026-04-28 incident where direct ff-only push was
    failing GH006 on protected refs).

Inputs:
  gates           — CSV of workflow filenames to require green
  target-branch   — promote target (default: main)
  source-branch   — promote source (default: staging)
  enabled-var     — repo variable name gating rollout
                    (default: AUTO_PROMOTE_ENABLED)
  merge-method    — merge|squash|rebase (default: merge — matches
                    user preference for merge commits over squash)
  force           — pass through caller's workflow_dispatch.force input

Caller pattern (kept minimal — see header comment in the workflow):

  on:
    workflow_run:
      workflows: [CI, ...]
      types: [completed]
    workflow_dispatch:
      inputs:
        force: ...
  permissions:
    contents: write
    pull-requests: write
  jobs:
    promote:
      uses: Molecule-AI/molecule-ci/.github/workflows/auto-promote-staging-pr.yml@main
      with:
        gates: "ci.yml,e2e-staging-canvas.yml,..."
        force: ${{ github.event.inputs.force == 'true' }}
      secrets: inherit

The caller's `on.workflow_run.workflows` (display names) MUST stay in
sync with the `gates` input (filenames). The reusable can't validate
this because GitHub Actions decouples display names from filenames;
this is the same coupling the original molecule-core workflow had.

Migration of the existing 242-line molecule-core workflow to this
reusable is a follow-up PR. Same pattern applies to
molecule-controlplane once it grows protected-branch
auto-promote (today CP uses the auto-sync-main-to-staging shape
inherited from #142).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:01:52 -07:00
Hongming Wang
e21371f40e
Merge pull request #24 from Molecule-AI/fix/validate-fetch-scripts-from-ci
fix(validate): fetch scripts from molecule-ci instead of vendored copy
2026-04-29 02:01:13 -07:00
Hongming Wang
56facc8a42 fix(validate): fetch validator scripts from molecule-ci instead of expecting them in caller
The validate-org-template.yml and validate-plugin.yml workflows
expected `.molecule-ci/scripts/` to be vendored INTO each calling
repo. That worked for the repos that copied the directory in, but
broke on the ones that didn't:

- molecule-ai-org-template-medo-smoke
- molecule-ai-org-template-molecule-worker-gemini
- molecule-ai-org-template-reno-stars
- molecule-ai-plugin-molecule-compliance
- molecule-ai-plugin-molecule-freeze-scope
- molecule-ai-plugin-molecule-prompt-watchdog

Surfaced when the secret-scan rollout PRs hit those repos and the
required validate check failed on missing
`.molecule-ci/scripts/requirements.txt`.

Mirror the same fix already in validate-workspace-template.yml: a
second `actions/checkout@v4` of molecule-ci into
`.molecule-ci-canonical/`, with script paths re-pointed accordingly.
Single source of truth — callers never need to vendor or sync.

Also adds `.molecule-ci-canonical` to the secret-scan SKIP_DIRS so
the side-checked-out tree doesn't get walked.

Callers can drop their vendored `.molecule-ci/scripts/` copies in a
follow-up cleanup. Both shapes work after this PR — the vendored
copy is harmless dead weight, not a conflict.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 01:56:25 -07:00
Hongming Wang
2e40916b57
fix(validator): handle abstract intermediates + class-aliasing + lock GITHUB_TOKEN scope (#21)
Independent post-merge review of #19 surfaced two more findings.
Both shipped here.

Q3 — abstract intermediates + multiple-concrete-classes.

  The class-discovery filter from O1 (#19) only excluded BaseAdapter
  itself. Two failure modes slipped through:

    (a) A locally-defined abstract intermediate
        `class FrameworkAdapter(BaseAdapter): @abstractmethod ...`
        passed the filter, falsely satisfying "at least one
        concrete subclass" while still being non-instantiable at
        workspace boot.

    (b) A template defining BOTH `class FrameworkAdapter(BaseAdapter)`
        AND `class ConcreteAdapter(FrameworkAdapter)` had both pass
        the filter, producing a silent ambiguity where the runtime's
        class-discovery picks one per its resolution rules — wrong
        class loaded after a future runtime refactor.

  Fixes:
    - Add `not inspect.isabstract(obj)` to the discovery filter so
      abstract intermediates are excluded.
    - Hard-error if `len(adapter_classes) > 1` listing both names so
      the contributor knows exactly which classes are competing.

  Three new tests pin the behaviors:
    - test_abstract_intermediate_alone_does_not_count
    - test_abstract_plus_concrete_passes_with_concrete_only
    - test_multiple_concrete_baseadapter_subclasses_errors

Identity-based deduplication.

  Caught against the real langgraph template during smoke-testing
  the Q3 fix: production adapters often do
  `Adapter = ConcreteAdapter` as a module-level alias for the
  runtime's discovery convention. `vars(mod)` returns BOTH bindings
  pointing at the same class object, so the new
  multiple-concrete-classes error fired falsely on every aliased
  template.

  Fix: deduplicate by `id(obj)` BEFORE counting, so the same class
  object under multiple bindings counts once. New regression test
  test_aliased_concrete_class_is_deduplicated pins this against
  any future filter regression.
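
A minimal sketch of the discovery filter with all three guards plus
the id()-dedup, assuming the module object and its import name from
the runtime-load check; the function name is hypothetical:

    import inspect
    from molecule_runtime.adapters.base import BaseAdapter

    def discover_adapter_classes(mod, module_name: str) -> list[type]:
        seen: set[int] = set()
        found: list[type] = []
        for obj in vars(mod).values():
            if not (isinstance(obj, type) and issubclass(obj, BaseAdapter)):
                continue
            if obj is BaseAdapter or inspect.isabstract(obj):
                continue           # Q3(a): abstract intermediates excluded
            if obj.__module__ != module_name:
                continue           # O1 (#19): bare re-exports don't count
            if id(obj) in seen:
                continue           # Adapter = ConcreteAdapter aliases count once
            seen.add(id(obj))
            found.append(obj)
        if len(found) > 1:
            raise SystemExit('ambiguous adapter classes: '
                             + ', '.join(c.__name__ for c in found))
        return found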

Existing tests updated to use fully-concrete BaseAdapter subclasses
(matching production templates) since the new abstract-filter
correctly rejects partial stubs that don't override every abstract
method BaseAdapter declares (5 methods: name, display_name,
description, setup, create_executor).

Q5 — GITHUB_TOKEN scope lockdown.

  validate-workspace-template.yml runs untrusted-by-design code from
  the calling template repo: pip post-install hooks, adapter.py
  imports, Dockerfile RUN steps. Each of those primitives executes
  with GITHUB_TOKEN in env. The workflow had no `permissions:`
  block, defaulting to whatever the calling repo grants — often
  contents: write.

  Add `permissions: contents: read` at the workflow level. Worst-
  case-with-token now drops to "read public repo state" — no write
  to issues, no push to branches, no comment-spam, no workflow
  re-trigger. Partial mitigation; the deeper `pull_request_target`
  discipline is bigger scope (tracked separately).

Verification:
  - 47/47 tests pass (was 43: +3 abstract/multi-concrete, +1 alias)
  - All 8 production templates pass the full updated validator
    end-to-end with 0 warnings / 0 errors

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:27:09 -07:00
Hongming Wang
30e094220a
chore: remove accidentally-committed __pycache__ + gitignore Python caches (#20)
Cleanup of #19's commit, which inadvertently included scripts/__pycache__/
.pyc files generated by running pytest locally during the review-
followup work. The repo's .gitignore had no Python-cache section at
all, so nothing prevented this — adding it now to make the same
mistake structurally impossible.

Files removed from tracking (still ignored locally going forward):
  - scripts/__pycache__/migrate-template.cpython-313.pyc
  - scripts/__pycache__/test_migrate_template.cpython-313-pytest-9.0.3.pyc
  - scripts/__pycache__/test_validate_workspace_template.cpython-313-pytest-9.0.3.pyc
  - scripts/__pycache__/validate-workspace-template.cpython-313.pyc

Gitignore additions cover the standard set:
  __pycache__/, *.pyc, *.pyo, *.pyd, .pytest_cache/, .mypy_cache/,
  .ruff_cache/

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:18:46 -07:00
Hongming Wang
f125d68910
fix(validator): address post-merge review findings on #17 + #18 (#19)
Independent code review of #17 (adapter runtime-load) and #18 (schema
versioning) surfaced four Required and three Optional findings worth
fixing before the patterns harden into the codebase.

Required:

  R1: Delete .molecule-ci/scripts/{validate-workspace-template,
      migrate-template}.py — dead-vendored mirror. The new validator
      workflow invokes .molecule-ci-canonical/scripts/ (the canonical
      clone), not .molecule-ci/scripts/. The mirror was the exact drift
      class #90 is supposed to eliminate: next contributor would edit
      one copy and silently diverge. Other workflows (validate-plugin,
      validate-org-template) still use the legacy path and keep their
      own scripts there — so removing OUR two files is asymmetric but
      correct, and the legacy path can phase out organically.

  R2: validate-workspace-template.yml's `cache-dependency-path` pointed
      at the validator's own deps file (just `pyyaml>=6.0`). Pip cache
      key never invalidated when the template added crewai/langgraph/
      etc. Repoint to the calling repo's `requirements.txt`, which is
      the file the heavy install actually uses one step later.

  R3: `_check_schema_v1` looped `SCHEMA_V1_REQUIRED_KEYS` and re-emitted
      "missing required key `template_schema_version`" — but the
      dispatcher already verified the field is present + int before
      reaching v1, so that branch was dead defensive code. Skip it
      explicitly with a comment, but keep the field in the constant for
      contract documentation + the unknown-keys filter.

  R4: `_template_adapter_under_validation` was a fixed sys.modules key,
      meaning back-to-back invocations in the same Python process
      shared the slot. Use a per-call-unique name keyed on the absolute
      path's hash. No observed bug today; defensive-only.
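
A minimal sketch of the R4 fix, assuming a hypothetical helper; the
commit specifies only "the absolute path's hash", so the algorithm
and prefix length here are illustrative:

    import hashlib
    from pathlib import Path

    def unique_module_name(adapter_path: Path) -> str:
        # Key on the absolute path's hash so back-to-back invocations
        # in one process never share a sys.modules slot.
        digest = hashlib.sha256(str(adapter_path.resolve()).encode()).hexdigest()
        return f'_template_adapter_{digest[:12]}'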

Optional:

  O1: Class-discovery filter now also requires `__module__ == module_name`.
      Without this, a `from molecule_runtime.adapters.base import
      AbstractCLIAdapter` re-export would count as a "real" adapter,
      masking the genuine "no concrete subclass" case the gate exists
      to catch. Cheap and forward-proofs against any future abstract
      intermediate the runtime might expose. Added a sibling test
      pinning the new behavior.

  O2: migrate-template.py's docstring claimed "uses ruamel.yaml when
      available" but the implementation only ever calls `yaml.safe_dump`.
      Replaced the lie with a clearer caveat block + a forward-pointer
      to ruamel-when-comments-detected as a future enhancement.

  O3: Reordered the workflow so the secret-scan step runs BEFORE
      `pip install -r requirements.txt`. Same threat surface as the
      Docker build smoke (which already runs first), but cheap defense-
      in-depth: a malicious template PR adding a malicious dep to
      requirements.txt now has its post-install hook execute AFTER the
      secret scanner has already inspected the diff.

Test changes:

  - test_adapter_with_no_baseadapter_subclass_errors updated for the
    new error message ("no concrete class inheriting from").
  - New test_only_imported_baseadapter_subclass_does_not_count pins
    the O1 __module__-filter behavior.
  - 43/43 tests pass (was 42/42 before the new test).
  - Real langgraph template still passes the full validator end-to-end.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:17:44 -07:00
Hongming Wang
84a104a146
feat(validator): schema-version dispatch + migrate-template.py framework (#18)
Closes the schema-versioning workstream of #90. Sets up the machinery
for "we will be updating a lot" (the user's framing) without forcing
the first real schema bump to discover semantics under deadline
pressure. Today every template is at v1; this PR adds the framework,
ships zero behavior change for v1 templates, and reserves v2+ for
when there's a concrete reason to bump.

Validator changes:

  - `KNOWN_SCHEMA_VERSIONS = {1}` — the set the validator currently
    accepts. Future bumps add to this set.
  - `DEPRECATED_SCHEMA_VERSIONS: set[int] = set()` — versions accepted
    with warning during a deprecation window.
  - Per-version contract: `_check_schema_v1(config)` enforces the v1
    REQUIRED_KEYS / OPTIONAL_KEYS / KNOWN_RUNTIMES contract — exactly
    what the previous monolithic check_config_yaml did.
  - Dispatch table: `SCHEMA_CHECKS = {1: _check_schema_v1}`. Versions
    that aren't in the table hard-error.

  - check_config_yaml() now: reads template_schema_version → emits
    deprecation warning if applicable → dispatches to the right
    SCHEMA_CHECKS entry → unknown versions hard-error with actionable
    instructions ("add a SCHEMA_V<N> block").

  - Schema versions are FROZEN once shipped: never edit a SCHEMA_V<N>
    constant in place. To bump, ADD v<N+1> alongside, deprecate v<N>,
    migrate consumers, drop v<N> next cycle. Header comment documents
    the discipline.
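
A minimal sketch of the dispatch shape, using the constant and
function names the commit lists (the v1 body is elided):

    import warnings

    KNOWN_SCHEMA_VERSIONS = {1}
    DEPRECATED_SCHEMA_VERSIONS: set[int] = set()

    def _check_schema_v1(config: dict) -> list[str]:
        # v1 contract: REQUIRED_KEYS / OPTIONAL_KEYS / KNOWN_RUNTIMES checks,
        # exactly what the previous monolithic check did. Elided here.
        return []

    SCHEMA_CHECKS = {1: _check_schema_v1}

    def check_config_yaml(config: dict) -> list[str]:
        version = config.get('template_schema_version')
        if not isinstance(version, int):
            # Short-circuit: without an int version we can't pick a contract.
            return ["missing or non-int required key 'template_schema_version'"]
        if version not in SCHEMA_CHECKS:
            return [f'unknown schema version {version}: add a SCHEMA_V{version} block']
        if version in DEPRECATED_SCHEMA_VERSIONS:
            warnings.warn(f'template_schema_version {version} is deprecated')
        return SCHEMA_CHECKS[version](config)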

New script `migrate-template.py`:

  - `MIGRATIONS: dict[int, Callable[[dict], dict]]` registry — each
    entry maps a SOURCE version to the function that produces the
    next version's dict. Empty today.
  - `migrate_config(config, from, to)` chains migrations sequentially.
    Forward-only (errors on backward), errors on missing intermediate
    steps (never silently skip), asserts every migration stamps its
    output's template_schema_version.
  - CLI: `migrate-template.py [--from N] [--to M] [--dry-run] DIR`.
    Defaults: --from = whatever config.yaml declares, --to = highest
    reachable from MIGRATIONS (currently 1, so a no-op).
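
A minimal sketch of the chaining logic, assuming parameter names
from_v/to_v in place of the commit's `from`/`to` (Python keywords):

    from typing import Callable

    # Each entry maps a SOURCE version to the function producing the next
    # version's dict. Empty today; the first real bump adds MIGRATIONS[1].
    MIGRATIONS: dict[int, Callable[[dict], dict]] = {}

    def migrate_config(config: dict, from_v: int, to_v: int) -> dict:
        if to_v < from_v:
            raise ValueError('forward-only: cannot migrate backward')
        for v in range(from_v, to_v):
            if v not in MIGRATIONS:
                raise ValueError(f'no migration step registered for v{v} -> v{v + 1}')
            config = MIGRATIONS[v](config)
            assert config.get('template_schema_version') == v + 1, \
                f'migration v{v} did not stamp its output version'
        return config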

Behavior change to the existing
test_missing_required_keys_errors test:

  Previously the validator emitted 3 "missing required key" errors
  when name/runtime/template_schema_version were all missing. Now it
  short-circuits on missing version with a single actionable error —
  listing downstream missing keys is noise on top of the real
  problem (no version means we can't pick a contract). The test was
  updated to pin the new behavior; a new sibling test
  (test_missing_required_keys_under_v1_dispatch_errors) pins that v1
  still lists name/runtime/etc. when present-with-v1.

Verification:

  - 42/42 tests pass (20 prior + 9 new schema-dispatch tests in
    test_validate_workspace_template.py + 17 new migrator tests in
    test_migrate_template.py).
  - Real langgraph template runs through the full updated validator
    end-to-end with 0 warnings / 0 errors.

This + #17 means #90 is done end-to-end:
  - Phase 2: validator green on all 8 templates as a required check (already shipped)
  - Phase 2.5: adapter.py runtime-load contract (#17)
  - Phase 3: schema versioning + migration framework (this PR)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:07:04 -07:00
Hongming Wang
8309a55e6c
feat(validator): runtime-load check for adapter.py contract (#17)
Adds the third workstream of #90 (eliminate template repo drift): a
strong contract check that exercises adapter.py the same way the
runtime does at workspace boot. Without this, a template can have a
syntactically-valid Dockerfile + an adapter.py that ImportErrors at
runtime, build clean through Docker smoke, and crash on first user
prompt — exactly the human-error class #90 is meant to eliminate.

Existing checks ranked from weakest to strongest:

  1. check_adapter()         — text-grep for legacy `molecule_ai`
                                imports. Catches one specific footgun.
  2. Docker build smoke      — `docker build` succeeds. Doesn't RUN
                                the image, so adapter.py is never
                                imported. Misses every adapter-load
                                bug.
  3. (NEW) check_adapter_runtime_load — imports adapter.py via the
                                same `importlib.spec_from_file_location`
                                path the runtime uses, and asserts at
                                least one class inherits from
                                molecule_runtime.adapters.base.BaseAdapter.

Hard-error conditions:
  - adapter.py raises any exception during import (SyntaxError,
    ImportError, NameError, etc.). Same exception would crash the
    workspace at boot.
  - No class in the module inherits from BaseAdapter. The runtime's
    class-discovery silently falls through to the default langgraph
    executor in this case — exactly the silent-failure shape the
    contract is meant to catch.

Skip conditions:
  - No adapter.py exists. Templates without one inherit the default
    executor by design (policy, not drift).
  - molecule-ai-workspace-runtime not importable in the validator
    env. Warns loudly so the CI-config bug surfaces, but doesn't
    hard-fail (we'd be reporting "your adapter is broken" when the
    actual cause is missing infra).
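
A minimal sketch of the check as this PR shapes it (later hardened
in #19/#21), assuming the import path the commit names:

    import importlib.util
    from pathlib import Path

    def check_adapter_runtime_load(template_dir: Path) -> list[str]:
        adapter_path = template_dir / 'adapter.py'
        if not adapter_path.exists():
            return []   # no adapter.py: default executor by design, not drift
        try:
            from molecule_runtime.adapters.base import BaseAdapter
        except ImportError:
            print('::warning::molecule-ai-workspace-runtime not importable; '
                  'skipping runtime-load check')
            return []
        spec = importlib.util.spec_from_file_location('adapter', adapter_path)
        mod = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(mod)   # same path the runtime uses at boot
        except Exception as exc:
            return [f'adapter.py raised during import: {exc!r}']
        if not any(isinstance(o, type) and issubclass(o, BaseAdapter)
                   and o is not BaseAdapter for o in vars(mod).values()):
            return ['no class in adapter.py inherits from BaseAdapter']
        return []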

Workflow update: validate-workspace-template.yml now installs the
template's requirements.txt before invoking the validator (or
falls back to installing molecule-ai-workspace-runtime alone if the
template has no requirements.txt). This satisfies the runtime-load
check's import dependencies the same way the workspace container
does at boot — `pip install -r requirements.txt`.

Verified locally:
  - 20/20 tests in test_validate_workspace_template.py pass
    (14 existing + 6 new).
  - Real langgraph template passes the full new validator including
    runtime-load (0 warnings, 0 errors).
  - Surveyed all 8 production templates' adapter.py shapes; every
    one already inherits from BaseAdapter, so this check turns green
    on first run with zero per-template fixups needed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:02:33 -07:00
Hongming Wang
b24e11976a
docs: recommend @v1 over @main in reusable-workflow adoption snippets (#16)
Closes the README half of monorepo task #133. The v1 git tag now
exists at the current main HEAD (8b0fbac — includes the auto-promote
fail-loud fix from #15). Consumers should pin reusable-workflow refs
to @v1 so future breaking changes land on @v2 with @v1 staying
backward-compatible — same pattern as `actions/checkout@v4`.

This commit only updates the EXAMPLE adoption snippets in the
workflow headers. Existing consumers pinned at @main keep working
identically (the workflow content is unchanged); they migrate at
their own pace when next touching their CI. New consumers see @v1
as the recommended pin.

Touched:

  - auto-promote-branch.yml (also added a paragraph explaining the
    @v1 vs @main convention so future contributors don't reintroduce
    @main as the recommendation)
  - auto-promote-staging.yml (the snippet inside this file's header
    references auto-promote-branch.yml, also moved to @v1)
  - disable-auto-merge-on-push.yml
  - publish-template-image.yml

The validate-* workflows (validate-plugin.yml, validate-org-template.yml,
validate-workspace-template.yml) don't have adoption snippets in their
headers — adding canonical examples there is a separate scope and not
part of this PR.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:14:12 -07:00
Hongming Wang
8b0fbac78a
fix(auto-promote): fail loud on 403 instead of silently degrading (#15)
Independent code review caught a Critical issue inherited from the
pre-extraction workflow: the branch-protection API call falls through
to '{}' on any non-200, then the empty-GATES check treats this as
"no gates configured (or API inaccessible)" and sets ok=true. Combined
with --ff-only being ancestry-only (not test-status), a green-but-
flaky source branch could ff-promote red commits to the target with
zero CI enforcement.

The conflation of three response classes is the bug:

  200 with .contexts[] populated  → honor the gates (correct)
  200 with empty .contexts        → "no gates configured" → ok=true (correct)
  404 (no branch protection)      → "no gates configured" → ok=true (correct)
  403 (token lacks permission)    → silently treated like 404 (BUG)

Use `gh api -i` to capture the HTTP status line and discriminate:

  - 200 → extract body, proceed to gate-check loop
  - 404 → legitimate fallback to --ff-only safety, log notice
  - 403/401 → fail loud with a concrete fix ("add administration: read
    to your caller's permissions block")
  - any other → fail loud with the response prefix for debugging

Also:

  - Update the README in the workflow header to document the
    administration: read requirement.
  - Add administration: read to molecule-ci's own self-caller
    (auto-promote-staging.yml) so its behavior is preserved.

Verified locally against four real API responses:

  - molecule-core/staging        → HTTP 200, 8 gates → loop runs
  - molecule-ci/main             → HTTP 200, 0 gates → ok=true (notice)
  - hackathon org-template/main  → HTTP 200, 0 gates → ok=true (notice)
  - this-repo-does-not-exist     → HTTP 404 → legitimate fallback path

Closes a Critical from the post-merge review of #14.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:11:59 -07:00
236923196f
Merge pull request #13 from Molecule-AI/feat/strict-template-drift-check
feat(validate-workspace-template): strict drift gate + canonical-fetch workflow
2026-04-28 08:38:10 -07:00
55aa3ce1d3
Merge pull request #14 from Molecule-AI/feat/auto-promote-reusable
feat(auto-promote): extract as reusable workflow_call for org-wide adoption
2026-04-28 08:37:56 -07:00
Hongming Wang
9d67da3ef9 feat(auto-promote): extract as reusable workflow_call for org-wide adoption
Splits auto-promote-staging.yml into:

  - auto-promote-branch.yml — new reusable workflow with
    `on: workflow_call`. Inputs `from-branch` (default 'staging') and
    `to-branch` (default 'main'). Repo-agnostic: gates are read from
    the consuming repo's branch protection at run time, not hardcoded.

  - auto-promote-staging.yml — molecule-ci's own self-running flow,
    now a ~25-line wrapper that calls the reusable workflow with
    staging→main hardcoded. Trigger and behavior unchanged for
    molecule-ci itself.

Adoption pattern in any consumer repo:

    # .github/workflows/auto-promote.yml
    name: Auto-promote staging → main
    on:
      push:
        branches: [staging]
      workflow_dispatch:
    permissions:
      contents: write
      statuses: read
    jobs:
      promote:
        uses: Molecule-AI/molecule-ci/.github/workflows/auto-promote-branch.yml@main
        with:
          from-branch: staging
          to-branch: main

Excluded by policy: molecule-core + molecule-controlplane stay
manual per CEO directive 2026-04-24. Those repos do NOT adopt the
reusable workflow; the extraction adds no surface to repos that
don't call it.

Closes monorepo task #93.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 18:33:01 -07:00
Hongming Wang
73102cdaa9 feat(validate-workspace-template): strict drift gate + canonical-fetch workflow
P6 Phase 1: enforce the workspace-template contract via CI on every
template-repo push, eliminating the slow drift that produced 8
copies of a 28-line Dockerfile in different states of decay.

The previous validator (50 lines, soft warnings only) couldn't
catch the cache-trap pattern (Dockerfile missing ARG RUNTIME_VERSION)
that silently shipped the previous runtime wheel during cascade
publishes — observed five times in a row on 2026-04-27. Hardened
into structural checks that fail CI, not just warn:

  - Dockerfile must base on python:3.11-slim
  - Dockerfile must declare ARG RUNTIME_VERSION AND reference
    ${RUNTIME_VERSION} in a RUN block (the arg has to be in the
    layer's command line for docker to hash it into the cache key)
  - Dockerfile must create the agent uid-1000 user (Claude Code
    refuses --dangerously-skip-permissions as root for safety)
  - Dockerfile must end at molecule-runtime — directly via
    ENTRYPOINT or via a wrapper script that exec's it (claude-code
    has entrypoint.sh for gosu drop-priv; hermes has start.sh to
    boot the hermes-agent daemon first; both are allowed)
  - config.yaml must have name + runtime + integer
    template_schema_version. Quoted "1" fails — observed previously
    in a copy-pasted template that the YAML loader turned into str
  - requirements.txt must declare molecule-ai-workspace-runtime
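
A minimal sketch of the two cache-trap checks, assuming single-line
RUN commands; the real validator also covers the base-image, uid-1000,
and entrypoint rules:

    import re

    def check_dockerfile_cache_trap(text: str) -> list[str]:
        errors = []
        if not re.search(r'^ARG\s+RUNTIME_VERSION\b', text, re.MULTILINE):
            errors.append('Dockerfile missing ARG RUNTIME_VERSION (cache trap)')
        elif not re.search(r'^RUN\b.*\$\{RUNTIME_VERSION\}', text, re.MULTILINE):
            # The ARG has to appear in the RUN line itself so docker hashes
            # the value into that layer's cache key; a bare declaration
            # never invalidates the pip-install layer.
            errors.append('ARG RUNTIME_VERSION declared but not referenced '
                          'in any RUN block')
        return errors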

Also fixed: the original validator's warning telling adapter.py
NOT to import molecule_runtime was backwards — that's the
canonical package name post-#87. Now it warns on the legacy
molecule_ai prefix instead.

Reusable workflow change: instead of running
.molecule-ci/scripts/validate-workspace-template.py (a per-template
vendored copy that drifts as the validator evolves), the workflow
now checks out molecule-ci itself into .molecule-ci-canonical and
runs the canonical script from there. Single source of truth —
every template runs the SAME contract on every CI run. The legacy
.molecule-ci/scripts/ directories in each template repo can be
deleted in a Phase 2 cleanup PR.

14 unit tests pin the contract:
  - canonical template passes
  - claude-code-style custom entrypoint passes when the wrapper
    exec's molecule-runtime
  - 5 Dockerfile drift modes each error individually
  - 3 config.yaml drift modes each error/warn
  - requirements.txt missing-runtime errors
  - legacy molecule_ai import warns
  - regression cover: modern molecule_runtime import does NOT
    trigger the (deleted) backwards warning

All 8 production template repos pass the new contract today —
this PR locks in the current good state, it does not force any
template-repo edits.

Contract documented at docs/template-contract.md so the rules are
discoverable without reading the validator.
2026-04-27 14:50:55 -07:00
Hongming Wang
9c7f4f5542
feat(reusable): forward runtime_version as RUNTIME_VERSION build-arg (#12)
Closes the cascade cache trap that bit us 5x today. Each cascade
rebuild ran against the same Dockerfile + requirements.txt content,
producing the same docker layer cache key — so even though
publish-runtime had just shipped a new version, pip install hit the
cached layer with the OLD version.

Mechanism:
- Reusable workflow now accepts optional `runtime_version` input
- Forwarded as `--build-arg RUNTIME_VERSION=$VERSION` to docker build
- Templates that declare `ARG RUNTIME_VERSION` get cache-key
  invalidation per-version (different ARG value → different cache
  key → fresh pip install layer)
- Templates that don't declare the ARG silently ignore it (no
  breakage; phased rollout)

Pairs with molecule-core PR #2181 (PyPI propagation wait + path
filter expansion). Together: cascade waits until PyPI serves the
new version, then fires with the version, templates rebuild against
that exact version with cache invalidation. No more "I shipped
0.1.X but image installs 0.1.X-1."

Phase 2 (separate PRs in template repos): each template's caller
forwards `${{ github.event.client_payload.runtime_version }}` and
each Dockerfile declares `ARG RUNTIME_VERSION` near pip install.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:44:34 -07:00
Hongming Wang
6409d65106
docs: add disable-auto-merge-on-push to README (#11)
Documents the new reusable workflow shipped in PR #10:
- Caller pattern (~10 lines per consuming repo) under Usage
- Full description in "What each workflow validates" — explains the
  2026-04-27 motivation, the org-wide repo setting it pairs with,
  and the false-positive note for CI bot pushes

Companion to molecule-core CONTRIBUTING.md update (PR #2177) which
documents the contract from the developer's perspective. Both must
land for the safety guards to be discoverable from where teams read.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:46:40 -07:00
Hongming Wang
d5caaac219
feat(reusable): disable auto-merge when a new commit is pushed (#10)
Reusable workflow that consumers call from their pr-guards.yml on
pull_request:synchronize. When a new commit is pushed to an open PR
that has auto-merge enabled, this disables auto-merge and posts a
comment so the operator must explicitly re-engage after verifying.

Background: on 2026-04-27, PR #2174 in molecule-core auto-merged
with only the first commit because the second commit was pushed
AFTER the merge queue had locked the PR's SHA. The second commit
ended up orphaned on a merged-and-deleted branch (the wider
"automatically delete head branches" repo setting now blocks the
push entirely; this workflow catches the race window where the PR
is queued but not yet merged).

Defense in depth — if both fixes are active:
1. Repo setting "delete branch on merge" prevents pushes to a
   merged branch (post-merge orphan case).
2. This workflow catches in-queue races (push lands while the
   queue is processing) by force-disabling auto-merge so the
   operator must re-engage explicitly.

Together they cover the full lifecycle of "auto-merge enabled →
new commits arrive" without relying on operator discipline.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:38:57 -07:00
Hongming Wang
0335ec169c
feat(lint): read runtime module list from wheel manifest, not inline (#9)
Switches the bare-imports lint from an inline RUNTIME_MODULES list
to the _runtime_modules.json manifest emitted by molecule-core's
build_runtime_package.py. Eliminates the third place the runtime
module list lived — now the build script is the single source of
truth.

Tonight surfaced that the same closed list lived in three places
that drifted independently. The build script's TOP_LEVEL_MODULES
went stale on transcript_auth, the smoke-test step here had a
hardcoded mirror that would have drifted next time a top-level
module was added, and runtime-pin-compat tested transitively via
import molecule_runtime.main (which only catches breakage, not
drift). One source of truth fixes all three at once.

Implementation:
- pip download molecule-ai-workspace-runtime --no-deps to /tmp
- unzip _runtime_modules.json from the wheel
- merge top_level_modules + subpackages into the regex alternation
  (subpackages can be bare-imported too — `from lib.pre_stop`)
- on any fetch failure (network, missing manifest in older wheel),
  fall back to the inline list with a workflow warning so the lint
  still runs but the operator knows to investigate
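
A minimal sketch of the fetch-with-fallback path, assuming the
manifest keys the commit names (top_level_modules, subpackages); the
pip invocation details are illustrative:

    import json
    import re
    import subprocess
    import tempfile
    import zipfile
    from pathlib import Path

    def runtime_module_alternation(inline_fallback: list[str]) -> str:
        try:
            tmp = Path(tempfile.mkdtemp())
            subprocess.run(['pip', 'download', 'molecule-ai-workspace-runtime',
                            '--no-deps', '-d', str(tmp)], check=True)
            wheel = next(tmp.glob('*.whl'))
            with zipfile.ZipFile(wheel) as zf:
                member = next(n for n in zf.namelist()
                              if n.endswith('_runtime_modules.json'))
                manifest = json.loads(zf.read(member))
            modules = manifest['top_level_modules'] + manifest['subpackages']
        except Exception as exc:
            # Lint still runs; the operator is told to investigate.
            print(f'::warning::manifest fetch failed ({exc}); '
                  'falling back to inline module list')
            modules = inline_fallback
        return '|'.join(re.escape(m) for m in modules)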

Two consequences:
- Templates rebuilt against runtime ≥ the version that ships the
  manifest get the always-fresh list automatically.
- Templates rebuilt against the old wheel (pre-manifest) still get
  the working inline list — no regression.

Future cleanup (separate PR after a few release cycles): once all
template repos have rebuilt at least once with the manifest path,
the inline fallback can shrink to a panic message.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:19:23 -07:00
Hongming Wang
7315460ff5
feat(publish-template-image): bare-import lint + import-every-app-py smoke (#8)
Two new gates that would have prevented today's
post-#87 template-extraction bug parade:

1. **Bare-import lint** — fail-fast pre-build check that greps
   template *.py files for `from <runtime_module> import` (where
   <runtime_module> is in the closed list mirroring workspace/*.py
   basenames). When the runtime was bundled into workspace/, bare
   imports resolved against sibling files; in standalone template
   repos they explode at startup. Five separate templates shipped
   broken on 2026-04-27 because of this exact pattern (claude-code:
   plugins, executor_helpers, heartbeat, a2a_client, platform_auth;
   langgraph: agent, a2a_executor; deepagents: a2a_executor;
   gemini-cli: config, executor_helpers x2). The lint runs before
   docker login + buildx setup so a bad PR returns red in seconds.

2. **Import every /app/*.py at boot** (deeper smoke) — replaces
   `python -c "import adapter"` with a loop importing every Python
   module at /app/. The old single-import didn't traverse to
   sibling modules adapter.py imports lazily inside
   `create_executor()` (the executor.py family). That's why the
   hermes a2a-sdk migration bug and langgraph's bare a2a_executor
   import slipped through every prior gate even though the boot
   smoke "passed." Importing every module module-level forces all
   imports to resolve, including those in executor.py.
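
A minimal sketch of the boot-time loop, assuming /app is the import
root inside the image:

    import importlib
    import sys
    from pathlib import Path

    sys.path.insert(0, '/app')   # match the container's import root
    failures = []
    for path in sorted(Path('/app').glob('*.py')):
        try:
            # Importing each module directly (not just adapter) forces even
            # lazily-referenced siblings like executor.py to resolve now.
            importlib.import_module(path.stem)
        except Exception as exc:
            failures.append(f'{path.name}: {exc!r}')
    if failures:
        print('\n'.join(failures), file=sys.stderr)
        sys.exit(1)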

Both gates use the closed-list pattern (deliberate, easy to update,
no false-positives on legit third-party imports). The runtime module
list mirrors the equivalent in scripts/build_runtime_package.py;
both should be updated together when a new top-level workspace
module ships.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:33:10 -07:00
Hongming Wang
b6f43a1145
feat(publish-template-image): boot image and import adapter.py before pushing :latest (#7)
Today's incident: a template's adapter.py imported a symbol
(RuntimeCapabilities) from molecule_runtime that the published runtime
didn't yet export. The image built fine, the existing "smoke test"
inspected the entrypoint string and passed, and a broken :latest
shipped to GHCR. Every claude-code + hermes provision then hung in
"provisioning" status until the 10-min sweep marked them failed.

The old smoke test was named correctly but didn't actually exercise
anything — `docker inspect` doesn't catch ImportError. This change
splits the build/push step into three:

1. Build with `load: true, push: false` so the image lands on the
   runner's local docker.
2. Smoke test runs `docker run ... python -c "import adapter"` against
   the loaded image. This catches the version-skew class of bug
   (adapter.py imports a symbol the installed runtime doesn't export),
   plus syntax errors, missing files, and anything else that breaks
   import-time.
3. Push :latest + :sha-* only if the smoke test passes. The push step
   reuses the cached build, so it's fast.

Net cost: ~5s per publish (the docker run). Net benefit: broken images
can no longer poison :latest.

All 8 caller templates (claude-code, gemini-cli, hermes, langgraph,
crewai, autogen, deepagents, openclaw) inherit the gate automatically
since this is the reusable workflow they all call.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 02:12:13 -07:00
19 changed files with 2926 additions and 231 deletions

.gitea/actions/audit-force-merge/action.yml

@@ -0,0 +1,55 @@
name: 'Audit force-merge'
description: >-
  §SOP-6 force-merge audit. Detects PRs merged with required-status-checks
  not green at HEAD SHA and emits incident.force_merge JSON to runner
  stdout. Vector docker_logs source ships the line to Loki on
  molecule-canonical-obs (per reference_obs_stack_phase1).
# Why a composite action and not a reusable workflow:
# Gitea 1.22.6 does NOT support cross-repo `uses: org/repo/.gitea/
# workflows/X.yml@ref`. Cross-repo reusable workflows landed in
# go-gitea/gitea PR #32562 in Gitea 1.26.0 (Oct 2025). On 1.22.x the
# clone fails because act_runner mints a caller-scoped GITEA_TOKEN.
# Composite actions resolve via the actions-fetch path which works
# cross-repo on 1.22 against a public callee — that's us. Re-evaluate
# this choice when the operator host upgrades to Gitea ≥ 1.26.
inputs:
  gitea-token:
    description: >-
      PAT for sop-tier-bot (or equivalent read-only audit identity).
      Needs read:user,read:repository,read:issue scopes — admin scope
      is intentionally NOT required.
    required: true
  gitea-host:
    description: 'Gitea host'
    required: false
    default: 'git.moleculesai.app'
  repo:
    description: 'owner/name; typically ${{ github.repository }}'
    required: true
  pr-number:
    description: 'PR number; typically ${{ github.event.pull_request.number }}'
    required: true
  required-checks:
    description: >-
      Newline-separated required-status-check context names. Mirror
      of branch protection's status_check_contexts. Declared at the
      caller because /branch_protections requires admin scope which
      this audit identity intentionally does not hold (least-privilege).
      When the required-check set changes, update both branch
      protection AND this input.
    required: true
runs:
  using: composite
  steps:
    - name: Detect force-merge + emit audit event
      shell: bash
      env:
        GITEA_TOKEN: ${{ inputs.gitea-token }}
        GITEA_HOST: ${{ inputs.gitea-host }}
        REPO: ${{ inputs.repo }}
        PR_NUMBER: ${{ inputs.pr-number }}
        REQUIRED_CHECKS: ${{ inputs.required-checks }}
      run: bash "$GITHUB_ACTION_PATH/audit.sh"

View File

@ -0,0 +1,118 @@
#!/usr/bin/env bash
# audit-force-merge — detect a §SOP-6 force-merge on a closed PR, emit
# `incident.force_merge` to stdout as structured JSON.
#
# Invoked by the `audit-force-merge` composite action defined alongside
# this script (action.yml). Caller workflows fire on
# `pull_request_target: closed` and gate on `merged == true`. See
# action.yml for the supported inputs.
#
# Vector's docker_logs source picks up runner stdout; the JSON gets
# shipped to Loki on molecule-canonical-obs, indexable by event_type.
# Query example:
#
# {host="operator"} |= "event_type" |= "incident.force_merge" | json
#
# A force-merge is detected when a merged PR had at least one of the
# caller-declared required-status-check contexts in a state other than
# "success" at the PR HEAD. That's exactly what the Gitea
# force_merge:true API call lets through, so it's a faithful detector
# of the override path.
#
# Required env (set by the composite action via inputs):
# GITEA_TOKEN, GITEA_HOST, REPO, PR_NUMBER, REQUIRED_CHECKS
#
# REQUIRED_CHECKS is newline-separated context names. Declared by the
# caller (mirror of branch protection's status_check_contexts) rather
# than fetched from /branch_protections, which requires admin scope —
# the audit identity is intentionally read-only (least-privilege; see
# memory/feedback_least_privilege_via_workflow_env).
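#
# Local dry-run sketch (hypothetical values; substitute a real PAT
# and PR number):
#
#   GITEA_TOKEN=... GITEA_HOST=git.moleculesai.app \
#   REPO=molecule-ai/molecule-ci PR_NUMBER=5 \
#   REQUIRED_CHECKS='sop-tier-check / tier-check (pull_request)' \
#   bash audit.sh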
set -euo pipefail
: "${GITEA_TOKEN:?required}"
: "${GITEA_HOST:?required}"
: "${REPO:?required}"
: "${PR_NUMBER:?required}"
: "${REQUIRED_CHECKS:?required (newline-separated context names)}"
OWNER="${REPO%%/*}"
NAME="${REPO##*/}"
API="https://${GITEA_HOST}/api/v1"
AUTH="Authorization: token ${GITEA_TOKEN}"
# 1. Fetch the PR. If not merged, no-op.
PR=$(curl -sS -H "$AUTH" "${API}/repos/${OWNER}/${NAME}/pulls/${PR_NUMBER}")
MERGED=$(echo "$PR" | jq -r '.merged // false')
if [ "$MERGED" != "true" ]; then
  echo "::notice::PR #${PR_NUMBER} closed without merge — no audit emission."
  exit 0
fi

MERGE_SHA=$(echo "$PR" | jq -r '.merge_commit_sha // empty')
MERGED_BY=$(echo "$PR" | jq -r '.merged_by.login // "unknown"')
TITLE=$(echo "$PR" | jq -r '.title // ""')
BASE_BRANCH=$(echo "$PR" | jq -r '.base.ref // "main"')
HEAD_SHA=$(echo "$PR" | jq -r '.head.sha // empty')

if [ -z "$MERGE_SHA" ]; then
  echo "::warning::PR #${PR_NUMBER} merged=true but no merge_commit_sha — cannot evaluate force-merge."
  exit 0
fi

# 2. Required status checks declared in the workflow env.
REQUIRED="$REQUIRED_CHECKS"
if [ -z "${REQUIRED//[[:space:]]/}" ]; then
  echo "::notice::REQUIRED_CHECKS empty — force-merge not applicable."
  exit 0
fi

# 3. Status-check state at the PR HEAD (where checks ran). The merge
#    commit doesn't get its own checks; we evaluate the PR's last
#    commit, which is what branch protection compared against.
STATUS=$(curl -sS -H "$AUTH" \
  "${API}/repos/${OWNER}/${NAME}/commits/${HEAD_SHA}/status")
declare -A CHECK_STATE
while IFS=$'\t' read -r ctx state; do
  [ -n "$ctx" ] && CHECK_STATE[$ctx]="$state"
done < <(echo "$STATUS" | jq -r '.statuses // [] | .[] | "\(.context)\t\(.status)"')

# 4. For each required check, was it green at merge? YAML block scalars
#    (`|`) leave a trailing newline; skip blank/whitespace-only lines.
FAILED_CHECKS=()
while IFS= read -r req; do
  trimmed="${req#"${req%%[![:space:]]*}"}"          # ltrim
  trimmed="${trimmed%"${trimmed##*[![:space:]]}"}"  # rtrim
  [ -z "$trimmed" ] && continue
  state="${CHECK_STATE[$trimmed]:-missing}"
  if [ "$state" != "success" ]; then
    FAILED_CHECKS+=("${trimmed}=${state}")
  fi
done <<< "$REQUIRED"

if [ "${#FAILED_CHECKS[@]}" -eq 0 ]; then
  echo "::notice::PR #${PR_NUMBER} merged with all required checks green — not a force-merge."
  exit 0
fi

# 5. Emit structured audit event.
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
FAILED_JSON=$(printf '%s\n' "${FAILED_CHECKS[@]}" | jq -R . | jq -s .)

# Print as a single-line JSON so Vector's parse_json transform can pick
# it up cleanly from docker_logs.
jq -nc \
  --arg event_type "incident.force_merge" \
  --arg ts "$NOW" \
  --arg repo "$REPO" \
  --argjson pr "$PR_NUMBER" \
  --arg title "$TITLE" \
  --arg base "$BASE_BRANCH" \
  --arg merged_by "$MERGED_BY" \
  --arg merge_sha "$MERGE_SHA" \
  --argjson failed_checks "$FAILED_JSON" \
  '{event_type: $event_type, ts: $ts, repo: $repo, pr: $pr, title: $title,
    base_branch: $base, merged_by: $merged_by, merge_sha: $merge_sha,
    failed_checks: $failed_checks}'

echo "::warning::FORCE-MERGE detected on PR #${PR_NUMBER} by ${MERGED_BY}: ${#FAILED_CHECKS[@]} required check(s) not green at merge time."

View File

@ -0,0 +1,219 @@
name: Auto-promote branch (reusable)
# Reusable version of the auto-promote-staging workflow that lived
# directly in molecule-ci. Any repo with a `from-branch` (typically
# `staging`) → `to-branch` (typically `main`) flow can call this
# workflow to fast-forward `to-branch` whenever `from-branch` is
# strictly ahead AND all configured required-status-checks on the
# `from-branch` HEAD are green.
#
# Adoption pattern in a consumer repo:
#
#   # .github/workflows/auto-promote.yml
#   name: Auto-promote staging → main
#   on:
#     push:
#       branches: [staging]
#     workflow_dispatch:
#   permissions:
#     contents: write        # push the fast-forward to to-branch
#     statuses: read         # read commit status checks
#     administration: read   # read branch protection (REQUIRED — see below)
#   jobs:
#     promote:
#       uses: molecule-ai/molecule-ci/.github/workflows/auto-promote-branch.yml@v1
#       with:
#         from-branch: staging
#         to-branch: main
#
# Repo-agnostic by design — gates are read from the consuming repo's
# branch protection at run time, not hardcoded here.
#
# `@v1` is a moving tag pointing at the latest 1.x release of
# molecule-ci's reusable workflows (GitHub Actions convention, same
# as `actions/checkout@v4`). Breaking changes get a new `@v2` tag
# and the old `@v1` keeps working for existing consumers. Pinning to
# `@main` is also accepted for forward-compat preview but is
# unstable — any change merged here rolls out instantly to consumers
# without a release boundary.
#
# `administration: read` is REQUIRED. Without it, the branch-protection
# API returns 403 and the workflow refuses to fast-forward (fail loud),
# rather than silently degrading to bare --ff-only enforcement (which
# checks ancestry only, not test status — a green-but-flaky branch
# would ff-promote red commits). If you intentionally want no-gate
# enforcement, leave from-branch unprotected — a 404 from the API is
# treated as "no gates configured" and falls back to --ff-only safety.
#
# Excluded-by-policy repos (molecule-core + molecule-controlplane per
# CEO directive 2026-04-24) simply do not adopt this workflow; the
# reusable shape adds no surface area to repos that don't call it.
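#
# Quick local probe of the endpoint the gate step calls (a sketch;
# OWNER/REPO are placeholders and the probing token needs
# administration read on the repo):
#
#   gh api -i repos/OWNER/REPO/branches/staging/protection/required_status_checks | head -1
#   # → "HTTP/2.0 200 OK" when gates are configured; 404 unprotected; 403 missing scope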
on:
  workflow_call:
    inputs:
      from-branch:
        description: "Source branch with green CI"
        required: false
        default: staging
        type: string
      to-branch:
        description: "Target branch to fast-forward"
        required: false
        default: main
        type: string

permissions:
  contents: write
  statuses: read

jobs:
  promote:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Check required gates (if configured) on source HEAD
        id: gates
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          REPO: ${{ github.repository }}
          HEAD_SHA: ${{ github.sha }}
          FROM_BRANCH: ${{ inputs.from-branch }}
        shell: bash
        run: |
          set -euo pipefail
          # Read required gates from branch protection. Three response
          # classes, distinguished by HTTP status:
          #
          #   200 — branch protection is configured. Honor the gates.
          #   404 — branch is not protected. Legitimate "no gates";
          #         fall back to --ff-only as the sole safety net.
          #   403 — caller's GITHUB_TOKEN can't read branch protection.
          #         FAIL LOUD. The previous behavior conflated this
          #         with 404 ("api inaccessible") and silently degraded
          #         to bare --ff-only — which is ancestry-only, not
          #         test-status. A green-but-flaky branch would
          #         ff-promote red commits to the target. The fix:
          #         require the caller to add `administration: read`
          #         to its permissions block, or explicitly accept the
          #         no-gates posture by removing branch protection on
          #         the source branch.
          #
          # `gh api` exits 0 only on 2xx; non-zero on anything else.
          # We use --include to capture the HTTP status to discriminate.
          if PROTECTION_RESP=$(gh api -i "repos/${REPO}/branches/${FROM_BRANCH}/protection/required_status_checks" 2>&1); then
            HTTP_STATUS=200
          else
            HTTP_STATUS=$(echo "$PROTECTION_RESP" | grep -oE '^HTTP/[12](\.[01])? [0-9]{3}' | awk '{print $2}' | head -1)
            HTTP_STATUS=${HTTP_STATUS:-unknown}
          fi
          case "$HTTP_STATUS" in
            200)
              # Strip headers from gh -i output to get just the body.
              GATES_JSON=$(echo "$PROTECTION_RESP" | awk 'p{print} /^[[:space:]]*$/ && !p {p=1}')
              ;;
            404)
              echo "::notice::No branch protection on '${FROM_BRANCH}' — relying on --ff-only safety."
              echo "ok=true" >> "$GITHUB_OUTPUT"
              exit 0
              ;;
            403|401)
              echo "::error::Cannot read branch protection on '${FROM_BRANCH}' (HTTP ${HTTP_STATUS})."
              echo "::error::Caller's GITHUB_TOKEN lacks 'administration: read' permission."
              echo "::error::Refusing to fast-forward without explicit gate enforcement —"
              echo "::error::a silent fallback to --ff-only here would let green-but-flaky"
              echo "::error::branches promote red commits."
              echo "::error::"
              echo "::error::Fix: add to the caller workflow's permissions block:"
              echo "::error::  permissions:"
              echo "::error::    contents: write"
              echo "::error::    statuses: read"
              echo "::error::    administration: read"
              echo "::error::"
              echo "::error::Or, if you intentionally want no-gate enforcement, remove"
              echo "::error::branch protection on '${FROM_BRANCH}' so the API returns 404."
              exit 1
              ;;
            *)
              echo "::error::Unexpected HTTP status '${HTTP_STATUS}' from branch-protection API."
              echo "::error::Response (first 5 lines):"
              echo "$PROTECTION_RESP" | head -5 | sed 's/^/::error:: /'
              exit 1
              ;;
          esac
          GATES=$(echo "${GATES_JSON}" | jq -r '.contexts[]?' 2>/dev/null || true)
          if [ -z "$GATES" ]; then
            echo "::notice::Branch protection on '${FROM_BRANCH}' has zero required-status-checks contexts — relying on --ff-only safety."
            echo "ok=true" >> "$GITHUB_OUTPUT"
            exit 0
          fi
          echo "Required gates on '${FROM_BRANCH}':"
          echo "${GATES}" | sed 's/^/ - /'
          ALL_GREEN=true
          while IFS= read -r gate; do
            [ -z "$gate" ] && continue
            conclusion=$(gh api "repos/${REPO}/commits/${HEAD_SHA}/check-runs" \
              --jq "[.check_runs[] | select(.name == \"${gate}\")] | sort_by(.completed_at) | last.conclusion" \
              2>/dev/null || echo "")
            if [ -z "$conclusion" ] || [ "$conclusion" = "null" ]; then
              conclusion=$(gh api "repos/${REPO}/commits/${HEAD_SHA}/status" \
                --jq "[.statuses[] | select(.context == \"${gate}\")] | sort_by(.updated_at) | last.state" \
                2>/dev/null || echo "")
            fi
            if [ "$conclusion" != "success" ] && [ "$conclusion" != "SUCCESS" ]; then
              echo "::warning::Gate '${gate}' is '${conclusion:-missing}' on ${HEAD_SHA} — skipping promote."
              ALL_GREEN=false
            else
              echo " ✓ ${gate}: success"
            fi
          done <<< "$GATES"
          echo "ok=${ALL_GREEN}" >> "$GITHUB_OUTPUT"
      - name: Fast-forward target branch to source HEAD
        if: steps.gates.outputs.ok == 'true'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FROM_BRANCH: ${{ inputs.from-branch }}
          TO_BRANCH: ${{ inputs.to-branch }}
        shell: bash
        run: |
          set -euo pipefail
          git config user.email "actions@github.com"
          git config user.name "github-actions[bot]"
          # Source branch is what's checked out (workflow fires on push to
          # source). Can't fetch into it. Fetch target into a local target.
          git fetch origin "${TO_BRANCH}"
          git checkout -B "${TO_BRANCH}" "origin/${TO_BRANCH}"
          # Check if target is already at or ahead of source.
          if git merge-base --is-ancestor "origin/${FROM_BRANCH}" "${TO_BRANCH}" 2>/dev/null; then
            echo "${TO_BRANCH} already contains ${FROM_BRANCH}; nothing to promote."
            exit 0
          fi
          # --ff-only refuses if target has independent commits not on
          # source (divergence — hotfix direct to target). Human resolves.
          if ! git merge --ff-only "origin/${FROM_BRANCH}" 2>&1; then
            echo "::warning::${TO_BRANCH} has diverged from ${FROM_BRANCH} — refusing fast-forward. Resolve manually (likely a direct-to-${TO_BRANCH} commit exists that ${FROM_BRANCH} doesn't have)."
            exit 0
          fi
          git push origin "${TO_BRANCH}"
          echo "::notice::Promoted: ${TO_BRANCH} is now at $(git rev-parse --short HEAD)"

View File

@ -0,0 +1,262 @@
name: Auto-promote staging → main (PR-based, reusable)
# Reusable PR-based auto-promote for repos whose `main` branch has
# protection rules that require status checks "set by the expected
# GitHub apps" — direct `git push` from a workflow can't satisfy
# that, only PR merges through the merge queue can.
#
# Distinct from the simpler ff-only auto-promote in this same repo
# (auto-promote-staging.yml): that one does `git merge --ff-only` +
# direct push and only works on repos WITHOUT required-status-checks.
# This reusable workflow is for the protected-branch case.
#
# Call from each repo's .github/workflows/ via a thin wrapper:
#
#   name: Auto-promote staging → main
#   on:
#     workflow_run:
#       workflows: [CI, E2E Staging Canvas, ...]
#       types: [completed]
#     workflow_dispatch:
#       inputs:
#         force:
#           description: "Force promote (manual override)"
#           required: false
#           default: "false"
#   permissions:
#     contents: write
#     pull-requests: write
#   jobs:
#     promote:
#       uses: molecule-ai/molecule-ci/.github/workflows/auto-promote-staging-pr.yml@v1
#       with:
#         gates: "ci.yml,e2e-staging-canvas.yml,e2e-api.yml,codeql.yml"
#         force: ${{ github.event.inputs.force == 'true' }}
#       secrets: inherit
#
# IMPORTANT: the caller MUST keep the `on.workflow_run.workflows`
# display-name list in sync with the `gates` input (which uses
# workflow filenames). The reusable workflow can't validate this —
# display names and filenames are decoupled in GitHub Actions.
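#
# One way to eyeball the display-name ↔ filename mapping before
# editing either list (a sketch; assumes a gh version whose
# `workflow list` supports --json):
#
#   gh workflow list --json name,path \
#     --jq '.[] | "\(.name)\t\(.path | sub(".*/"; ""))"'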
#
# Required repo settings (one-time, in the CALLER repo):
#
# Settings → Actions → General → Workflow permissions
# → ✅ Allow GitHub Actions to create and approve pull requests
#
# Without it, every workflow run fails with:
#
# pull request create failed: GraphQL: GitHub Actions is not
# permitted to create or approve pull requests (createPullRequest)
#
# Toggle: caller repo variable AUTO_PROMOTE_ENABLED=true. Override
# via the `enabled-var` input if a different name is needed.
# When the variable is unset, the workflow logs what it would have
# done but doesn't open the PR — useful for dry-running the gate
# logic without surfacing a noisy PR while staging CI is still flaky.
on:
  workflow_call:
    inputs:
      gates:
        description: >-
          Comma-separated list of workflow FILENAMES (not display
          names) that must be conclusion=success on the staging head
          SHA before promote fires. Example:
          "ci.yml,e2e-staging-canvas.yml,codeql.yml". File paths are
          used (not display names) because gh run list with display
          names is ambiguous when two workflows share a name (observed
          2026-04-28 with codeql.yml + GitHub UI's Code-quality default
          setup both surfacing as "CodeQL").
        required: true
        type: string
      target-branch:
        description: "Target branch to promote TO (default: main)"
        required: false
        type: string
        default: main
      source-branch:
        description: "Source branch to promote FROM (default: staging)"
        required: false
        type: string
        default: staging
      enabled-var:
        description: >-
          Repo variable name that gates this workflow. Set this
          variable to "true" in the caller repo's Settings →
          Variables → Actions to enable. Defaults to
          AUTO_PROMOTE_ENABLED.
        required: false
        type: string
        default: AUTO_PROMOTE_ENABLED
      merge-method:
        description: >-
          Merge method for `gh pr merge --auto`. One of merge|squash|
          rebase. Defaults to "merge" (matches user preference for
          merge commits over squash).
        required: false
        type: string
        default: merge
      force:
        description: >-
          Skip the AUTO_PROMOTE_ENABLED variable check. Pass true
          when the caller's workflow_dispatch input is force=true.
          Default false.
        required: false
        type: boolean
        default: false

jobs:
  check-all-gates-green:
    # Only consider promotions for the source branch's push events.
    # PR runs into the source branch don't promote. workflow_dispatch
    # passes through unconditionally.
    if: >
      (github.event_name == 'workflow_run' &&
       github.event.workflow_run.head_branch == inputs.source-branch &&
       github.event.workflow_run.event == 'push')
      || github.event_name == 'workflow_dispatch'
    runs-on: ubuntu-latest
    outputs:
      all_green: ${{ steps.gates.outputs.all_green }}
      head_sha: ${{ steps.gates.outputs.head_sha }}
    steps:
      - name: Check all required gates on this SHA
        id: gates
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          HEAD_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
          REPO: ${{ github.repository }}
          GATES_CSV: ${{ inputs.gates }}
          SOURCE_BRANCH: ${{ inputs.source-branch }}
        run: |
          set -euo pipefail
          # Split the comma-separated gates input. Trim whitespace per
          # entry so callers can format readably (e.g. "ci.yml, e2e.yml").
          IFS=',' read -ra GATES <<< "$GATES_CSV"
          echo "head_sha=${HEAD_SHA}" >> "$GITHUB_OUTPUT"
          echo "Checking gates on SHA ${HEAD_SHA}"
          ALL_GREEN=true
          for gate_raw in "${GATES[@]}"; do
            gate="${gate_raw#"${gate_raw%%[![:space:]]*}"}"  # ltrim
            gate="${gate%"${gate##*[![:space:]]}"}"          # rtrim
            if [ -z "$gate" ]; then
              continue
            fi
            # Query the most recent run of this workflow on this SHA.
            # event=push to avoid picking up PR runs. branch filter
            # guards against someone dispatching the gate on a non-
            # source branch at the same SHA.
            RESULT=$(gh run list \
              --repo "$REPO" \
              --workflow "$gate" \
              --branch "$SOURCE_BRANCH" \
              --event push \
              --commit "$HEAD_SHA" \
              --limit 1 \
              --json status,conclusion \
              --jq '.[0] | "\(.status)/\(.conclusion // "none")"' \
              2>/dev/null || echo "missing/none")
            echo "  $gate → $RESULT"
            # Only completed/success counts. Anything else aborts.
            if [ "$RESULT" != "completed/success" ]; then
              ALL_GREEN=false
            fi
          done
          echo "all_green=${ALL_GREEN}" >> "$GITHUB_OUTPUT"
          if [ "$ALL_GREEN" != "true" ]; then
            echo "::notice::auto-promote: not all gates are green on ${HEAD_SHA} — staying on current ${{ inputs.target-branch }}"
          fi
  promote:
    needs: check-all-gates-green
    if: needs.check-all-gates-green.outputs.all_green == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Check rollout gate
        env:
          ENABLED_VAR_NAME: ${{ inputs.enabled-var }}
          ENABLED_VAR_VALUE: ${{ vars[inputs.enabled-var] }}
          FORCE: ${{ inputs.force }}
        run: |
          set -eu
          # Caller repo controls rollout via the named variable.
          # Default name is AUTO_PROMOTE_ENABLED; callers can override.
          if [ "${ENABLED_VAR_VALUE:-}" != "true" ] && [ "${FORCE:-false}" != "true" ]; then
            {
              echo "## ⏸ Auto-promote disabled"
              echo
              echo "Repo variable \`${ENABLED_VAR_NAME}\` is not set to \`true\`."
              echo "All gates are green on ${{ inputs.source-branch }}; would have opened a promote PR to \`${{ inputs.target-branch }}\`."
              echo
              echo "To enable: Settings → Secrets and variables → Actions → Variables → \`${ENABLED_VAR_NAME}=true\`."
              echo "To test once manually: workflow_dispatch with \`force=true\`."
            } >> "$GITHUB_STEP_SUMMARY"
            echo "::notice::auto-promote disabled — dry run only"
            exit 0
          fi
      - name: Open (or reuse) ${{ inputs.source-branch }} → ${{ inputs.target-branch }} promote PR + enable auto-merge
        if: ${{ vars[inputs.enabled-var] == 'true' || inputs.force == true }}
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          REPO: ${{ github.repository }}
          TARGET_SHA: ${{ needs.check-all-gates-green.outputs.head_sha }}
          SOURCE_BRANCH: ${{ inputs.source-branch }}
          TARGET_BRANCH: ${{ inputs.target-branch }}
          MERGE_METHOD: ${{ inputs.merge-method }}
          GATES_CSV: ${{ inputs.gates }}
        run: |
          set -euo pipefail
          # Look for an existing open promote PR (idempotent on re-run).
          # The PR's head IS the source branch — the whole point is
          # "advance target to source's tip", so we don't need a per-SHA
          # branch like auto-sync-main-to-staging.yml uses.
          PR_NUM=$(gh pr list --repo "$REPO" \
            --base "$TARGET_BRANCH" --head "$SOURCE_BRANCH" --state open \
            --json number --jq '.[0].number // ""')
          if [ -z "$PR_NUM" ]; then
            TITLE="${SOURCE_BRANCH} → ${TARGET_BRANCH}: auto-promote ${TARGET_SHA:0:7}"
            BODY_FILE=$(mktemp)
            cat > "$BODY_FILE" <<EOFBODY
          Automated promotion of \`${SOURCE_BRANCH}\` (\`${TARGET_SHA:0:8}\`) to \`${TARGET_BRANCH}\`. Required gates green at this SHA: ${GATES_CSV}.
          This PR is auto-generated by a thin caller of \`molecule-ai/molecule-ci/.github/workflows/auto-promote-staging-pr.yml\` whenever every required gate completes green on the same source-branch SHA. It exists because protected branches require status checks "set by the expected GitHub apps" — direct \`git push\` from a workflow can't satisfy that, only PR merges through the queue can.
          Merge queue lands this; no human action needed unless gates fail.
          EOFBODY
            PR_URL=$(gh pr create --repo "$REPO" \
              --base "$TARGET_BRANCH" --head "$SOURCE_BRANCH" \
              --title "$TITLE" \
              --body-file "$BODY_FILE")
            PR_NUM=$(echo "$PR_URL" | grep -oE '[0-9]+$' | tail -1)
            rm -f "$BODY_FILE"
            echo "::notice::Opened PR #${PR_NUM}"
          else
            echo "::notice::Re-using existing promote PR #${PR_NUM}"
          fi
          # Enable auto-merge — the merge queue picks it up once
          # required gates are green on the merge_group ref.
          if ! gh pr merge "$PR_NUM" --repo "$REPO" --auto --"$MERGE_METHOD" 2>&1; then
            echo "::warning::Failed to enable auto-merge on PR #${PR_NUM} — operator may need to merge manually."
          fi
          {
            echo "## ✅ Auto-promote PR opened"
            echo
            echo "- Source: \`${SOURCE_BRANCH}\` at \`${TARGET_SHA:0:8}\`"
            echo "- Target: \`${TARGET_BRANCH}\`"
            echo "- PR: #${PR_NUM}"
            echo
            echo "Merge queue lands the PR once required gates are green; no human action needed unless gates fail."
          } >> "$GITHUB_STEP_SUMMARY"

View File

@ -1,24 +1,14 @@
name: Auto-promote staging → main
# Fast-forwards `main` to `staging` when staging is strictly ahead (main
# is an ancestor). Eliminates the manual sync-PR round for non-critical
# repos.
# molecule-ci's own auto-promote: thin wrapper over the reusable
# `auto-promote-branch.yml` workflow factored out for org-wide reuse.
# Other repos consume the same reusable workflow via:
#
# Gate handling:
# - If the repo has required_status_checks configured AND the API
# returns them, all must be SUCCESS on the staging HEAD commit.
# - If no gates are configured (or the API 403s on a private free-tier
# repo), `--ff-only` is the sole safety. It refuses if main has
# independent commits staging doesn't contain.
# uses: molecule-ai/molecule-ci/.github/workflows/auto-promote-branch.yml@v1
#
# Excluded by policy: molecule-core + molecule-controlplane. Those two
# stay manual per CEO directive 2026-04-24.
#
# Safety:
# - Only fires on push to staging (PRs into staging don't promote)
# - `--ff-only` refuses if main has diverged (hotfix landed directly)
# - Promote commit goes through GITHUB_TOKEN; shows up in git log as
# a deliberate act
# Excluded by policy: molecule-core + molecule-controlplane stay
# manual per CEO directive 2026-04-24. Those repos do NOT call the
# reusable workflow.
on:
  push:
@ -26,94 +16,14 @@ on:
  workflow_dispatch:
permissions:
  contents: write
  statuses: read
  contents: write       # push the fast-forward to main
  statuses: read        # read commit status checks
  administration: read  # read branch protection (required by the
                        # reusable workflow — see its header for why)
jobs:
  promote:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Check required gates (if configured) on staging HEAD
        id: gates
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          REPO: ${{ github.repository }}
          HEAD_SHA: ${{ github.sha }}
        shell: bash
        run: |
          set -euo pipefail
          # Try to read required gates from branch protection. Free-tier
          # private repos may 403; handle that gracefully.
          GATES_JSON=$(gh api "repos/${REPO}/branches/staging/protection/required_status_checks" 2>/dev/null || echo '{}')
          GATES=$(echo "${GATES_JSON}" | jq -r '.contexts[]?' 2>/dev/null || true)
          if [ -z "$GATES" ]; then
            echo "No required gates configured (or API inaccessible). Relying on --ff-only safety."
            echo "ok=true" >> "$GITHUB_OUTPUT"
            exit 0
          fi
          echo "Required gates on staging:"
          echo "${GATES}" | sed 's/^/ - /'
          ALL_GREEN=true
          while IFS= read -r gate; do
            [ -z "$gate" ] && continue
            conclusion=$(gh api "repos/${REPO}/commits/${HEAD_SHA}/check-runs" \
              --jq "[.check_runs[] | select(.name == \"${gate}\")] | sort_by(.completed_at) | last.conclusion" \
              2>/dev/null || echo "")
            if [ -z "$conclusion" ] || [ "$conclusion" = "null" ]; then
              conclusion=$(gh api "repos/${REPO}/commits/${HEAD_SHA}/status" \
                --jq "[.statuses[] | select(.context == \"${gate}\")] | sort_by(.updated_at) | last.state" \
                2>/dev/null || echo "")
            fi
            if [ "$conclusion" != "success" ] && [ "$conclusion" != "SUCCESS" ]; then
              echo "::warning::Gate '${gate}' is '${conclusion:-missing}' on ${HEAD_SHA} — skipping promote."
              ALL_GREEN=false
            else
              echo " ✓ ${gate}: success"
            fi
          done <<< "$GATES"
          echo "ok=${ALL_GREEN}" >> "$GITHUB_OUTPUT"
      - name: Fast-forward main to staging
        if: steps.gates.outputs.ok == 'true'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        shell: bash
        run: |
          set -euo pipefail
          git config user.email "actions@github.com"
          git config user.name "github-actions[bot]"
          # staging is the checked-out branch (workflow fires on push to
          # staging). Can't fetch into it. Fetch main into a local main.
          git fetch origin main
          git checkout -B main origin/main
          # Check if main is already at or ahead of origin/staging.
          if git merge-base --is-ancestor origin/staging main 2>/dev/null; then
            echo "main already contains staging; nothing to promote."
            exit 0
          fi
          # --ff-only refuses if main has independent commits not on
          # staging (divergence — hotfix direct to main). Human resolves.
          if ! git merge --ff-only origin/staging 2>&1; then
            echo "::warning::main has diverged from staging — refusing fast-forward. Resolve manually (likely a direct-to-main commit exists that staging doesn't have)."
            exit 0
          fi
          git push origin main
          echo "::notice::Promoted: main is now at $(git rev-parse --short HEAD)"
    uses: ./.github/workflows/auto-promote-branch.yml
    with:
      from-branch: staging
      to-branch: main

View File

@ -0,0 +1,53 @@
name: Disable auto-merge on push
# Reusable guard against the "I enabled auto-merge then pushed more
# commits" race. Background: on 2026-04-27, PR #2174 in molecule-core
# auto-merged with only the first commit because the second commit
# was pushed AFTER the merge queue had already locked the PR's SHA.
# The second commit ended up orphaned on a merged-and-deleted branch.
#
# Mechanism: on every `pull_request: synchronize` event (= new commit
# pushed to an open PR), check if auto-merge is enabled. If yes,
# disable it and post a comment. This forces the operator to
# re-engage `gh pr merge --auto` after the new push, with the
# re-engagement acting as the verification step.
#
# Call from each repo's .github/workflows/ via a thin wrapper:
#
#   name: pr-guards
#   on:
#     pull_request:
#       types: [synchronize]
#   permissions:
#     pull-requests: write
#   jobs:
#     disable-auto-merge-on-push:
#       uses: molecule-ai/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@v1
#
# False-positive behavior: if a CI bot pushes (e.g. dependency-update
# rebase, secret rotation), this also disables auto-merge for that
# PR. That's acceptable — the operator who originally enabled
# auto-merge gets notified and re-engages, which is exactly the
# verify-after-machine-edits behavior we want.
on:
  workflow_call:

jobs:
  guard:
    name: Disable auto-merge on push
    runs-on: ubuntu-latest
    if: github.event.pull_request.auto_merge != null
    permissions:
      pull-requests: write
    steps:
      - name: Disable auto-merge
        env:
          GH_TOKEN: ${{ github.token }}
          PR: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
          NEW_SHA: ${{ github.event.pull_request.head.sha }}
        run: |
          set -eu
          gh pr merge "$PR" --disable-auto -R "$REPO" || true
          gh pr comment "$PR" -R "$REPO" --body "🔒 Auto-merge disabled — new commit (\`${NEW_SHA:0:7}\`) pushed after auto-merge was enabled. The merge queue locks SHAs at entry, so subsequent pushes can race. Verify the new commit and re-enable with \`gh pr merge --auto\`."

View File

@ -1,6 +1,6 @@
name: Publish Workspace Template Image
# Reusable workflow for every Molecule-AI/molecule-ai-workspace-template-*
# Reusable workflow for every molecule-ai/molecule-ai-workspace-template-*
# repo. Builds the template's Dockerfile on main and pushes to GHCR as
# `ghcr.io/molecule-ai/workspace-template-<runtime>:latest` (plus a
# per-commit `sha-<7>` tag). Auto-derives <runtime> from the caller repo
@ -17,7 +17,7 @@ name: Publish Workspace Template Image
#     packages: write
#   jobs:
#     publish:
#       uses: Molecule-AI/molecule-ci/.github/workflows/publish-template-image.yml@main
#       uses: molecule-ai/molecule-ci/.github/workflows/publish-template-image.yml@v1
#       secrets: inherit
#
# Runner choice (2026-04-22): ubuntu-latest
@ -40,6 +40,19 @@ on:
        required: false
        type: string
        default: ""
      runtime_version:
        description: >-
          molecule-ai-workspace-runtime version to install. Forwarded
          as RUNTIME_VERSION docker build-arg. When unset, the
          Dockerfile's requirements.txt pin is used. Cascade-triggered
          builds forward client_payload.runtime_version here so each
          rebuild has a unique build-arg → unique cache key →
          guaranteed fresh `pip install`. Solves the
          "cascade rebuilt but image still has old runtime" cache
          trap that bit us repeatedly on 2026-04-27.
        required: false
        type: string
        default: ""
    outputs:
      image:
        description: "Full image reference that was pushed (with :latest tag)"
@ -90,6 +103,64 @@ jobs:
echo "sha=${SHA}" >> "$GITHUB_OUTPUT"
echo "::notice::Publishing runtime='${RUNTIME}' → ${IMAGE}:latest + :sha-${SHA}"
- name: Lint — no bare imports of runtime modules
# Templates that bare-import a workspace/ runtime module
# (e.g. `from plugins import load_plugins` instead of
# `from molecule_runtime.plugins import load_plugins`) work in
# the monorepo's bundled-runtime layout but explode at startup
# with `ModuleNotFoundError` once the runtime is installed as a
# package. This bit claude-code (5 imports), langgraph,
# deepagents, and gemini-cli on 2026-04-27 — each one a
# separate workspace-stuck-in-provisioning incident.
#
# Source of truth: molecule_runtime/_runtime_modules.json
# inside the published wheel (emitted by
# scripts/build_runtime_package.py). Pulling the manifest
# from PyPI's latest wheel ensures the lint never drifts from
# the rewriter's actual closed list. If the manifest can't be
# fetched (older wheel, PyPI down, etc.), falls back to the
# inline list — known to be correct as of 2026-04-27 — so
# the lint never silently passes on a fetch failure.
#
# Fail-fast: this runs before docker login + buildx setup so
# a bad PR returns red in seconds, not minutes.
shell: bash
run: |
set -eu
# Fallback list — used only when the manifest fetch fails.
# Mirrors scripts/build_runtime_package.py:TOP_LEVEL_MODULES
# at the time this comment was written.
FALLBACK_MODULES='plugins|adapter_base|config|main|preflight|prompt|coordinator|consolidation|events|heartbeat|transcript_auth|runtime_wedge|watcher|skill_loader|policies|adapters|builtin_tools|executor_helpers|a2a_executor|a2a_client|a2a_tools|a2a_cli|a2a_mcp_server|agent|agents_md|initial_prompt|molecule_ai_status|platform_auth|shared_runtime'
RUNTIME_MODULES=""
mkdir -p /tmp/runtime-wheel
if pip download --quiet molecule-ai-workspace-runtime --no-deps -d /tmp/runtime-wheel 2>/dev/null; then
WHEEL=$(ls /tmp/runtime-wheel/*.whl 2>/dev/null | head -1)
if [ -n "$WHEEL" ]; then
# Pull both top_level + subpackage names; both can be bare-imported.
RUNTIME_MODULES=$(unzip -p "$WHEEL" molecule_runtime/_runtime_modules.json 2>/dev/null \
| python3 -c "import sys,json; m=json.load(sys.stdin); print('|'.join(sorted(set(m['top_level_modules']) | set(m['subpackages']))))" 2>/dev/null || echo "")
fi
fi
if [ -n "$RUNTIME_MODULES" ]; then
echo "::notice::lint module list pulled from molecule-ai-workspace-runtime wheel manifest"
else
RUNTIME_MODULES="$FALLBACK_MODULES"
echo "::warning::could not read _runtime_modules.json from PyPI wheel — using inline fallback list"
fi
# Match `from <module> import` at start of line OR after any whitespace
# (function-scope imports inside if/try blocks count too).
if HITS=$(grep -nE "^\s*from (${RUNTIME_MODULES}) import" *.py 2>/dev/null); then
echo "::error::Bare imports of runtime modules found — must use \`from molecule_runtime.<module> import\`"
echo "$HITS" | sed 's/^/ /'
echo "::error::Fix: prefix each match with 'molecule_runtime.' (e.g. 'from plugins' → 'from molecule_runtime.plugins')."
exit 1
fi
echo "::notice::✓ no bare imports of runtime modules in template *.py files"
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
@ -100,7 +171,213 @@ jobs:
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build & push template image to GHCR
      - name: Build template image (load for smoke test, do not push yet)
        # Build into the runner's local docker first so the smoke test can
        # actually boot the image. We push :latest + :sha-* only AFTER the
        # smoke test passes — this is the gate that prevents broken images
        # from poisoning :latest. Background: 2026-04-27 outage where the
        # template's adapter.py imported a symbol (RuntimeCapabilities)
        # that the published runtime didn't yet export. The old smoke
        # test only inspected the entrypoint string, so the broken image
        # shipped to GHCR and every workspace provision hung.
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/amd64
          load: true
          push: false
          tags: ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          # RUNTIME_VERSION is empty by default. When the cascade fires
          # (or workflow_dispatch is invoked with a version), it's the
          # exact runtime version about to be installed. Forwarded as a
          # build-arg so Dockerfiles that declare `ARG RUNTIME_VERSION`
          # get cache-key invalidation per-version. Templates that
          # don't declare the ARG silently ignore it (no breakage).
          build-args: |
            RUNTIME_VERSION=${{ inputs.runtime_version }}
          labels: |
            org.opencontainers.image.source=https://github.com/${{ github.repository }}
            org.opencontainers.image.revision=${{ github.sha }}
            org.opencontainers.image.description=Molecule AI workspace template — ${{ steps.tags.outputs.runtime }} runtime
      - name: Smoke test — boot image and import every /app/*.py
        # The real boot test. Imports every Python module at /app/ inside
        # the image, which exercises:
        #   - adapter.py exists, no syntax errors, all module-level
        #     imports resolve against the pip-installed runtime version
        #     (catches version skew — symbol added to runtime but PyPI
        #     not yet republished, etc.)
        #   - executor.py / cli_executor.py / claude_sdk_executor.py /
        #     etc. — sibling modules adapter.py imports lazily inside
        #     create_executor(). Plain `import adapter` doesn't catch
        #     bugs there because they're behind `def create_executor`.
        #     This bit hermes (a2a-sdk migration) and langgraph
        #     (LangGraphA2AExecutor bare import) on 2026-04-27.
        #   - cross-cutting: any bare `from <runtime_module>` (the lint
        #     above catches these statically; this catches them at
        #     resolution time too, plus any imports of third-party
        #     packages that the lint can't reason about).
        # We bypass the gosu/agent entrypoint with --entrypoint sh
        # because import smoke doesn't need workspace permissions.
        shell: bash
        env:
          IMAGE: ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
        run: |
          set -eu
          docker run --rm --entrypoint sh "${IMAGE}" -c '
            set -e
            cd /app
            for f in *.py; do
              [ "$f" = "__init__.py" ] && continue
              mod="${f%.py}"
              python3 -c "import $mod" || { echo "::error::failed to import $mod"; exit 1; }
              echo " ✓ $mod"
            done
          '
          echo "::notice::✓ ${IMAGE} all /app/*.py modules import cleanly against installed runtime"
      - name: Boot smoke — execute() against stub deps (#2275, task #131)
        # The static import smoke above only IMPORTs /app/*.py — lazy
        # imports buried inside `async def execute(...)` bodies (e.g.
        # `from a2a.types import FilePart`) NEVER evaluate at static-
        # import time. The 2026-04-2x v0→v1 a2a-sdk migration shipped 5
        # such regressions in templates that all looked fine at module-
        # load smoke (claude-code, langgraph, deepagents, gemini-cli,
        # hermes — every one a separate provisioning incident).
        #
        # This step boots the image with MOLECULE_SMOKE_MODE=1, which
        # routes molecule-runtime through smoke_mode.run_executor_smoke()
        # — invokes executor.execute(stub_ctx, stub_queue) once with a
        # short timeout. Healthy import tree → execution proceeds far
        # enough to hit a network boundary and times out (exit 0).
        # Broken lazy import → ImportError/ModuleNotFoundError from
        # inside the executor body (exit 1).
        #
        # Universal turn-smoke (task #131): run_executor_smoke also
        # consults runtime_wedge.is_wedged() at the end of every result
        # path and upgrades a provisional PASS to FAIL when an adapter
        # marked the runtime wedged. Catches PR-25-class regressions
        # (claude-agent-sdk init wedge from a malformed CLI argv) where
        # the SDK takes 60s to time out on `initialize()` — the outer
        # wait_for must outlast that handshake so the adapter's wedge
        # catch arm runs before the smoke gives up. That's why the
        # smoke timeout is 90s (NOT the original 10s) and the outer
        # `timeout` wrapper is 120s (NOT 60s). Lowering either back
        # makes this gate blind to init-wedge bugs again — confirm with
        # an injected wedge in test_smoke_mode.py before changing.
        #
        # Requires runtime >= 0.1.60 (the version that introduced
        # smoke_mode). Older runtimes silently no-op and would hang on
        # uvicorn, so we detect the module first and skip if absent —
        # this lets templates pinned to older runtimes continue to
        # publish without this gate flipping red, while every fresh
        # cascade-triggered build (which forwards the just-published
        # version as RUNTIME_VERSION) gets the gate automatically.
        #
        # Wrapped in `timeout` as a belt-and-suspenders safety net in
        # case smoke_mode itself wedges — runner shouldn't hang
        # indefinitely on a single template.
        shell: bash
        env:
          IMAGE: ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
        run: |
          set -eu
          HAS_SMOKE_MODE=$(docker run --rm --entrypoint sh "${IMAGE}" -c \
            'python3 -c "import molecule_runtime.smoke_mode" >/dev/null 2>&1 && echo yes || echo no')
          if [ "${HAS_SMOKE_MODE}" = "no" ]; then
            echo "::warning::installed runtime predates molecule-core#2275 (no molecule_runtime.smoke_mode); skipping boot smoke. Bump requirements.txt to molecule-ai-workspace-runtime>=0.1.60 to enable."
            exit 0
          fi
          if [ ! -f config.yaml ]; then
            echo "::error::config.yaml not found at repo root — boot smoke needs it to populate /configs. Templates without a config.yaml at root cannot be boot-smoked; either add one or skip this gate by setting an old runtime pin."
            exit 1
          fi
          # Mount the repo's own config.yaml at /configs so the runtime
          # can reach create_executor() — that's where the lazy imports
          # we want to test actually live. The image's entrypoint drops
          # priv from root to agent (uid 1000) before exec'ing
          # molecule-runtime, so /configs needs to be readable AND
          # traversable from uid 1000.
          #
          # Use `a+rX` (capital X — only adds x where it's already
          # executable, i.e. directories): mktemp -d creates the dir
          # with mode 700, so a bare `go+r` would leave the dir
          # un-traversable for agent and config.py would
          # PermissionError on `Path('/configs/config.yaml').exists()`.
          # Mount RW (not :ro) so the entrypoint's `chown -R agent
          # /configs` succeeds — its silent chown failure on a :ro
          # mount was the original symptom.
          SMOKE_CONFIG_DIR=$(mktemp -d)
          cp config.yaml "${SMOKE_CONFIG_DIR}/"
          chmod -R a+rX "${SMOKE_CONFIG_DIR}"
          # Stub credentials — adapters validate shape at create_executor
          # time but the smoke times out before any real call goes out.
          # Set the common ones so any adapter that early-validates a
          # specific key sees a non-empty value.
          #
          # PYTHONPATH=/app mirrors what the platform's provisioner
          # injects at workspace startup (workspace-server/internal/
          # provisioner/provisioner.go:563). Without it,
          # `importlib.import_module('adapter')` in the runtime's
          # preflight check fails with ModuleNotFoundError because
          # molecule-runtime is a console_scripts entry point —
          # sys.path[0] is /usr/local/bin, NOT /app. The existing
          # static import smoke step above doesn't hit this because
          # `python3 -c "import $mod"` adds cwd to sys.path; only the
          # entry-point invocation needs PYTHONPATH.
          set +e
          # MOLECULE_SMOKE_TIMEOUT_SECS=90 is calibrated to outlast
          # claude-agent-sdk's 60s initialize() handshake (see step
          # comment above + workspace/smoke_mode.py top docstring) so
          # adapter wedge catch arms run before run_executor_smoke
          # gives up. Outer `timeout 120` is the runner-level safety
          # net — slightly longer than the inner timeout so a hung
          # smoke_mode itself surfaces as exit 124 and gets a clear
          # error message instead of just `exit 1`.
          timeout 120 docker run --rm \
            -v "${SMOKE_CONFIG_DIR}:/configs" \
            -e WORKSPACE_ID=fake-smoke \
            -e PYTHONPATH=/app \
            -e MOLECULE_SMOKE_MODE=1 \
            -e MOLECULE_SMOKE_TIMEOUT_SECS=90 \
            -e CLAUDE_CODE_OAUTH_TOKEN=sk-fake-smoke-token \
            -e ANTHROPIC_API_KEY=sk-fake-smoke-key \
            -e GEMINI_API_KEY=fake-smoke-key \
            -e OPENAI_API_KEY=sk-fake-smoke-key \
            "${IMAGE}"
          rc=$?
          set -e
          # Cleanup is best-effort: the entrypoint chowns /configs to
          # uid 1000 (agent) inside the container, which propagates to
          # the host bind-mount, leaving the runner user unable to
          # remove the files. Fall back to `sudo rm` and ignore any
          # remaining failure — the runner is ephemeral, /tmp is
          # cleaned automatically post-job.
          rm -rf "${SMOKE_CONFIG_DIR}" 2>/dev/null \
            || sudo rm -rf "${SMOKE_CONFIG_DIR}" 2>/dev/null \
            || true
          if [ "${rc}" -eq 124 ]; then
            echo "::error::boot smoke wedged past 120s — smoke_mode itself failed to terminate (look for blocking calls before MOLECULE_SMOKE_TIMEOUT_SECS fires)"
            exit 1
          fi
          if [ "${rc}" -ne 0 ]; then
            echo "::error::boot smoke failed (exit ${rc}) — executor.execute() raised an import error OR an adapter marked runtime_wedge.is_wedged() (PR-25-class init wedge). Check the container log above for the offending lazy import or wedge reason."
            exit "${rc}"
          fi
          echo "::notice::✓ ${IMAGE} executor.execute() smoke passed (imports healthy, no runtime wedge)"
      - name: Push image to GHCR (post-smoke)
        # Now that the smoke test passed, push both tags. build-push-action
        # reuses the cached build from the load step above, so this is fast
        # — it's effectively a layer push, not a rebuild. Same build-args
        # passed for cache key consistency.
        uses: docker/build-push-action@v6
        with:
          context: .
@ -112,24 +389,9 @@ jobs:
            ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            RUNTIME_VERSION=${{ inputs.runtime_version }}
          labels: |
            org.opencontainers.image.source=https://github.com/${{ github.repository }}
            org.opencontainers.image.revision=${{ github.sha }}
            org.opencontainers.image.description=Molecule AI workspace template — ${{ steps.tags.outputs.runtime }} runtime
      - name: Smoke test the pushed image
        # Pull the tag we just pushed and verify the entrypoint is set.
        # Catches "image pushed but binary missing" regressions without a
        # full end-to-end provision test. We don't `docker run` — most
        # templates need platform env (WORKSPACE_ID, PLATFORM_URL, etc.)
        # to actually boot, so inspection is the right layer here.
        shell: bash
        env:
          IMAGE: ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
        run: |
          set -eu
          docker pull "${IMAGE}"
          docker inspect "${IMAGE}" --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' \
            | tee /dev/stderr \
            | grep -qE '.' || { echo "::error::Image has empty entrypoint+cmd"; exit 1; }
          echo "::notice::✓ ${IMAGE} pulled and entrypoint verified"

View File

@ -9,13 +9,23 @@ jobs:
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      # Canonical validator script lives in molecule-ci, fetched fresh on
      # every run. The previous setup expected `.molecule-ci/scripts/` to
      # be vendored INTO each org-template repo, which drifted across the
      # 5 org-template repos as the validator evolved. Single source of
      # truth eliminates that drift class entirely. Mirrors the same
      # pattern already used by validate-workspace-template.yml.
      # Direct git-clone — see validate-plugin.yml for the rationale.
      # Anonymous fetch of public molecule-ci, no actions/checkout idiosyncrasies.
      - name: Fetch molecule-ci canonical scripts
        run: git clone --depth 1 https://git.moleculesai.app/molecule-ai/molecule-ci.git .molecule-ci-canonical
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
          cache-dependency-path: .molecule-ci/scripts/requirements.txt
          cache-dependency-path: .molecule-ci-canonical/.molecule-ci/scripts/requirements.txt
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci/scripts/validate-org-template.py
      - run: python3 .molecule-ci-canonical/.molecule-ci/scripts/validate-org-template.py
      - name: Check for secrets
        run: |
          python3 - << 'PYEOF'
@ -32,7 +42,7 @@ jobs:
              re.compile(r'''ghp_[a-zA-Z0-9]{36,}'''),
              re.compile(r'''sk-ant-[a-zA-Z0-9]{50,}'''),
          ]
          SKIP_DIRS = {'.molecule-ci', '.git', 'node_modules', '__pycache__'}
          SKIP_DIRS = {'.molecule-ci', '.molecule-ci-canonical', '.git', 'node_modules', '__pycache__'}
          EXTENSIONS = {'.yaml', '.yml', '.md', '.py', '.sh'}
          def is_false_positive(line):

View File

@ -9,13 +9,32 @@ jobs:
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      # Canonical validator script lives in molecule-ci, fetched fresh on
      # every run. The previous setup expected `.molecule-ci/scripts/` to
      # be vendored INTO each plugin repo, which drifted across the
      # 20+ plugin repos as the validator evolved. Single source of
      # truth eliminates that drift class entirely. Mirrors the same
      # pattern already used by validate-workspace-template.yml.
      # Direct git-clone instead of actions/checkout@v4 because:
      #   (a) actions/checkout@v4 sends Authorization: basic <github.token> by default,
      #       and Gitea 404s the cross-repo authenticated request (different from
      #       GitHub which falls back to anon-public-read).
      #   (b) Passing token: '' triggers actions/checkout's runtime "Input required
      #       and not supplied: token" error — the input is documented as
      #       required:false but the action's runtime calls getInput with
      #       required:true on its auth-helper path.
      # Anonymous git clone of public molecule-ci has neither problem.
      # See molecule-ci#1 (lowercase fix) + #2 (token:'' attempt) +
      # the post-merge CI run on plugin-molecule-careful-bash@663bf72.
      - name: Fetch molecule-ci canonical scripts
        run: git clone --depth 1 https://git.moleculesai.app/molecule-ai/molecule-ci.git .molecule-ci-canonical
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
          cache-dependency-path: .molecule-ci/scripts/requirements.txt
          cache-dependency-path: .molecule-ci-canonical/.molecule-ci/scripts/requirements.txt
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci/scripts/validate-plugin.py
      - run: python3 .molecule-ci-canonical/.molecule-ci/scripts/validate-plugin.py
      - name: Check for secrets
        run: |
          python3 - << 'PYEOF'
@ -32,7 +51,7 @@ jobs:
              re.compile(r'''ghp_[a-zA-Z0-9]{36,}'''),
              re.compile(r'''sk-ant-[a-zA-Z0-9]{50,}'''),
          ]
          SKIP_DIRS = {'.molecule-ci', '.git', 'node_modules', '__pycache__'}
          SKIP_DIRS = {'.molecule-ci', '.molecule-ci-canonical', '.git', 'node_modules', '__pycache__'}
          EXTENSIONS = {'.yaml', '.yml', '.md', '.py', '.sh'}
          def is_false_positive(line):

View File

@ -2,23 +2,66 @@ name: Validate Workspace Template
on:
  workflow_call:

# Defense-in-depth on the GITHUB_TOKEN scope. This workflow runs
# untrusted-by-design code from the calling template repo — pip
# installs the template's requirements.txt (post-install hooks),
# imports adapter.py, and `docker build`s the Dockerfile (RUN
# steps). Each of those primitives can execute arbitrary code with
# the token in env. Pinning `contents: read` means the worst a
# malicious template PR can do with the token is read public repo
# state — no write to issues, no push to branches, no comment-spam,
# no workflow re-trigger.
#
# Fork-PR lockdown (#135): the workflow splits into two jobs:
#
#   validate-static  — file-content checks only (secret scan, YAML
#                      parse, AST inspection of adapter.py without
#                      import). Always runs, including external fork
#                      PRs. Safe because no third-party code executes.
#
#   validate-runtime — pip install requirements.txt + import
#                      adapter.py + docker build. SKIPPED on fork
#                      PRs because each step is arbitrary code
#                      execution from the template repo's perspective.
#                      Internal PRs and post-merge runs still get
#                      the full coverage.
#
# What this prevents: a malicious external PR can no longer
# crypto-mine on the runner, DNS-exfiltrate runner metadata, or
# attempt to read GitHub-Actions internal env via a setup.py
# postinstall hook. They still get static feedback (secret scan
# is the most important security check anyway).
#
# What this does NOT prevent: malicious template metadata that
# passes static checks. The runtime job catches those once the PR
# merges (or an internal contributor reposts the change), at which
# point branch protection on staging/main blocks the merge if
# runtime validation fails.
permissions:
  contents: read
jobs:
  validate:
    name: Template validation
  validate-static:
    name: Template validation (static)
    runs-on: ubuntu-latest
    timeout-minutes: 15
    timeout-minutes: 5
    steps:
      # Calling template repo (Dockerfile + config.yaml + adapter.py).
      - uses: actions/checkout@v4
      # Canonical validator script lives in molecule-ci, fetched fresh on
      # every run. The previous setup expected `.molecule-ci/scripts/` to
      # be vendored INTO each template repo, which drifted across the 8
      # template repos as the validator evolved. Single source of truth
      # eliminates that drift class entirely — every template runs the
      # same canonical contract check on every CI run.
      # Direct git-clone — see validate-plugin.yml for the rationale.
      # Anonymous fetch of public molecule-ci, no actions/checkout idiosyncrasies.
      - name: Fetch molecule-ci canonical scripts
        run: git clone --depth 1 https://git.moleculesai.app/molecule-ai/molecule-ci.git .molecule-ci-canonical
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
          cache-dependency-path: .molecule-ci/scripts/requirements.txt
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci/scripts/validate-workspace-template.py
      - name: Docker build smoke test
        if: hashFiles('Dockerfile') != ''
        run: docker build -t template-test . --no-cache 2>&1 | tail -5 && echo "✓ Docker build succeeded"
      # Secret scan — the most important check. Always runs.
      - name: Check for secrets
        run: |
          python3 - << 'PYEOF'
@ -68,3 +111,100 @@ jobs:
          else:
              print("::notice::No secrets detected")
          PYEOF
      # Static-only validator — file existence checks, YAML parse,
      # AST inspection of adapter.py (no import). Doesn't execute
      # any third-party code; safe on fork PRs.
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci-canonical/scripts/validate-workspace-template.py --static-only
  validate-runtime:
    name: Template validation (runtime)
    runs-on: ubuntu-latest
    timeout-minutes: 15
    needs: validate-static
    # Skip when the PR comes from a fork — those are external,
    # untrusted, and would let attackers run pip install / docker
    # build / adapter.py import on our runner. Internal PRs (head
    # repo == base repo, fork == false) and push events to internal
    # branches both keep full coverage.
    #
    # github.event.pull_request.head.repo.fork is null for non-PR
    # events (push, schedule, etc.) — defaults to running.
    if: github.event.pull_request.head.repo.fork != true
    steps:
      - uses: actions/checkout@v4
      # Direct git-clone — see validate-plugin.yml for the rationale.
      # Anonymous fetch of public molecule-ci, no actions/checkout idiosyncrasies.
      - name: Fetch molecule-ci canonical scripts
        run: git clone --depth 1 https://git.moleculesai.app/molecule-ai/molecule-ci.git .molecule-ci-canonical
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          # Cache pip against the calling repo's own requirements.txt
          # (the file we install one step below). Pointing the cache key
          # at the validator's own deps was decorative — pyyaml never
          # changes, so the key never invalidated even when the template
          # added a heavy dep like crewai.
          cache: "pip"
          cache-dependency-path: requirements.txt
      - run: pip install pyyaml -q
      # Install the template's runtime dependencies so the validator's
      # `check_adapter_runtime_load()` can import adapter.py the same way
      # the workspace container does at boot. Without this, a
      # syntactically-valid adapter that ImportErrors on a missing
      # transitive dep would build clean and crash on first user prompt.
      # The fallback (no requirements.txt) installs the runtime alone so
      # BaseAdapter is at least importable for the class-discovery check.
      - if: hashFiles('requirements.txt') != ''
        run: pip install -q -r requirements.txt
      - if: hashFiles('requirements.txt') == ''
        run: pip install -q molecule-ai-workspace-runtime
      # Full validator — includes adapter.py import (exec_module).
      - run: python3 .molecule-ci-canonical/scripts/validate-workspace-template.py
      - name: Docker build smoke test
        if: hashFiles('Dockerfile') != ''
        run: docker build -t template-test . --no-cache 2>&1 | tail -5 && echo "✓ Docker build succeeded"
# Aggregator that emits a single `Template validation` check name —
# the caller's job (`validate:` in each template's ci.yml) plus this
# job's name produces `validate / Template validation`, which is what
# template-repo branch protection has historically required.
#
# Why it's needed: the workflow was refactored from one job into
# validate-static + validate-runtime (with matrix-suffixed display
# names) for fork-PR security. The matrix names never match the
# original required-check name, so PR auto-merge silently hung in
# BLOCKED forever on every template repo (caught while shipping
# fixes for the boot-smoke gate, openclaw#11 + hermes#29).
#
# `if: always()` so it reports out even when validate-static fails —
# without that, GitHub marks the aggregator as SKIPPED and branch
# protection still blocks because the required check never reports
# a final state.
#
# Fork-PR semantics: validate-runtime is intentionally skipped on
# fork PRs (security gate). Treat `skipped` as a pass for the
# aggregator on forks so static-only coverage doesn't make every
# external PR un-mergeable.
  template-validation:
    name: Template validation
    runs-on: ubuntu-latest
    needs: [validate-static, validate-runtime]
    if: always()
    timeout-minutes: 1
    steps:
      - name: Aggregate
        run: |
          static="${{ needs.validate-static.result }}"
          runtime="${{ needs.validate-runtime.result }}"
          echo "validate-static: $static"
          echo "validate-runtime: $runtime"
          if [ "$static" != "success" ]; then
            echo "::error::validate-static did not succeed: $static"
            exit 1
          fi
          if [ "$runtime" != "success" ] && [ "$runtime" != "skipped" ]; then
            echo "::error::validate-runtime did not succeed: $runtime"
            exit 1
          fi
          echo "::notice::Template validation aggregate passed (static=$static, runtime=$runtime)"

9
.gitignore vendored
View File

@ -19,3 +19,12 @@
# Workspace auth tokens
.auth-token
.auth_token
# Python bytecode + caches — never commit. Generated by every test run.
__pycache__/
*.pyc
*.pyo
*.pyd
.pytest_cache/
.mypy_cache/
.ruff_cache/

View File

@ -2,19 +2,47 @@
"""Validate a Molecule AI org template repo."""
import os, sys, yaml
# Support !include and other custom YAML tags used by org templates.
# These resolve at platform load time, not at validation time — we just
# need to parse past them without crashing.
# Support custom YAML tags used by org templates. Two shapes:
#
# - `!include teams/pm.yaml` → scalar string referencing another YAML
#   file in the same repo. Platform inlines at load time.
#
# - `!external\n  repo: ...\n  ref: ...\n  path: ...` → mapping
#   referencing a workspace tree to fetch from another repo. Platform
#   fetches into a content-addressable cache at load time
#   (internal#77 / molecule-core#105).
#
# Both shapes resolve at platform load time, not at validation time.
# The validator treats them as opaque references — it does NOT chase
# them down. We mark each parsed value with a sentinel subtype so the
# `validate_workspace` walk knows to skip them rather than tripping
# the "missing 'name'" branch.
class IncludeRef(str):
"""`!include path/to.yaml` — opaque reference, skipped by validator."""
class ExternalRef(dict):
"""`!external` mapping — opaque reference, skipped by validator."""
class PermissiveLoader(yaml.SafeLoader):
pass
def _include_constructor(loader, node):
return IncludeRef(loader.construct_scalar(node))
def _external_constructor(loader, node):
return ExternalRef(loader.construct_mapping(node))
def _generic_constructor(loader, tag_suffix, node):
# Fallback for unknown tags. Preserve the parsed shape so legacy
# docs that lean on tags we have not modeled yet still parse.
if isinstance(node, yaml.MappingNode):
return loader.construct_mapping(node)
if isinstance(node, yaml.SequenceNode):
return loader.construct_sequence(node)
return loader.construct_scalar(node)
PermissiveLoader.add_constructor("!include", _include_constructor)
PermissiveLoader.add_constructor("!external", _external_constructor)
PermissiveLoader.add_multi_constructor("!", _generic_constructor)
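
# Illustration (input shapes assumed, not taken from a real org.yaml):
# with this loader, `yaml.load("- !include teams/pm.yaml",
# Loader=PermissiveLoader)` yields [IncludeRef('teams/pm.yaml')], and an
# `!external` mapping node comes back as an ExternalRef dict. These
# sentinel-typed values are what the validate_workspace walk below
# recognises and skips.
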
errors = []
@@ -33,7 +61,13 @@ if not org.get("workspaces") and not org.get("defaults"):
    errors.append("org.yaml must have at least 'workspaces' or 'defaults'")

def validate_workspace(ws, path=""):
    # !include tags resolve to strings at parse time; skip non-dicts
    # `!include path/to.yaml` parses as IncludeRef (str subclass).
    # `!external {repo, ref, path}` parses as ExternalRef (dict subclass).
    # Both are opaque references — skip without chasing.
    if isinstance(ws, (IncludeRef, ExternalRef)):
        return []
    # Legacy unknown-tag scalars (handled by _generic_constructor) stay
    # as plain strings; they are not workspace dicts either.
    if not isinstance(ws, dict):
        return []
    ws_errors = []
ws_errors = []
@@ -59,6 +93,11 @@ if errors:

def count_ws(nodes):
    c = 0
    for n in nodes:
        # Skip opaque references — we do not know how many workspaces
        # they expand to without resolving them, and resolution is the
        # platform's job, not the validator's.
        if isinstance(n, (IncludeRef, ExternalRef)):
            continue
        if not isinstance(n, dict):
            continue
        c += 1
@@ -66,4 +105,4 @@ def count_ws(nodes):
    return c

total = count_ws(org.get("workspaces", []))
print(f"✓ org.yaml valid: {org['name']} ({total} workspaces)")
print(f"✓ org.yaml valid: {org['name']} ({total} direct workspaces; external refs not counted)")


@@ -1,47 +0,0 @@
#!/usr/bin/env python3
"""Validate a Molecule AI workspace template repo."""
import os, sys, yaml

errors = []

if not os.path.isfile("config.yaml"):
    print("::error::config.yaml not found at repo root")
    sys.exit(1)

with open("config.yaml") as f:
    config = yaml.safe_load(f)

if not config.get("name"):
    errors.append("Missing required field: name")
if not config.get("runtime"):
    errors.append("Missing required field: runtime")

known = {"langgraph", "claude-code", "crewai", "autogen", "deepagents", "hermes", "gemini-cli", "openclaw"}
runtime = config.get("runtime", "")
if runtime and runtime not in known:
    print(f"::warning::Runtime '{runtime}' is not in the known set. OK for custom runtimes.")

# Check for legacy imports
if os.path.isfile("adapter.py"):
    with open("adapter.py") as f:
        content = f.read()
    if "molecule_runtime" in content:
        print("::warning::adapter.py imports 'molecule_runtime' — legacy import, use 'molecule_ai' or platform SDK")

# Check for missing molecule-ai-workspace-runtime dependency hint
if os.path.isfile("Dockerfile"):
    with open("Dockerfile") as f:
        content = f.read()
    if "molecule-ai-workspace-runtime" not in content:
        print("::warning::Dockerfile does not reference 'molecule-ai-workspace-runtime' — may need base runtime package")

sv = config.get("template_schema_version")
if sv is None:
    errors.append("Missing template_schema_version (add: template_schema_version: 1)")

if errors:
    for e in errors:
        print(f"::error::{e}")
    sys.exit(1)

print(f"✓ config.yaml valid: {config['name']} (runtime: {config.get('runtime')})")


@@ -12,7 +12,7 @@ name: CI
on: [push, pull_request]
jobs:
  validate:
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-plugin.yml@main
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-plugin.yml@v1
```
### Workspace template repos (`molecule-ai-workspace-template-*`)
@@ -23,7 +23,7 @@ name: CI
on: [push, pull_request]
jobs:
  validate:
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@main
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@v1
```
### Org template repos (`molecule-ai-org-template-*`)
@@ -34,9 +34,28 @@ name: CI
on: [push, pull_request]
jobs:
  validate:
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-org-template.yml@main
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-org-template.yml@v1
```

### Any repo with auto-merge enabled

PR-time guards (currently: disable auto-merge on follow-up push). Consume from a thin caller:

```yaml
# .github/workflows/pr-guards.yml
name: pr-guards
on:
  pull_request:
    types: [synchronize]
permissions:
  pull-requests: write
jobs:
  disable-auto-merge-on-push:
    uses: Molecule-AI/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@v1
```
When the team lands more PR-time guards in this repo, add them as additional jobs in the same caller — keeps each consuming repo's footprint to one file.
## What each workflow validates
### validate-plugin
@@ -74,6 +93,21 @@ jobs:
| `template_schema_version` present | Warning | Missing version contract |
| No committed secrets | Error | Leaked API keys |
### disable-auto-merge-on-push
PR-time safety guard. When `pull_request:synchronize` fires (= a new commit pushed to an open PR) and auto-merge is already enabled, this workflow disables auto-merge and posts a comment requiring the operator to re-engage explicitly.
**Why it exists:** on 2026-04-27, molecule-core PR #2174 auto-merged with only its first commit because the second commit was pushed AFTER the merge queue had locked the PR's SHA. The second commit ended up orphaned on a merged-and-deleted branch.
**Pairs with the org-wide repo setting** "Automatically delete head branches" (already enabled on all 10 Molecule-AI repos). Defense in depth:
1. Repo setting blocks pushes to a merged-and-deleted branch (catches the post-merge orphan case).
2. This workflow catches the in-queue race (push during queue processing) by force-disabling auto-merge.
Together they cover the full lifecycle of "auto-merge enabled → new commits arrive" without operator discipline.
**False-positive note:** if a CI bot pushes (dependency update, secret rotation), this also disables auto-merge. That's intentional — the operator who originally enabled auto-merge gets notified and re-engages, which is exactly the verify-after-machine-edits behavior we want.
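
For orientation, the guard's core logic amounts to roughly the following (a sketch, assuming the `gh` CLI on the runner and the `pull-requests: write` token from the caller; the canonical implementation is `disable-auto-merge-on-push.yml` in this repo):

```yaml
steps:
  - name: Disable auto-merge if enabled
    env:
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      PR: ${{ github.event.pull_request.number }}
    run: |
      # Only act when auto-merge is actually enabled on the PR.
      enabled=$(gh pr view "$PR" --repo "$GITHUB_REPOSITORY" \
        --json autoMergeRequest --jq '.autoMergeRequest != null')
      if [ "$enabled" = "true" ]; then
        gh pr merge "$PR" --repo "$GITHUB_REPOSITORY" --disable-auto
        gh pr comment "$PR" --repo "$GITHUB_REPOSITORY" \
          --body "Auto-merge disabled: new commits were pushed after it was enabled. Re-enable explicitly after review."
      fi
```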
## License
Business Source License 1.1 — © Molecule AI.

docs/template-contract.md

@@ -0,0 +1,67 @@
# Workspace Template Contract
Hard rules every `molecule-ai-workspace-template-*` repo must satisfy. Enforced by `scripts/validate-workspace-template.py` on every CI run via the reusable `validate-workspace-template.yml` workflow.
The contract exists because the 8 template repos were extracted from a single monolithic Dockerfile pre-#87, and have drifted as each was edited piecemeal since. Without this gate, a 28-line cascade-friendly Dockerfile in one repo silently regresses to a 25-line non-cache-friendly one in another, and the next runtime publish ships the previous wheel from a stale layer (cache trap observed five times in a row on 2026-04-27).
## Dockerfile
| Rule | Why |
|---|---|
| `FROM python:3.11-slim` | Single base everywhere — keeps apt + pip behaviour identical and lets us reason about CVE patches on one base. |
| `ARG RUNTIME_VERSION=` declared | The arg invalidates the pip-install layer's cache key whenever the cascade publishes a new wheel. Without it the cache hit replays the previous runtime. |
| `${RUNTIME_VERSION}` referenced in a `RUN` | Just declaring the ARG isn't enough — it has to be in the layer's command line so docker hashes it. Pattern: `if [ -n "${RUNTIME_VERSION}" ]; then pip install --no-cache-dir --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; fi` |
| `RUN useradd -u 1000 -m -s /bin/bash agent` | The runtime drops to uid 1000 before exec'ing the SDK. Claude Code refuses `--dangerously-skip-permissions` as root for safety. The `/workspace` volume is also chown'd to 1000 by the platform provisioner. |
| `ENTRYPOINT ["molecule-runtime"]` *or* a wrapper script that exec's `molecule-runtime` | Single entrypoint means the platform's container-restart contract is uniform across templates. Wrapper scripts are allowed (claude-code has `entrypoint.sh` for gosu drop-priv; hermes has `start.sh` to boot the hermes-agent daemon first). |
| `molecule-ai-workspace-runtime` listed in `requirements.txt` (or installed in the Dockerfile directly) | The runtime wheel is the contract — without it the container has no A2A server, no heartbeat, no MCP bridge. |
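
Put together, a minimal Dockerfile satisfying every row above looks roughly like this (WORKDIR and the pip layering are illustrative, not mandated; the required pieces are the base image, the ARG, the `${RUNTIME_VERSION}` RUN reference, the `agent` user, and the entrypoint):

```dockerfile
FROM python:3.11-slim

# Cache-buster: the cascade passes a new value on each runtime publish.
ARG RUNTIME_VERSION=

RUN useradd -u 1000 -m -s /bin/bash agent

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt && \
    if [ -n "${RUNTIME_VERSION}" ]; then \
      pip install --no-cache-dir --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; \
    fi

ENTRYPOINT ["molecule-runtime"]
```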
## config.yaml

| Required key | Type | Notes |
|---|---|---|
| `name` | str | Human-readable; appears on the canvas card. |
| `runtime` | str | Must be one of: `langgraph`, `claude-code`, `crewai`, `autogen`, `deepagents`, `hermes`, `gemini-cli`, `openclaw`. Custom runtimes warn but are allowed. |
| `template_schema_version` | int | Currently `1`. Bump when adding a key that changes how the platform consumes config.yaml. **Must be int**, not string — a quoted `"1"` will fail validation. |

| Optional key | Notes |
|---|---|
| `description` | Free text, surfaces on canvas. |
| `version`, `tier` | int, controls platform-side rollout gating. |
| `model`, `models` | Either a single model id or a list of model ids the agent may use. |
| `runtime_config` | Nested block of runtime-specific settings (used by claude-code, gemini-cli, hermes). |
| `env`, `skills`, `tools`, `a2a`, `delegation`, `prompt_files`, `bridge`, `governance` | Optional feature blocks. Add new keys to `OPTIONAL_KEYS` in the validator when introducing them. |

Unknown top-level keys produce a warning (not an error) so accidental drift is visible without blocking.
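
A minimal config.yaml that satisfies the v1 contract (values illustrative):

```yaml
name: my-template            # required, human-readable
runtime: claude-code         # required; unknown runtimes warn only
template_schema_version: 1   # required int; a quoted "1" fails
description: Example workspace template
tier: 1
```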
## adapter.py
Optional. When present, `adapter.py` should:
- Import `BaseAdapter` from `molecule_runtime.adapter_base`.
- Override `setup()` and `create_executor()` for the runtime's specific entry point.
The pre-#87 import path (`molecule_ai`) produces a warning if it appears.
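
A skeletal adapter.py in that shape (a sketch only; class name and method bodies are placeholders, and the import path follows the bullet above):

```python
from molecule_runtime.adapter_base import BaseAdapter

class MyRuntimeAdapter(BaseAdapter):
    def setup(self, config):
        # runtime-specific initialisation (credentials, env, caches)
        ...

    def create_executor(self, config):
        # return the entry point the runtime will drive
        ...
```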
## requirements.txt
Must declare `molecule-ai-workspace-runtime` (with a version pin or floor).
## CI
Every template repo's `.github/workflows/ci.yml` should be a one-liner that calls the canonical reusable workflow:
```yaml
name: CI
on: [push, pull_request]
jobs:
  validate:
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@v1
```
The reusable workflow checks out `molecule-ci` itself (into `.molecule-ci-canonical`) and runs the canonical `validate-workspace-template.py` from there — so no per-repo vendoring of the script is needed. The legacy `.molecule-ci/scripts/` directory in each template repo is being phased out.
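
Sketched out, the relevant steps inside the reusable workflow look something like this (checkout action version and step names are assumptions, not part of the contract):

```yaml
steps:
  - uses: actions/checkout@v4          # the template repo under test
  - uses: actions/checkout@v4          # canonical validator source
    with:
      repository: Molecule-AI/molecule-ci
      path: .molecule-ci-canonical
  - run: python3 .molecule-ci-canonical/scripts/validate-workspace-template.py
```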
## Adding a new runtime
1. Add the runtime name to `KNOWN_RUNTIMES` in `scripts/validate-workspace-template.py`.
2. Add the runtime + image ref to `RuntimeImages` in `molecule-core/workspace-server/internal/provisioner/provisioner.go`.
3. Stand up the `molecule-ai-workspace-template-<runtime>` repo from the existing template-of-templates pattern (issue #105 covers this).
4. Confirm CI green on the new repo before opening it for general use.

scripts/migrate-template.py

@@ -0,0 +1,224 @@
#!/usr/bin/env python3
"""Migrate a workspace template's config.yaml across schema versions.

Companion to validate-workspace-template.py. Whenever the validator
adds a new schema version, this script gets a corresponding entry in
MIGRATIONS so each consumer template can mechanically upgrade rather
than every maintainer figuring out the field changes by hand.

Discipline (matches the validator's header):

1. Validator gets a SCHEMA_V<N+1> block + KNOWN_SCHEMA_VERSIONS bump.
2. This script gets `MIGRATIONS[N]` defined: a function that takes
   a v<N> dict and returns a v<N+1> dict. Pure, deterministic, no
   I/O; that way migrations compose: v1 → v2 → v3 just chains them.
3. Each migration is FROZEN once shipped. If a v2 migration needs
   fixing post-ship, ship it as v3 with the corrective migration.
4. Consumers run this script (one PR per template repo) before the
   deprecation window for v<N> closes.

Usage:

    # Migrate the template in cwd from its current version to the latest
    python3 scripts/migrate-template.py .

    # Migrate to a specific version (bounded; useful when a deprecation
    # window is closing and you want to skip ahead)
    python3 scripts/migrate-template.py --to 3 .

    # Force the source version (override config.yaml's declared version)
    python3 scripts/migrate-template.py --from 1 --to 2 .

    # Dry-run: print the migrated YAML without writing
    python3 scripts/migrate-template.py --dry-run .

YAML round-trip caveats:

- PyYAML's safe_dump is used for output. Comments + anchor/alias
  forms in the consumer's config.yaml are NOT preserved across
  migrations; the migrated file is a clean re-emit. Templates
  rarely have inline comments in config.yaml; on the rare occasion
  they do, the maintainer needs to re-add them after migration.
- Keys are sorted alphabetically on output. This trades a one-time
  re-ordering diff (reviewable) for stable diffs across future
  migrations.
- Migrations should ONLY mutate keys they're explicitly versioning;
  leave everything else alone so a consumer template's
  customizations survive.

A future enhancement could detect comments in the original file and
opt into ruamel.yaml for round-trip-preserving emission. Not done
today; flag in the migrator's stderr if comments are detected so the
maintainer knows what they're losing.
"""
from __future__ import annotations

import argparse
import sys
from copy import deepcopy
from pathlib import Path
from typing import Callable

import yaml

# ──────────────────────────────────────────── migrations registry
# Each entry maps a SOURCE version to the function that produces the
# next version's dict. Currently empty — no v2 yet. The first time a
# real schema bump lands, MIGRATIONS[1] gets defined alongside the
# validator's SCHEMA_V2 block.
MIGRATIONS: dict[int, Callable[[dict], dict]] = {}
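
# Example of the shape a real entry will take (hypothetical: no v2
# schema exists yet, and the `model` → `models` rename is invented
# purely for illustration):
#
#     def _v1_to_v2(config: dict) -> dict:
#         out = dict(config)
#         if "model" in out:
#             out["models"] = [out.pop("model")]
#         out["template_schema_version"] = 2
#         return out
#
#     MIGRATIONS[1] = _v1_to_v2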
# ──────────────────────────────────────────── version detection
def _detect_current_version(config: dict) -> int:
    sv = config.get("template_schema_version")
    if sv is None:
        sys.exit(
            "error: config.yaml has no `template_schema_version`. "
            "Add it (likely 1 for legacy templates) before migrating."
        )
    if not isinstance(sv, int):
        sys.exit(
            f"error: template_schema_version must be int, got "
            f"{type(sv).__name__}={sv!r}."
        )
    return sv

def _latest_known_version() -> int:
    """Maximum version reachable by chaining MIGRATIONS from any
    starting point. With an empty registry, this is 1 (the floor:
    every existing template is at v1)."""
    if not MIGRATIONS:
        return 1
    return max(MIGRATIONS.keys()) + 1
# ──────────────────────────────────────────── core
def migrate_config(config: dict, from_version: int, to_version: int) -> dict:
    """Apply migrations sequentially from `from_version` to `to_version`.

    Returns a NEW dict; does not mutate the input.

    Errors loudly when there's no migration registered for an
    intermediate step: forward-only, never silently skip a hop. If the
    user asks for a backward migration, error too: schema versions
    are append-only and we don't ship downgrades."""
    if to_version < from_version:
        sys.exit(
            f"error: cannot migrate backward (from v{from_version} to "
            f"v{to_version}). Schema versions are append-only — file a "
            f"new bug + ship a forward migration instead."
        )
    current = from_version
    out = deepcopy(config)
    while current < to_version:
        step = MIGRATIONS.get(current)
        if step is None:
            sys.exit(
                f"error: no migration registered for v{current} → "
                f"v{current + 1}. Either add it to MIGRATIONS in "
                f"scripts/migrate-template.py or pick a different --to."
            )
        out = step(out)
        # Every migration MUST stamp the new version on its output —
        # this assertion catches a class of bugs where a migration
        # forgets to bump template_schema_version.
        if out.get("template_schema_version") != current + 1:
            sys.exit(
                f"error: MIGRATIONS[{current}] did not stamp "
                f"template_schema_version={current + 1} on its output. "
                f"This is a bug in the migration function itself."
            )
        current += 1
    return out
def _read_yaml(path: Path) -> dict:
    with open(path) as f:
        data = yaml.safe_load(f)
    if not isinstance(data, dict):
        sys.exit(f"error: {path} root is not a mapping (got {type(data).__name__})")
    return data

def _write_yaml(path: Path, data: dict) -> None:
    # Sort keys for stable diffs across migrations. This matches what
    # `yaml.safe_dump` does when we write — consumer repos with
    # custom orderings will see their config.yaml re-ordered, which is
    # one of those round-trip lossy tradeoffs that's worth accepting:
    # the migration moment is rare and the diff is reviewable.
    with open(path, "w") as f:
        yaml.safe_dump(data, f, sort_keys=True, default_flow_style=False)
# ──────────────────────────────────────────── CLI
def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(
        description="Migrate a workspace template's config.yaml across schema versions."
    )
    parser.add_argument(
        "template_dir",
        type=Path,
        help="Path to the template repo root (must contain config.yaml).",
    )
    parser.add_argument(
        "--from",
        dest="from_version",
        type=int,
        default=None,
        help="Source schema version (defaults to whatever config.yaml declares).",
    )
    parser.add_argument(
        "--to",
        dest="to_version",
        type=int,
        default=None,
        help="Target schema version (defaults to the highest reachable from MIGRATIONS).",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Print the migrated YAML to stdout without modifying the file.",
    )
    args = parser.parse_args(argv)

    config_path = args.template_dir / "config.yaml"
    if not config_path.is_file():
        sys.exit(f"error: {config_path} does not exist")
    config = _read_yaml(config_path)

    from_version = args.from_version
    if from_version is None:
        from_version = _detect_current_version(config)
    to_version = args.to_version
    if to_version is None:
        to_version = _latest_known_version()

    if from_version == to_version:
        print(
            f"nothing to do: config.yaml is already at v{from_version}.",
            file=sys.stderr,
        )
        return 0

    migrated = migrate_config(config, from_version, to_version)

    if args.dry_run:
        yaml.safe_dump(migrated, sys.stdout, sort_keys=True, default_flow_style=False)
        return 0

    _write_yaml(config_path, migrated)
    print(
        f"✓ migrated {config_path} from v{from_version} → v{to_version}",
        file=sys.stderr,
    )
    return 0

if __name__ == "__main__":
    sys.exit(main())


@@ -0,0 +1,242 @@
"""Tests for migrate-template.py — pin the migration framework's
behavior so the FIRST real schema bump (the one that proves the system
end-to-end) doesn't have to discover semantics under deadline pressure.

The MIGRATIONS registry is empty today (we have only v1), so most
tests register a synthetic migration scoped to the test, exercise the
machinery, and unregister at teardown. This way the framework's
contract is locked in even before any real migration ships.
"""
from __future__ import annotations

import importlib.util
from pathlib import Path

import pytest

MIGRATOR_PATH = Path(__file__).resolve().parent / "migrate-template.py"

def _load_migrator():
    """Load migrate-template.py by path (its filename has a hyphen so
    we can't `import migrate-template` directly)."""
    spec = importlib.util.spec_from_file_location("migrator", MIGRATOR_PATH)
    assert spec is not None and spec.loader is not None
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod

@pytest.fixture
def migrator():
    """Fresh migrator module per test. Registry is global module
    state; tests that register synthetic migrations must clean up."""
    mod = _load_migrator()
    # Snapshot + restore MIGRATIONS so accidentally-leaked entries
    # from one test don't poison the next.
    snapshot = dict(mod.MIGRATIONS)
    yield mod
    mod.MIGRATIONS.clear()
    mod.MIGRATIONS.update(snapshot)

def _v1_template_config() -> dict:
    return {
        "name": "test-template",
        "runtime": "claude-code",
        "template_schema_version": 1,
        "description": "fixture",
        "tier": 1,
    }
# ─────────────────────────────────────── version detection
def test_detect_current_version_from_config(migrator):
    config = _v1_template_config()
    assert migrator._detect_current_version(config) == 1

def test_detect_missing_version_exits(migrator):
    config = {"name": "t", "runtime": "claude-code"}
    with pytest.raises(SystemExit) as exc:
        migrator._detect_current_version(config)
    assert "no `template_schema_version`" in str(exc.value)

def test_detect_non_int_version_exits(migrator):
    config = {"name": "t", "runtime": "claude-code", "template_schema_version": "1"}
    with pytest.raises(SystemExit) as exc:
        migrator._detect_current_version(config)
    assert "must be int" in str(exc.value)

# ─────────────────────────────────────── latest-version reachability
def test_latest_with_empty_registry_is_v1(migrator):
    """Floor case: every existing template is v1 even when no
    migrations are registered. Latest reachable = v1, so a no-op
    migration is the only valid action."""
    migrator.MIGRATIONS.clear()
    assert migrator._latest_known_version() == 1

def test_latest_with_one_migration_is_v2(migrator):
    """Adding a v1 → v2 migration moves the ceiling to v2. This is
    what happens the first time a real schema bump ships."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2}
    assert migrator._latest_known_version() == 2

def test_latest_chains_through_multiple_migrations(migrator):
    """Multi-step ceiling: v1 → v2 → v3 chain produces ceiling=3."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2}
    migrator.MIGRATIONS[2] = lambda c: {**c, "template_schema_version": 3}
    assert migrator._latest_known_version() == 3
# ─────────────────────────────────────── migrate_config core
def test_migrate_no_op_when_versions_match(migrator):
    """from == to → no migration step runs. Should not require any
    MIGRATIONS entry to be defined."""
    migrator.MIGRATIONS.clear()
    original = _v1_template_config()
    out = migrator.migrate_config(original, 1, 1)
    assert out == original
    assert out is not original  # deep-copied, not aliased

def test_migrate_one_step_applies_function(migrator):
    """v1 → v2 with a registered migration produces the expected
    output and stamps the new version."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2, "added_in_v2": True}
    out = migrator.migrate_config(_v1_template_config(), 1, 2)
    assert out["template_schema_version"] == 2
    assert out["added_in_v2"] is True
    # Pre-existing keys preserved.
    assert out["name"] == "test-template"

def test_migrate_chains_v1_to_v3(migrator):
    """Two-step migration: v1 → v2 → v3. Each step applies in order."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2, "from_v1": True}
    migrator.MIGRATIONS[2] = lambda c: {**c, "template_schema_version": 3, "from_v2": True}
    out = migrator.migrate_config(_v1_template_config(), 1, 3)
    assert out["template_schema_version"] == 3
    assert out["from_v1"] is True
    assert out["from_v2"] is True

def test_migrate_missing_step_exits(migrator):
    """If MIGRATIONS lacks the v<current> step, fail loud rather than
    silently skip the version. Forward-only, never silent skip."""
    migrator.MIGRATIONS.clear()
    # No MIGRATIONS[1] registered.
    with pytest.raises(SystemExit) as exc:
        migrator.migrate_config(_v1_template_config(), 1, 2)
    assert "no migration registered for v1 → v2" in str(exc.value)

def test_migrate_backward_exits(migrator):
    """Schema versions are append-only. Asking for v2 → v1 must
    error, not silently downgrade."""
    migrator.MIGRATIONS.clear()
    config = {**_v1_template_config(), "template_schema_version": 2}
    with pytest.raises(SystemExit) as exc:
        migrator.migrate_config(config, 2, 1)
    assert "cannot migrate backward" in str(exc.value)

def test_migration_must_stamp_new_version(migrator):
    """A migration function that forgets to bump
    `template_schema_version` is a bug; catch it at apply time so
    the framework can never produce an inconsistent output."""
    migrator.MIGRATIONS.clear()
    # Buggy migration: doesn't update the version field.
    migrator.MIGRATIONS[1] = lambda c: {**c, "added_in_v2": True}
    with pytest.raises(SystemExit) as exc:
        migrator.migrate_config(_v1_template_config(), 1, 2)
    assert "did not stamp template_schema_version=2" in str(exc.value)

def test_migrate_does_not_mutate_input(migrator):
    """migrate_config returns a fresh dict; the caller's input is
    untouched. Pin this so a shared-state migration can't accidentally
    poison the caller's view of the original template."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2}
    original = _v1_template_config()
    snapshot = dict(original)
    _ = migrator.migrate_config(original, 1, 2)
    assert original == snapshot
# ─────────────────────────────────────── CLI smoke
def test_cli_writes_migrated_yaml(migrator, tmp_path):
    """End-to-end: --to migrates the file in place and exits 0."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2, "added": "v2-marker"}
    cfg = tmp_path / "config.yaml"
    cfg.write_text(
        "name: t\n"
        "runtime: claude-code\n"
        "template_schema_version: 1\n"
    )
    rc = migrator.main([str(tmp_path), "--to", "2"])
    assert rc == 0
    written = cfg.read_text()
    assert "template_schema_version: 2" in written
    assert "added: v2-marker" in written

def test_cli_dry_run_does_not_modify_file(migrator, tmp_path, capsys):
    """--dry-run prints the migrated YAML to stdout but leaves the
    on-disk file untouched."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2}
    cfg = tmp_path / "config.yaml"
    cfg.write_text(
        "name: t\n"
        "runtime: claude-code\n"
        "template_schema_version: 1\n"
    )
    original_disk = cfg.read_text()
    rc = migrator.main([str(tmp_path), "--to", "2", "--dry-run"])
    assert rc == 0
    assert cfg.read_text() == original_disk  # untouched
    captured = capsys.readouterr()
    assert "template_schema_version: 2" in captured.out

def test_cli_no_op_when_already_at_target(migrator, tmp_path, capsys):
    """If the template is already at the target version, exit 0
    without modifying the file. Not an error; common when running
    the migration script defensively in CI."""
    migrator.MIGRATIONS.clear()
    cfg = tmp_path / "config.yaml"
    cfg.write_text(
        "name: t\n"
        "runtime: claude-code\n"
        "template_schema_version: 1\n"
    )
    original = cfg.read_text()
    rc = migrator.main([str(tmp_path), "--to", "1"])
    assert rc == 0
    assert cfg.read_text() == original

def test_cli_missing_config_exits(migrator, tmp_path):
    """If the target dir has no config.yaml, error clearly rather
    than try to apply migrations to nothing."""
    with pytest.raises(SystemExit) as exc:
        migrator.main([str(tmp_path), "--to", "2"])
    assert "config.yaml" in str(exc.value) and "does not exist" in str(exc.value)


@@ -0,0 +1,686 @@
"""Tests for validate-workspace-template.py — pin the drift contract.

Each test materialises a tiny template directory in a tmpdir, runs the
validator's check functions in-process, and asserts on the captured
ERRORS / WARNINGS lists. The 8 template repos in the wild are the
ground-truth integration test (CI runs this validator against each on
push), but those repos can change at any time. These tests pin the
contract itself so a refactor of the validator can't silently weaken
it.

Important: the validator was chosen to be import-safe (no top-level
side effects), so the test patches the cwd via os.chdir into tmpdirs.
The module's ERRORS/WARNINGS lists are reset at the start of each
test via _reset_validator_state().
"""
from __future__ import annotations

import importlib.util
import os
from pathlib import Path

import pytest

VALIDATOR_PATH = Path(__file__).resolve().parent / "validate-workspace-template.py"

def _load_validator():
    """Load the validator module by path (its filename has a hyphen so
    we can't `import validate-workspace-template` directly)."""
    spec = importlib.util.spec_from_file_location("validator", VALIDATOR_PATH)
    assert spec is not None and spec.loader is not None
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod

@pytest.fixture
def validator(monkeypatch):
    """Fresh validator module per test, cwd pinned to tmpdir below."""
    mod = _load_validator()
    mod.ERRORS.clear()
    mod.WARNINGS.clear()
    return mod

def _good_dockerfile() -> str:
    """Canonical Dockerfile that should pass every check."""
    return (
        "FROM python:3.11-slim\n"
        "ARG RUNTIME_VERSION=\n"
        "RUN useradd -u 1000 -m -s /bin/bash agent\n"
        "WORKDIR /app\n"
        "COPY requirements.txt .\n"
        "RUN pip install -r requirements.txt && \\\n"
        '    if [ -n "${RUNTIME_VERSION}" ]; then \\\n'
        '      pip install --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; \\\n'
        "    fi\n"
        'ENTRYPOINT ["molecule-runtime"]\n'
    )

def _good_config_yaml() -> str:
    return (
        "name: test-template\n"
        "runtime: claude-code\n"
        "template_schema_version: 1\n"
        "description: A test template\n"
        "tier: 1\n"
    )

def _good_requirements_txt() -> str:
    return "molecule-ai-workspace-runtime>=0.1.0\n"

def _materialise(tmp_path: Path, dockerfile: str | None = None,
                 config_yaml: str | None = None,
                 requirements: str | None = None,
                 adapter_py: str | None = None) -> None:
    if dockerfile is not None:
        (tmp_path / "Dockerfile").write_text(dockerfile)
    if config_yaml is not None:
        (tmp_path / "config.yaml").write_text(config_yaml)
    if requirements is not None:
        (tmp_path / "requirements.txt").write_text(requirements)
    if adapter_py is not None:
        (tmp_path / "adapter.py").write_text(adapter_py)
# ───────────────────────────────────────────────────────── happy paths
def test_canonical_template_passes(validator, tmp_path, monkeypatch):
    _materialise(
        tmp_path,
        dockerfile=_good_dockerfile(),
        config_yaml=_good_config_yaml(),
        requirements=_good_requirements_txt(),
    )
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    validator.check_config_yaml()
    validator.check_requirements()
    validator.check_adapter()
    assert validator.ERRORS == [], validator.ERRORS

def test_custom_entrypoint_script_passes_when_it_execs_runtime(validator, tmp_path, monkeypatch):
    """claude-code style: ENTRYPOINT [/entrypoint.sh] + entrypoint.sh
    that exec's molecule-runtime at the end. Must pass."""
    df = (
        "FROM python:3.11-slim\n"
        "ARG RUNTIME_VERSION=\n"
        "RUN useradd -u 1000 -m -s /bin/bash agent\n"
        "COPY requirements.txt .\n"
        "RUN pip install -r requirements.txt && \\\n"
        '    if [ -n "${RUNTIME_VERSION}" ]; then \\\n'
        '      pip install --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; \\\n'
        "    fi\n"
        "COPY entrypoint.sh /entrypoint.sh\n"
        'ENTRYPOINT ["/entrypoint.sh"]\n'
    )
    ep = (
        "#!/bin/sh\n"
        "set -e\n"
        "# drop privileges then exec the runtime\n"
        'exec gosu agent molecule-runtime "$@"\n'
    )
    _materialise(
        tmp_path,
        dockerfile=df,
        config_yaml=_good_config_yaml(),
        requirements=_good_requirements_txt(),
    )
    (tmp_path / "entrypoint.sh").write_text(ep)
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert validator.ERRORS == [], validator.ERRORS
# ───────────────────────────────────────────────────────── Dockerfile drift
def test_wrong_base_image_errors(validator, tmp_path, monkeypatch):
    df = _good_dockerfile().replace("python:3.11-slim", "python:3.10-alpine")
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("FROM python:3.11-slim" in e for e in validator.ERRORS)

def test_missing_arg_runtime_version_errors(validator, tmp_path, monkeypatch):
    """Without ARG RUNTIME_VERSION, the cascade rebuild silently ships
    the previous runtime: the cache trap that bit us 5x on 2026-04-27."""
    df = _good_dockerfile().replace("ARG RUNTIME_VERSION=\n", "")
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("ARG RUNTIME_VERSION" in e for e in validator.ERRORS)

def test_missing_runtime_version_in_run_block_errors(validator, tmp_path, monkeypatch):
    """ARG declared but NEVER referenced in a RUN — same cache-trap,
    different shape. Pin both."""
    df = (
        "FROM python:3.11-slim\n"
        "ARG RUNTIME_VERSION=\n"
        "RUN useradd -u 1000 -m -s /bin/bash agent\n"
        "RUN pip install molecule-ai-workspace-runtime\n"
        'ENTRYPOINT ["molecule-runtime"]\n'
    )
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("RUNTIME_VERSION" in e and "RUN block" in e for e in validator.ERRORS)

def test_missing_agent_user_errors(validator, tmp_path, monkeypatch):
    df = _good_dockerfile().replace("RUN useradd -u 1000 -m -s /bin/bash agent\n", "")
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("agent" in e for e in validator.ERRORS)

def test_missing_entrypoint_errors(validator, tmp_path, monkeypatch):
    df = _good_dockerfile().replace('ENTRYPOINT ["molecule-runtime"]\n', "")
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("molecule-runtime" in e and ("ENTRYPOINT" in e or "entrypoint" in e)
               for e in validator.ERRORS)
# ───────────────────────────────────────────────────────── config.yaml drift
def test_missing_required_keys_errors(validator, tmp_path, monkeypatch):
    """A config without template_schema_version short-circuits with a
    SINGLE actionable error: listing 'also name and runtime are
    missing' would be noise on top of the real problem (no version
    means the validator can't pick a schema contract to enforce).
    Once the version is present, the v1 dispatch will list the other
    missing keys (next test pins that)."""
    cfg = "description: only description, no name/runtime/version\n"
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    missing_msgs = [e for e in validator.ERRORS if "missing required key" in e]
    # Exactly one error: the missing version. v1 dispatch is skipped
    # because we can't choose a contract without a version.
    assert len(missing_msgs) == 1, missing_msgs
    assert "template_schema_version" in missing_msgs[0]

def test_missing_required_keys_under_v1_dispatch_errors(validator, tmp_path, monkeypatch):
    """When `template_schema_version: 1` IS present but other required
    keys are missing, the v1 dispatch fires and lists them. Pins that
    the v1 contract still enforces name + runtime."""
    cfg = (
        "template_schema_version: 1\n"
        "description: only the version + description\n"
    )
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    missing_msgs = [e for e in validator.ERRORS if "missing required key" in e]
    keys = {e.split("`")[1] for e in missing_msgs}
    assert "name" in keys, missing_msgs
    assert "runtime" in keys, missing_msgs

def test_string_template_schema_version_errors(validator, tmp_path, monkeypatch):
    cfg = (
        "name: t\n"
        "runtime: claude-code\n"
        'template_schema_version: "1"\n'  # str, not int
    )
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    assert any("template_schema_version must be int" in e for e in validator.ERRORS)

def test_unknown_runtime_warns_not_errors(validator, tmp_path, monkeypatch):
    cfg = _good_config_yaml().replace("claude-code", "my-experimental-runtime")
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    assert any("not in known set" in w for w in validator.WARNINGS)
    assert validator.ERRORS == []  # custom runtimes are allowed

def test_unknown_top_level_keys_warn(validator, tmp_path, monkeypatch):
    cfg = _good_config_yaml() + "weird_drift_key: something\n"
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    assert any("unknown top-level keys" in w and "weird_drift_key" in w
               for w in validator.WARNINGS)
# ───────────────────────────────────────────────────────── requirements.txt
def test_missing_runtime_in_requirements_errors(validator, tmp_path, monkeypatch):
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=_good_config_yaml(),
                 requirements="fastapi\n")
    monkeypatch.chdir(tmp_path)
    validator.check_requirements()
    assert any("molecule-ai-workspace-runtime" in e for e in validator.ERRORS)

# ───────────────────────────────────────────────────────── adapter.py
def test_legacy_molecule_ai_import_warns(validator, tmp_path, monkeypatch):
    """Pre-#87 package was named differently. Catch any laggards."""
    adapter = "from molecule_ai.adapter_base import BaseAdapter\n"
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter()
    assert any("molecule_ai" in w for w in validator.WARNINGS)

def test_modern_molecule_runtime_import_does_not_warn(validator, tmp_path, monkeypatch):
    """Regression cover: the original validator's warning ('don't import
    molecule_runtime') was BACKWARDS — that's the canonical name now.
    Pin that the new validator does NOT emit a false positive."""
    adapter = "from molecule_runtime.adapter_base import BaseAdapter\n"
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter()
    legacy_warnings = [w for w in validator.WARNINGS if "molecule_ai" in w]
    assert legacy_warnings == [], legacy_warnings
# ──────────────────── adapter.py runtime-load (strong contract)
#
# These tests pin the contract that adapter.py must be importable AND
# define at least one BaseAdapter subclass — the same path the runtime
# uses at workspace boot. Skipped when molecule-ai-workspace-runtime
# isn't installed in the test environment (the validator's CI workflow
# guarantees it via `pip install -r requirements.txt` before invoking
# the validator; local pytest can run with or without it).
def _has_runtime_installed() -> bool:
    """True if molecule-ai-workspace-runtime is importable. Used to skip
    the runtime-load tests when running pytest locally without the
    runtime in the venv."""
    try:
        import molecule_runtime.adapters.base  # noqa: F401, PLC0415
        return True
    except ImportError:
        return False

_RUNTIME_AVAILABLE = _has_runtime_installed()
_skip_no_runtime = pytest.mark.skipif(
    not _RUNTIME_AVAILABLE,
    reason="molecule-ai-workspace-runtime not installed in test env",
)

def test_no_adapter_skips_runtime_load_silently(validator, tmp_path, monkeypatch):
    """No adapter.py = use default langgraph executor from the wheel.
    That's policy, not drift, so runtime-load check should not fire."""
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    # No ERRORS, no runtime-load WARNINGS specifically.
    runtime_load_warnings = [
        w for w in validator.WARNINGS if "runtime-load check" in w
    ]
    assert validator.ERRORS == [], validator.ERRORS
    assert runtime_load_warnings == [], runtime_load_warnings

def _good_adapter_py() -> str:
    """A fully concrete BaseAdapter subclass — overrides every
    abstract method BaseAdapter declares. Mirrors the shape of all 8
    production templates so tests of the runtime-load check exercise
    the same path the real templates do."""
    return (
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "\n"
        "class MyAdapter(BaseAdapter):\n"
        "    @staticmethod\n"
        "    def name(): return 'test-adapter'\n"
        "    @staticmethod\n"
        "    def display_name(): return 'Test'\n"
        "    @staticmethod\n"
        "    def description(): return 'fixture adapter'\n"
        "    def setup(self, config): pass\n"
        "    def create_executor(self, config): return None\n"
    )
@_skip_no_runtime
def test_valid_baseadapter_subclass_passes(validator, tmp_path, monkeypatch):
    """The happy path: adapter.py defines a fully concrete class
    inheriting from BaseAdapter. All 8 production templates match
    this shape."""
    _materialise(tmp_path, adapter_py=_good_adapter_py())
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert validator.ERRORS == [], validator.ERRORS

@_skip_no_runtime
def test_adapter_with_no_baseadapter_subclass_errors(validator, tmp_path, monkeypatch):
    """The most insidious silent-failure mode: adapter.py imports
    cleanly, defines classes, but NONE inherit from BaseAdapter. The
    runtime's class-discovery would silently skip this file and fall
    through to the default executor; the workspace would 'work' but
    with the wrong runtime. Must hard-error."""
    adapter = (
        "class JustSomePlainClass:\n"
        "    def run(self): pass\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any(
        "no concrete class inheriting from" in e and "BaseAdapter" in e
        for e in validator.ERRORS
    ), validator.ERRORS
@_skip_no_runtime
def test_abstract_intermediate_alone_does_not_count(validator, tmp_path, monkeypatch):
    """A locally-defined abstract subclass (e.g., a framework-level
    intermediate that templates extend) must not satisfy the contract
    on its own. The runtime needs a CONCRETE class to instantiate;
    accepting an abstract one would let workspace boot fail at
    instantiation time instead of validator time."""
    adapter = (
        "from abc import abstractmethod\n"
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "\n"
        "class FrameworkAdapter(BaseAdapter):\n"
        "    @abstractmethod\n"
        "    def my_abstract_method(self): ...\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any(
        "no concrete class inheriting from" in e
        for e in validator.ERRORS
    ), validator.ERRORS

@_skip_no_runtime
def test_abstract_plus_concrete_passes_with_concrete_only(validator, tmp_path, monkeypatch):
    """The legitimate factoring pattern: define an abstract framework-
    level intermediate, then a concrete leaf. Only the concrete leaf
    counts toward the "at least one" requirement; the framework
    intermediate is filtered out by `inspect.isabstract`."""
    adapter = (
        "from abc import abstractmethod\n"
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "\n"
        "class FrameworkAdapter(BaseAdapter):\n"
        "    @abstractmethod\n"
        "    def framework_specific_hook(self): ...\n"
        "\n"
        "class ConcreteAdapter(FrameworkAdapter):\n"
        "    def framework_specific_hook(self): pass\n"
        "    @staticmethod\n"
        "    def name(): return 'concrete'\n"
        "    @staticmethod\n"
        "    def display_name(): return 'Concrete'\n"
        "    @staticmethod\n"
        "    def description(): return 'leaf'\n"
        "    def setup(self, config): pass\n"
        "    def create_executor(self, config): return None\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert validator.ERRORS == [], validator.ERRORS
@_skip_no_runtime
def test_multiple_concrete_baseadapter_subclasses_errors(validator, tmp_path, monkeypatch):
    """Two concrete BaseAdapter subclasses in the same file is a
    silent ambiguity: the runtime's class-discovery picks one per
    its own resolution rules, so the WRONG class might be loaded
    after a future runtime refactor. Force the maintainer to either
    mark intermediates abstract or split into separate modules."""
    adapter = (
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "\n"
        "class FirstConcreteAdapter(BaseAdapter):\n"
        "    @staticmethod\n"
        "    def name(): return 'first'\n"
        "    @staticmethod\n"
        "    def display_name(): return 'First'\n"
        "    @staticmethod\n"
        "    def description(): return 'first'\n"
        "    def setup(self, config): pass\n"
        "    def create_executor(self, config): return None\n"
        "\n"
        "class SecondConcreteAdapter(BaseAdapter):\n"
        "    @staticmethod\n"
        "    def name(): return 'second'\n"
        "    @staticmethod\n"
        "    def display_name(): return 'Second'\n"
        "    @staticmethod\n"
        "    def description(): return 'second'\n"
        "    def setup(self, config): pass\n"
        "    def create_executor(self, config): return None\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    multi_errors = [
        e for e in validator.ERRORS
        if "multiple concrete BaseAdapter subclasses" in e
    ]
    assert len(multi_errors) == 1, validator.ERRORS
    # Both names should appear in the error so the operator knows
    # exactly which classes are competing.
    assert "FirstConcreteAdapter" in multi_errors[0]
    assert "SecondConcreteAdapter" in multi_errors[0]

@_skip_no_runtime
def test_aliased_concrete_class_is_deduplicated(validator, tmp_path, monkeypatch):
    """Production templates often do `Adapter = ConcreteAdapter` as a
    module-level alias for the runtime's class-discovery convention.
    `vars(mod)` returns BOTH bindings pointing at the same class
    object; without identity-based dedup, the multi-concrete-class
    error fires falsely (regression caught against the real langgraph
    template during the Q3 fix). Pin that aliased templates pass."""
    adapter = _good_adapter_py() + "\nAdapter = MyAdapter\n"
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert validator.ERRORS == [], validator.ERRORS
@_skip_no_runtime
def test_only_imported_baseadapter_subclass_does_not_count(validator, tmp_path, monkeypatch):
    """Re-exported imports do not satisfy the contract. If the only
    BaseAdapter subclass in adapter.py is something `from
    molecule_runtime.adapters.base import BaseAdapter` re-exports (or
    a future abstract intermediate), the runtime's class-discovery
    would correctly skip it and the validator must too. Without
    this check, an `__module__`-filter regression would mask the
    'no concrete subclass' case the gate exists to catch.
    """
    adapter = (
        # This file imports BaseAdapter but never SUBCLASSES it.
        # `BaseAdapter` itself is in vars(mod) but it's already
        # filtered by `obj is not BaseAdapter`. The new __module__
        # filter ensures no third-party class slipping in via import
        # is counted either.
        "from molecule_runtime.adapters.base import BaseAdapter  # noqa: F401\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any(
        "no concrete class inheriting from" in e
        for e in validator.ERRORS
    ), validator.ERRORS

@_skip_no_runtime
def test_adapter_with_syntax_error_errors(validator, tmp_path, monkeypatch):
    """SyntaxError at import is the same failure mode that crashes
    workspace boot. Catch it here."""
    adapter = "this is not valid python at all\n"
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any("failed to import" in e for e in validator.ERRORS), validator.ERRORS

@_skip_no_runtime
def test_adapter_with_import_error_errors(validator, tmp_path, monkeypatch):
    """ImportError during adapter.py exec — same failure mode as
    workspace boot. The error message should point the contributor at
    requirements.txt as the right fix."""
    adapter = (
        "import this_package_definitely_does_not_exist_0xdeadbeef\n"
        "from molecule_runtime.adapters.base import BaseAdapter\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any(
        "failed to import" in e and "ModuleNotFoundError" in e
        for e in validator.ERRORS
    ), validator.ERRORS
# ─────────────────────────────────────── schema-version dispatch
#
# Pin the contract that the validator routes to per-version checks
# based on `template_schema_version`, that unknown versions hard-fail,
# and that deprecated versions warn but pass.
def test_v1_is_in_known_schema_versions(validator):
    """Document the floor: v1 is always understood. Future bumps add
    versions; v1 stays accepted (or deprecated) but the validator
    never silently drops it."""
    assert 1 in validator.KNOWN_SCHEMA_VERSIONS or 1 in validator.DEPRECATED_SCHEMA_VERSIONS

def test_unknown_schema_version_errors(validator, tmp_path, monkeypatch):
    """A template declaring template_schema_version=999 must hard-fail;
    silently allowing it would let drift land disguised as a
    'future' version."""
    cfg = (
        "name: t\n"
        "runtime: claude-code\n"
        "template_schema_version: 999\n"
    )
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    assert any("template_schema_version=999 is unknown" in e
               for e in validator.ERRORS), validator.ERRORS
def test_deprecated_schema_version_warns_but_passes(validator, tmp_path, monkeypatch):
    """During a deprecation window, v<N-1> templates still validate
    (so the consumer can keep merging unrelated PRs while migrating)
    but the warning surfaces the migration command."""
    # Inject a fake deprecated version for the duration of this test —
    # we don't have a real deprecated version yet (only v1 exists).
    validator.KNOWN_SCHEMA_VERSIONS.add(2)
    validator.DEPRECATED_SCHEMA_VERSIONS.add(1)
    validator.SCHEMA_CHECKS[2] = lambda config: None  # accept-all stub for v2
    try:
        cfg = (
            "name: t\n"
            "runtime: claude-code\n"
            "template_schema_version: 1\n"
        )
        _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                     requirements=_good_requirements_txt())
        monkeypatch.chdir(tmp_path)
        validator.check_config_yaml()
        # No errors — deprecation is warning-only.
        assert validator.ERRORS == [], validator.ERRORS
        assert any(
            "template_schema_version=1 is deprecated" in w
            and "migrate-template.py" in w
            for w in validator.WARNINGS
        ), validator.WARNINGS
    finally:
        validator.KNOWN_SCHEMA_VERSIONS.discard(2)
        validator.DEPRECATED_SCHEMA_VERSIONS.discard(1)
        validator.SCHEMA_CHECKS.pop(2, None)

def test_per_version_dispatch_calls_correct_check(validator, tmp_path, monkeypatch):
    """Pin that SCHEMA_CHECKS[N] is the function called when a template
    declares template_schema_version=N. Without this, the dispatch could
    fire the wrong contract on a multi-version codebase."""
    called: list[int] = []
    validator.KNOWN_SCHEMA_VERSIONS.add(7)
    validator.SCHEMA_CHECKS[7] = lambda config: called.append(7)
    try:
        cfg = (
            "name: t\n"
            "runtime: claude-code\n"
            "template_schema_version: 7\n"
        )
        _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                     requirements=_good_requirements_txt())
        monkeypatch.chdir(tmp_path)
        validator.check_config_yaml()
        assert called == [7], f"v7 dispatch was not invoked; called={called}"
    finally:
        validator.KNOWN_SCHEMA_VERSIONS.discard(7)
        validator.SCHEMA_CHECKS.pop(7, None)
def test_runtime_not_installed_warns_not_errors(validator, tmp_path, monkeypatch):
    """If the validator runs in an env without molecule-ai-workspace-runtime,
    we WARN (loud) but don't error — hard-erroring would say 'your adapter
    is broken' when the actual issue is the CI infra. Mock the import to
    simulate this regardless of what's installed locally."""
    adapter = (
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "class A(BaseAdapter): pass\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    # Force the runtime import to fail by hiding the module.
    import sys
    saved = {k: sys.modules.pop(k) for k in list(sys.modules)
             if k.startswith("molecule_runtime")}
    saved_meta = sys.meta_path[:]

    class _Block:
        def find_spec(self, name, path=None, target=None):
            if name == "molecule_runtime" or name.startswith("molecule_runtime."):
                raise ImportError(f"blocked for test: {name}")
            return None

    sys.meta_path.insert(0, _Block())
    try:
        validator.check_adapter_runtime_load()
    finally:
        sys.meta_path[:] = saved_meta
        sys.modules.update(saved)
    assert validator.ERRORS == [], validator.ERRORS
    assert any(
        "skipping runtime-load check" in w
        for w in validator.WARNINGS
    ), validator.WARNINGS


@@ -1,47 +1,440 @@
#!/usr/bin/env python3
"""Prototype of the beefed-up validate-workspace-template.py.

Run from a template repo's root. Surfaces hard structural drift in
Dockerfile + config.yaml + requirements.txt against the canonical
contract. Replaces the existing soft-warnings-only validator at
molecule-ci/scripts/validate-workspace-template.py.
"""
import os, re, sys

import yaml

ERRORS: list[str] = []
WARNINGS: list[str] = []

def err(msg: str) -> None:
    ERRORS.append(msg)

def warn(msg: str) -> None:
    WARNINGS.append(msg)

# ───────────────────────────────────────────────────────────── Dockerfile
def check_dockerfile() -> None:
    if not os.path.isfile("Dockerfile"):
        warn("no Dockerfile — skipping container drift checks (library-only template?)")
        return
    df = open("Dockerfile").read()

    if not re.search(r"^FROM python:3\.11-slim\b", df, re.MULTILINE):
        err("Dockerfile: must base on `FROM python:3.11-slim` — see contract doc")
if not re.search(r"^ARG RUNTIME_VERSION", df, re.MULTILINE):
err(
"Dockerfile: missing `ARG RUNTIME_VERSION=`. "
"This arg invalidates the pip-install cache when the cascade "
"publishes a new wheel; without it, the cascade silently ships "
"the previous runtime (cache trap observed 2026-04-27, 5x in a row)."
)
if "molecule-ai-workspace-runtime" not in df and not (
os.path.isfile("requirements.txt")
and "molecule-ai-workspace-runtime" in open("requirements.txt").read()
):
err("Dockerfile + requirements.txt: must install `molecule-ai-workspace-runtime`")
if "${RUNTIME_VERSION}" not in df and "$RUNTIME_VERSION" not in df:
err(
"Dockerfile: must reference `${RUNTIME_VERSION}` in a pip install RUN block. "
'Pattern: `if [ -n "${RUNTIME_VERSION}" ]; then '
'pip install --no-cache-dir --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; fi`'
)
if not re.search(r"useradd[^\n]*\bagent\b", df):
err(
"Dockerfile: must create the `agent` user "
"(`RUN useradd -u 1000 -m -s /bin/bash agent`). "
"Runtime drops to uid 1000; without it, claude-code refuses "
"`--dangerously-skip-permissions` for safety."
)
has_direct_entrypoint = bool(
re.search(r'(ENTRYPOINT|CMD)\s*\[?\s*"?molecule-runtime"?', df)
)
has_custom_entrypoint = bool(
re.search(r'ENTRYPOINT\s*\[?\s*"?(/?[\w./-]*entrypoint\.sh|/?[\w./-]*start\.sh)', df)
)
if not has_direct_entrypoint and not has_custom_entrypoint:
err(
"Dockerfile: must end at `molecule-runtime` "
"(`ENTRYPOINT [\"molecule-runtime\"]` or via custom "
"entrypoint.sh / start.sh that exec's molecule-runtime)"
)
if has_custom_entrypoint:
m = re.search(r'ENTRYPOINT\s*\[?\s*"?(/?[\w./-]+)', df)
if m:
ep_in_image = m.group(1).lstrip("/")
ep_local = os.path.basename(ep_in_image)
if os.path.isfile(ep_local):
if "molecule-runtime" not in open(ep_local).read():
err(
f"Dockerfile uses ENTRYPOINT [{ep_in_image}] but "
f"{ep_local} does not exec `molecule-runtime`"
)
else:
warn(
f"Dockerfile points ENTRYPOINT at {ep_in_image} but "
f"{ep_local} not found in repo root — verify it's COPYed in"
)
# ───────────────────────────────────────────────────────────── config.yaml
KNOWN_RUNTIMES = {
    "langgraph",
    "claude-code",
    "crewai",
    "autogen",
    "deepagents",
    "hermes",
    "gemini-cli",
    "openclaw",
}
# ──────────────────────────────────────────── schema versioning
#
# `template_schema_version: int` in each template's config.yaml selects
# which contract this validator enforces. Versions are FROZEN once
# shipped — never edit a SCHEMA_V* constant in place. To bump:
#
# 1. Add `SCHEMA_V<N+1>_REQUIRED_KEYS` / `SCHEMA_V<N+1>_OPTIONAL_KEYS`
# describing the new contract.
# 2. Add `_check_schema_v<N+1>(config)` that enforces it.
# 3. Add the entry to SCHEMA_CHECKS below.
# 4. Move version N from KNOWN_SCHEMA_VERSIONS to
# DEPRECATED_SCHEMA_VERSIONS so existing v<N> templates warn but
# still pass — buys a deprecation window.
# 5. Ship a corresponding migration in scripts/migrate-template.py's
# MIGRATIONS table (key = N, value = callable that produces the
# v<N+1> dict from a v<N> dict).
# 6. Run migrate-template.py on each consumer template repo as a PR.
# 7. After all consumers migrate, drop version N from
# DEPRECATED_SCHEMA_VERSIONS in a follow-up PR.
#
# This discipline means a schema version always has exactly one valid
# enforcement function, never "branch on minor variants" — the whole
# point of versioning is to avoid that drift.
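#
# Concretely, a hypothetical v1 → v2 bump (illustrative only; no v2 is
# planned here) would add to this file:
#
#     SCHEMA_V2_REQUIRED_KEYS = [...]
#     def _check_schema_v2(config: dict) -> None: ...
#     SCHEMA_CHECKS = {1: _check_schema_v1, 2: _check_schema_v2}
#
# and to scripts/migrate-template.py (helper name illustrative; the
# table contract is as described in step 5):
#
#     def _v1_to_v2(cfg: dict) -> dict:
#         out = dict(cfg)
#         out["template_schema_version"] = 2
#         return out
#
#     MIGRATIONS = {1: _v1_to_v2}  # key N: callable producing the v<N+1> dict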
KNOWN_SCHEMA_VERSIONS: set[int] = {1}
DEPRECATED_SCHEMA_VERSIONS: set[int] = set()
# `template_schema_version` is part of the v1 contract and listed
# here for documentation, but the top-level `check_config_yaml`
# already verifies it's present and is an int before dispatching
# here — `_check_schema_v1` does NOT re-check it (would be dead
# defensive code). The key DOES need to appear in the union of
# required + optional so it isn't flagged as unknown drift in the
# `unknown top-level keys` warning at the end of `_check_schema_v1`.
SCHEMA_V1_REQUIRED_KEYS = ["name", "runtime", "template_schema_version"]
SCHEMA_V1_OPTIONAL_KEYS = [
    "description",
    "version",
    "tier",
    "model",
    "models",
    "runtime_config",
    "env",
    "skills",
    "tools",
    "a2a",
    "delegation",
    "prompt_files",
    "bridge",
    "governance",
]


def _check_schema_v1(config: dict) -> None:
    """v1 contract — the keys frozen as of monorepo task #90's Phase 2.

    Currently every production template runs this version. Do NOT edit
    in place; add v2 instead and migrate consumers (see header)."""
    for key in SCHEMA_V1_REQUIRED_KEYS:
        if key == "template_schema_version":
            # Already verified present + int by the dispatcher; skip
            # to avoid emitting a duplicate or contradictory error.
            continue
        if key not in config:
            err(f"config.yaml: missing required key `{key}`")
    runtime = config.get("runtime")
    if runtime and runtime not in KNOWN_RUNTIMES:
        warn(
            f"config.yaml: runtime `{runtime}` not in known set "
            f"{sorted(KNOWN_RUNTIMES)} — OK for custom runtimes; "
            f"if canonical, add it to KNOWN_RUNTIMES in validate-workspace-template.py"
        )
    unknown = set(config.keys()) - set(SCHEMA_V1_REQUIRED_KEYS) - set(SCHEMA_V1_OPTIONAL_KEYS)
    if unknown:
        warn(
            f"config.yaml: unknown top-level keys {sorted(unknown)} "
            f"may be drift. If intentional, add them to SCHEMA_V1_OPTIONAL_KEYS."
        )

SCHEMA_CHECKS = {
    1: _check_schema_v1,
}


def check_config_yaml() -> dict | None:
    """Validate config.yaml; returns the parsed mapping on a clean run
    (so main() can print the summary line), else None."""
    if not os.path.isfile("config.yaml"):
        err("config.yaml: missing at repo root")
        return None
    with open("config.yaml") as f:
        try:
            config = yaml.safe_load(f)
        except yaml.YAMLError as e:
            err(f"config.yaml: invalid YAML — {e}")
            return None
    if not isinstance(config, dict):
        err(f"config.yaml: root must be a mapping, got {type(config).__name__}")
        return None
    # Schema-version dispatch. Validate the version field shape first
    # so error messages are actionable.
    sv = config.get("template_schema_version")
    if sv is None:
        err("config.yaml: missing required key `template_schema_version`")
        # Can't dispatch without a version. Don't fall through to v1
        # checks — that would mask the missing-version error.
        return None
    if not isinstance(sv, int):
        err(
            f"config.yaml: template_schema_version must be int, "
            f"got {type(sv).__name__}={sv!r}"
        )
        return None
    if sv in DEPRECATED_SCHEMA_VERSIONS:
        latest = max(KNOWN_SCHEMA_VERSIONS)
        warn(
            f"config.yaml: template_schema_version={sv} is deprecated; "
            f"migrate to v{latest} via "
            f"`python3 scripts/migrate-template.py --to {latest} .`. "
            f"Support for v{sv} will be removed in a future cycle."
        )
    elif sv not in KNOWN_SCHEMA_VERSIONS:
        valid = sorted(KNOWN_SCHEMA_VERSIONS | DEPRECATED_SCHEMA_VERSIONS)
        err(
            f"config.yaml: template_schema_version={sv} is unknown — "
            f"this validator understands {valid}. Either bump the "
            f"validator (add a SCHEMA_V{sv} block) or correct the version."
        )
        return None
    SCHEMA_CHECKS[sv](config)
    return config
# ───────────────────────────────────────────────────────────── requirements.txt
def check_requirements() -> None:
    if not os.path.isfile("requirements.txt"):
        warn("no requirements.txt — Dockerfile must install runtime by other means")
        return
    reqs = open("requirements.txt").read()
    if "molecule-ai-workspace-runtime" not in reqs:
        err("requirements.txt: must declare `molecule-ai-workspace-runtime` as a dependency")
# ───────────────────────────────────────────────────────────── adapter.py
def check_adapter() -> None:
    """Static-text adapter checks. Fast — no imports."""
    if not os.path.isfile("adapter.py"):
        warn("no adapter.py — runtime will use the default langgraph executor from the wheel")
        return
    content = open("adapter.py").read()
    # The original validator's warning ("don't import molecule_runtime") was
    # backwards — that's the canonical package name. The previous check shipped
    # for ~2 weeks producing false-positive warnings. Removed.
    if re.search(r"\bfrom molecule_ai\b|\bimport molecule_ai\b", content):
        warn(
            "adapter.py imports `molecule_ai` — that's a pre-#87 package name; "
            "use `molecule_runtime`"
        )


def check_adapter_runtime_load() -> None:
    """Strong adapter contract: import adapter.py the same way the runtime
    does at workspace boot, and assert at least one class in it inherits
    from molecule_runtime.adapters.base.BaseAdapter.

    The Docker build smoke test in validate-workspace-template.yml builds
    the image but doesn't RUN it — adapter.py is only imported at
    container startup. So a template with a syntactically-valid Dockerfile
    + a broken adapter.py (wrong base class, ImportError on a missing
    framework dep, typo) builds clean and fails on first user prompt.
    This check exercises the same class-resolution path the runtime uses,
    so a passing validator means a passing workspace boot for the
    adapter-load step.

    Skip conditions:
      - No adapter.py exists. Templates without one inherit the default
        langgraph executor from the wheel (intentional, not drift).
      - molecule-ai-workspace-runtime not importable in the validator
        environment. That's a CI-config bug — the workflow that runs
        this validator must `pip install molecule-ai-workspace-runtime`
        first. Warn loudly so the misconfiguration surfaces, but don't
        hard-fail (we'd be saying "your adapter is broken" when the
        actual cause is missing infra). The `pip install -r
        requirements.txt` step in validate-workspace-template.yml
        normally satisfies this transitively.

    Hard-error conditions:
      - adapter.py raises any exception during import. The same
        exception would crash workspace boot.
      - No class in the module inherits from BaseAdapter. The runtime's
        adapter-discovery would silently fall through to the default
        executor, ignoring this file. That is exactly the kind of
        human-error mode this contract is supposed to eliminate.
    """
if not os.path.isfile("adapter.py"):
return # check_adapter() already warned; don't double-warn
try:
from molecule_runtime.adapters.base import BaseAdapter # noqa: PLC0415
except ImportError:
warn(
"adapter.py: skipping runtime-load check — "
"`molecule-ai-workspace-runtime` not installed in the validator "
"environment. The CI workflow that invokes this script must "
"`pip install molecule-ai-workspace-runtime` (or `pip install "
"-r requirements.txt`) first; otherwise this critical check is "
"silently bypassed."
)
return
# Load adapter.py as a module under a per-call-unique name so it
# doesn't collide with any installed `adapter` package OR with a
# previous invocation in the same Python process. The id() of the
# cwd-anchored absolute path is sufficient — we just need
# different invocations to land on different sys.modules keys so
# one invocation's lingering references can't bleed into the
# next's adapter discovery.
import importlib.util # noqa: PLC0415
import sys # noqa: PLC0415
abs_path = os.path.abspath("adapter.py")
module_name = f"_template_adapter_under_validation_{abs(hash(abs_path)):x}"
spec = importlib.util.spec_from_file_location(module_name, "adapter.py")
if spec is None or spec.loader is None:
err("adapter.py: cannot construct an import spec — file may be unreadable")
return
mod = importlib.util.module_from_spec(spec)
sys.modules[module_name] = mod # required so dataclass / pydantic refs resolve
try:
spec.loader.exec_module(mod)
except Exception as e:
err(
f"adapter.py: failed to import — `{type(e).__name__}: {e}`. "
f"This is the same failure mode that crashes workspace boot at "
f"runtime; the cure is to fix the adapter, not skip this check. "
f"If the import fails because a transitive dep isn't installed in "
f"this CI env, add it to the template's requirements.txt — that's "
f"what the workspace container does, and the validator job "
f"installs requirements.txt before running this check."
)
sys.modules.pop(module_name, None)
return
    # Class discovery: only count CONCRETE classes DEFINED in
    # adapter.py, not re-exported imports and not abstract
    # intermediates. Three filter axes:
    #
    #   1. `__module__ == module_name` — defined HERE, not imported
    #      from molecule_runtime or a third-party framework.
    #   2. `obj is not BaseAdapter` — BaseAdapter itself doesn't count.
    #   3. `not inspect.isabstract(obj)` — abstract intermediates
    #      defined locally don't count. Catches the
    #      `class Framework(BaseAdapter): pass` + `class Concrete(Framework):`
    #      pattern where vars(mod) has BOTH and we'd otherwise count
    #      both as "real" adapters.
    import inspect  # noqa: PLC0415

    # Deduplicate by class identity. Many production adapters do
    # `Adapter = ConcreteAdapter` as a module-level alias for the
    # runtime's discovery — `vars(mod)` returns both bindings
    # (`Adapter` AND `ConcreteAdapter`) pointing at the same class
    # object. Without dedup, the multiple-concrete-subclasses
    # error fires falsely on every aliased template.
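    # The aliasing shape this dedup tolerates, for reference (names are
    # illustrative):
    #
    #     class LangGraphAdapter(BaseAdapter): ...
    #     Adapter = LangGraphAdapter  # second binding, same class object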
    adapter_classes = list({
        id(obj): obj
        for obj in vars(mod).values()
        if isinstance(obj, type)
        and obj is not BaseAdapter
        and issubclass(obj, BaseAdapter)
        and getattr(obj, "__module__", None) == module_name
        and not inspect.isabstract(obj)
    }.values())
    sys.modules.pop(module_name, None)
    if not adapter_classes:
        err(
            "adapter.py: no concrete class inheriting from "
            "`molecule_runtime.adapters.base.BaseAdapter` defined "
            "in this file. The runtime resolves the adapter via "
            "class discovery on adapter.py's own definitions — "
            "imports of base classes from molecule_runtime do not "
            "count, and abstract intermediates do not count. "
            "Without a concrete subclass DEFINED here, workspace "
            "boot falls through to the default langgraph executor "
            "and ignores this file silently. If that's intentional, "
            "delete adapter.py."
        )
        return
    if len(adapter_classes) > 1:
        names = sorted(c.__name__ for c in adapter_classes)
        err(
            f"adapter.py: multiple concrete BaseAdapter subclasses "
            f"defined: {names}. The runtime's class-discovery picks "
            f"one per its own resolution rules (typically last-defined "
            f"or first-by-iteration), so shipping more than one is a "
            f"silent ambiguity — the wrong class might be loaded after "
            f"a future runtime refactor. Either keep exactly one "
            f"concrete subclass + mark the others abstract via "
            f"`abc.ABC` / abstract methods, or move them to separate "
            f"importable modules."
        )
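    # The abstract-intermediate fix that error suggests, sketched with
    # illustrative names (assumes BaseAdapter's metaclass is compatible
    # with abc.ABCMeta). inspect.isabstract() is True for Framework while
    # its abstract method is unimplemented, so only Concrete is counted:
    #
    #     import abc
    #
    #     class Framework(BaseAdapter, abc.ABC):
    #         @abc.abstractmethod
    #         def _build(self): ...
    #
    #     class Concrete(Framework):
    #         def _build(self): ...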


def main() -> None:
    # --static-only skips check_adapter_runtime_load(), which calls
    # importlib's exec_module() on the template's adapter.py. That's
    # untrusted code execution — fine on internal PRs and post-merge,
    # unsafe on external fork PRs (#135). Static checks (file presence,
    # YAML parse, regex/AST inspection) stay enabled in static mode.
    static_only = "--static-only" in sys.argv
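    # Wiring sketch for the fork-safe job (the step shape is an
    # assumption, not a quote of the real workflow yml):
    #
    #     - name: Validate template (static checks only)
    #       run: python3 validate-workspace-template.py --static-only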
    check_dockerfile()
    config = check_config_yaml()
    check_requirements()
    check_adapter()
    if not static_only:
        check_adapter_runtime_load()
    else:
        print("::notice::skipping adapter.py import check (--static-only mode)")
    for w in WARNINGS:
        print(f"::warning::{w}")
    for e in ERRORS:
        print(f"::error::{e}")
    if ERRORS:
        # Exit only after flushing every error; exiting inside the print
        # loop would surface just the first failure per run.
        sys.exit(1)
    suffix = " [static-only]" if static_only else ""
    print(f"✓ Template validation passed ({len(WARNINGS)} warning(s)){suffix}")
    if config:
        print(f"✓ config.yaml valid: {config['name']} (runtime: {config.get('runtime')})")


if __name__ == "__main__":
    main()