Compare commits

...

47 Commits

Author SHA1 Message Date
c2f5d68830 Merge pull request 'feat(actions): add audit-force-merge composite action' (#5) from feat/audit-force-merge-composite-action into main 2026-05-09 03:30:02 +00:00
120b71c564 feat(actions): add audit-force-merge composite action
§SOP-6 force-merge detector, hosted as a Gitea Actions composite
action so it can be vendored into every org repo via a single
`uses:` line instead of copy-pasting the bash. Source of truth
for the audit script logic.

Why composite vs reusable workflow: Gitea 1.22.6 doesn't support
cross-repo `uses: org/repo/.gitea/workflows/X.yml@ref`. Cross-repo
reusable workflows landed in go-gitea/gitea#32562 (1.26.0, Oct 2025)
and have not been backported. Composite actions resolve via the
actions-fetch path which works cross-repo against a public callee.
Re-evaluate when operator host runs Gitea ≥ 1.26.

Consumer workflow shape:

    on:
      pull_request_target:
        types: [closed]
    jobs:
      audit:
        if: github.event.pull_request.merged == true
        runs-on: ubuntu-latest
        steps:
          - uses: molecule-ai/molecule-ci/.gitea/actions/audit-force-merge@main
            with:
              gitea-token: ${{ secrets.SOP_TIER_CHECK_TOKEN }}
              repo: ${{ github.repository }}
              pr-number: ${{ github.event.pull_request.number }}
              required-checks: |
                sop-tier-check / tier-check (pull_request)

No actions/checkout step needed in the consumer — the audit script
does pure API calls, never reads working tree. Removing checkout is
also a small security win (PR head code never loaded).

Verified end-to-end on internal#123 + molecule-core#150 with the
inline copies (which this PR will replace via consumer-side stub
PRs once merged). Tier: low.
2026-05-08 20:29:40 -07:00
9f76a0faab Merge pull request 'fix(validate): recognize !external + !include as opaque refs (skip, not error)' (#4) from fix/validator-external-include-tags into main 2026-05-08 15:52:57 +00:00
dev-lead
d47c15d526 fix(validate): recognize !external + !include as opaque refs (skip, not error)
molecule-ai-org-template-molecule-dev's CI has been red since the
"pin: dev-department v1.0.0" merge. Symptom:

  ::error::Workspace at <unnamed>: missing 'name'
  ::error::Workspace at <unnamed>: missing 'name'

Root cause: org.yaml uses `!external` for the dev-department subtree
fetch (introduced internal#77 / molecule-core#105). The PermissiveLoader
formerly handed every unknown tag to a single multi-constructor that
flattens the parsed value to a plain dict. The validator's
validate_workspace() then saw a dict with no `name` key and tripped
the "missing name" error — but the dict was a `!external` directive,
not a malformed workspace.

The fix wraps both supported tags in distinct sentinel types:

  - !include  → IncludeRef (str subclass)
  - !external → ExternalRef (dict subclass)

validate_workspace() and count_ws() now skip these instead of treating
them as workspace shape. Real workspace dicts (with names) still get
the full structural check. Unknown tags fall through to the
multi-constructor exactly as before, preserving back-compat.
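
A minimal sketch of the sentinel wiring, assuming PyYAML; the class
and loader names follow the commit message, the constructor plumbing
is illustrative:

    import yaml

    class IncludeRef(str):
        """Opaque !include reference; skipped by workspace shape checks."""

    class ExternalRef(dict):
        """Opaque !external directive; skipped by workspace shape checks."""

    class PermissiveLoader(yaml.SafeLoader):
        pass

    PermissiveLoader.add_constructor(
        '!include', lambda loader, node: IncludeRef(loader.construct_scalar(node)))
    PermissiveLoader.add_constructor(
        '!external', lambda loader, node: ExternalRef(
            loader.construct_mapping(node, deep=True)))

    # Unknown tags still flatten to plain values via the multi-constructor,
    # preserving back-compat for tags the validator has never seen.
    def _flatten_unknown(loader, tag_suffix, node):
        if isinstance(node, yaml.MappingNode):
            return loader.construct_mapping(node, deep=True)
        if isinstance(node, yaml.SequenceNode):
            return loader.construct_sequence(node, deep=True)
        return loader.construct_scalar(node)

    PermissiveLoader.add_multi_constructor('', _flatten_unknown)

    def validate_workspace(ws, path='<root>'):
        if isinstance(ws, (IncludeRef, ExternalRef)):
            return []   # opaque ref: skip instead of "missing 'name'"
        return [] if 'name' in ws else [f"Workspace at {path}: missing 'name'"]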

Verified on the live failing org.yaml:
  ✓ org.yaml valid: Molecule AI Dev Team (0 direct workspaces;
    external refs not counted)

And on a synthetic case with one real bug (missing-name workspace
nested under children):
  ::error::Workspace at <unnamed>: missing 'name'
  ::error::Workspace at <unnamed>/<unnamed>: missing 'name'
  exit 1

So the validator still catches real shape bugs; it just doesn't
false-positive on the new !external pattern.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-08 08:52:32 -07:00
785251f9ab Merge pull request 'fix(ci): replace cross-repo actions/checkout with direct git clone' (#3) from fix/git-clone-instead-of-actions-checkout into main 2026-05-07 08:40:43 +00:00
security-auditor
3eb62072a2 fix(ci): replace cross-repo actions/checkout with direct git clone
molecule-ci#2 attempted token: '' to force anonymous on the cross-repo
checkout. CI on plugin-molecule-careful-bash@663bf72 (post-merge of #2)
revealed actions/checkout@v4 errors with:

  ::error::Input required and not supplied: token

Even though token's input definition is required:false with a default,
the action's runtime auth-helper calls getInput('token', {required: true})
internally — empty string fails that check.

Fix: replace the cross-repo actions/checkout with a direct git clone
shell step. molecule-ci is public; anonymous git clone has neither the
auth-trips-Gitea-404 problem (#2's target) nor the empty-token-input-
required problem (#2's actual failure shape).

3 files updated, 4 sites total:
  * validate-plugin.yml (1 site)
  * validate-workspace-template.yml (2 sites)
  * validate-org-template.yml (1 site)

Refs: internal#46. Closes the third root cause uncovered by the
verification cycle on plugin-molecule-careful-bash.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:37:34 -07:00
d2bb7cf255 Merge pull request 'fix(ci): force anon checkout of public molecule-ci to bypass Gitea cross-repo 404' (#2) from fix/anon-cross-repo-checkout into main 2026-05-07 08:34:55 +00:00
security-auditor
7e2bde9b77 fix(ci): force anon checkout of public molecule-ci to bypass Gitea cross-repo 404
After lowercasing the slug (molecule-ci#1) and flipping molecule-ci public,
plugin/template/org-template CI still failed at the SECOND actions/checkout
step (the one that fetches molecule-ci itself for canonical validator scripts).

Failure mode in act_runner log:
  Run actions/checkout@v4
    repository: molecule-ai/molecule-ci
    path: .molecule-ci-canonical
  Syncing repository: molecule-ai/molecule-ci
  [git config http.https://git.moleculesai.app/.extraheader AUTHORIZATION: basic ***]
  ::error::The target couldn't be found.
   Failure - Main actions/checkout@v4

Root cause: actions/checkout@v4 sends `Authorization: basic <github.token>` —
the per-job Gitea-issued token, scoped to the calling plugin/template repo
only. On Gitea, an authenticated request that lacks repo-permission 404s
instead of falling back to anonymous-public-read (a Gitea-vs-GitHub
behaviour difference). Anonymous git clone of molecule-ci succeeds; the auth
header is what trips the 404.

Fix: pass `token: ''` to force anonymous fetch on the cross-repo checkouts.
molecule-ci is public; no auth is needed for read.

3 sites updated:
  * validate-plugin.yml (1 site)
  * validate-workspace-template.yml (2 sites — both jobs in the file)
  * validate-org-template.yml (1 site)

Verification plan: re-trigger plugin-molecule-careful-bash#2 after this
lands; it should go GREEN end-to-end. The 33 downstream lowercase-slug
PRs are NOT mass-merged until that verification passes.

Refs: internal#46

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 01:23:37 -07:00
226975d377 Merge pull request 'fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs' (#1) from fix/lowercase-org-slug into main 2026-05-07 08:07:02 +00:00
security-auditor
2bcd52b444 fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs
Gitea is case-sensitive on owner slugs; canonical is lowercase
`molecule-ai/...`. Mixed-case `Molecule-AI/...` refs fail-at-0s
when the runner tries to resolve the cross-repo workflow / checkout.

Same fix as molecule-controlplane#12. Mechanical case-correction;
no behavior change beyond making CI resolve again.

Refs: internal#46

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 00:58:55 -07:00
Hongming Wang
b31b722899
Merge pull request #33 from Molecule-AI/feat/turn-smoke-publish-image-gate
ci(publish): bump boot-smoke timeout to 90s/120s for SDK-init-wedge coverage
2026-05-01 17:52:28 -07:00
Hongming Wang
50e84f89e9 ci(publish): bump boot-smoke timeout to 90s/120s for SDK-init-wedge coverage
Pairs with molecule-core PR #2473 (run_executor_smoke now consults
runtime_wedge.is_wedged() at the end of every result path).

10s smoke timeout was shorter than claude-agent-sdk's 60s
initialize() handshake — when a malformed CLI argv made the SDK
spin on init (PR #25 in claude-code template), the outer wait_for
fired first, run_executor_smoke saw "execution proceeding past
imports → timeout → PASS" and shipped the broken image to GHCR.

Bumping to 90s lets the SDK time itself out, the executor's wedge
catch arm runs, and runtime_wedge.mark_wedged() flips the flag
that smoke_mode now reads. Outer `timeout` bumped to 120s — the
runner-level safety net stays slightly longer than the inner cap
so a smoke_mode regression that doesn't terminate surfaces as exit
124 with a clear error, not just exit 1.

Step comment names this calibration explicitly so a future
contributor doesn't shrink it back without injecting a wedge in
the smoke_mode unit tests first. Error message references
runtime_wedge so a failure-mode reader knows where to look.
2026-05-01 17:48:51 -07:00
Hongming Wang
a79ef8e9fa
Merge pull request #32 from Molecule-AI/feat/template-validation-aggregator
ci: add Template validation aggregator (restore historical check name)
2026-04-30 23:01:43 -07:00
Hongming Wang
375bcc4376 ci(validate-workspace-template): add Template validation aggregator
The workflow was refactored from one `validate` job (display name
"Template validation") into matrix-named validate-static +
validate-runtime jobs ("(static)" / "(runtime)" suffixes) for
fork-PR security. The new check names — `validate / Template
validation (static)` and `validate / Template validation
(runtime)` — never match the original `validate / Template
validation` that template-repo branch protection requires. Result:
auto-merge silently hangs in BLOCKED forever on every template
repo because the required check never reports.

Add a third aggregator job `template-validation` (display name
"Template validation") that depends on both real jobs and emits
the original check name. `if: always()` so it reports out even
when validate-static fails — without that GitHub marks the
aggregator SKIPPED and branch protection still blocks because the
required check never reaches a final state.

Treats `skipped` as pass for validate-runtime so fork PRs (where
runtime is intentionally skipped on the security gate) don't
become un-mergeable.
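
A minimal sketch of the aggregation predicate the step implements,
assuming the result strings (`success`, `skipped`) that the Actions
`needs.*.result` context exposes:

    def aggregate(static_result: str, runtime_result: str) -> bool:
        # validate-static must genuinely pass; validate-runtime may be
        # 'skipped' (fork PRs skip runtime on the security gate).
        return (static_result == 'success'
                and runtime_result in ('success', 'skipped'))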

Caught while shipping the boot-smoke fixes for openclaw#11 and
hermes#29 — both PRs sat BLOCKED with all real checks green.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 23:01:01 -07:00
Hongming Wang
2bbc6e0e80
Merge pull request #31 from Molecule-AI/fix/publish-template-smoke-cleanup
fix(publish-template-image): tolerate host-side uid 1000 ownership in smoke cleanup
2026-04-30 21:56:53 -07:00
Hongming Wang
da6407e58a fix(publish-template-image): make smoke-cleanup tolerate host-side uid 1000 ownership
Third hot-fix for #2275 Phase 2 — claude-code re-run #3 showed the
boot smoke ITSELF passing (`[smoke-mode] PASS: timed out past import-
tree (imports healthy)`), but the workflow step still exited 1 because
the post-smoke cleanup `rm -rf "${SMOKE_CONFIG_DIR}"` failed with
`Permission denied`.

Root cause: the image entrypoint (entrypoint.sh) does
`chown -R agent:agent /configs` before exec'ing molecule-runtime as
uid 1000. Because /configs is a bind-mount of the host's mktemp dir,
the chown propagates to the host — the runner user (the GHA `runner`
account, NOT root) can no longer delete the files inside it. With
`set -e` in effect, that rm exit propagates and we report failure
even though the gate itself passed.

Fix: best-effort rm with sudo fallback and final `|| true`. The
runner is ephemeral; /tmp gets cleaned automatically at job teardown.

Verified against run 25202859503 which showed every other step green
+ the smoke itself passing — only this rm was the blocker.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:56:36 -07:00
Hongming Wang
86092315a7
Merge pull request #30 from Molecule-AI/fix/publish-template-smoke-pythonpath
fix(publish-template-image): inject PYTHONPATH=/app for boot smoke
2026-04-30 21:54:18 -07:00
Hongming Wang
a9df950801 fix(publish-template-image): inject PYTHONPATH=/app to match production provisioner
Second hot-fix for #2275 Phase 2 — boot smoke kept failing with
`ModuleNotFoundError: No module named 'adapter'` even after the
permissions fix landed.

Root cause: the production platform's provisioner sets PYTHONPATH=/app
on every workspace container (provisioner.go:563) so molecule-runtime —
a pip console_scripts entry point whose sys.path[0] is /usr/local/bin,
NOT /app — can resolve `importlib.import_module('adapter')`. The
existing static import smoke didn't hit this because `python3 -c "import
$mod"` adds cwd to sys.path; only the entry-point invocation needs
PYTHONPATH.

Mirrors prod by passing `-e PYTHONPATH=/app` in the docker run.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:54:02 -07:00
Hongming Wang
b4e17014fa
Merge pull request #29 from Molecule-AI/fix/publish-template-smoke-perms
fix(publish-template-image): chmod a+rX + drop :ro so agent can read /configs
2026-04-30 21:49:46 -07:00
Hongming Wang
a5212a349b fix(publish-template-image): chmod a+rX + drop :ro so agent can read /configs
Hot-fix for #2275 Phase 2 — the boot smoke step in v1@3c8f8fe failed
on every template publish with `PermissionError: [Errno 13] Permission
denied: '/configs/config.yaml'` because `mktemp -d` creates the dir
with mode 700 and `chmod -R go+r` adds 'r' to files but doesn't add
'x' to directories. Inside the image the entrypoint drops priv to
uid 1000 (agent), which then cannot traverse /configs to even reach
config.yaml — main.py exits before any executor code runs.

Two changes:
1. `chmod -R a+rX` (capital X) adds 'x' to directories AND already-
   executable files, so the temp dir becomes traversable for agent
   while config.yaml stays a regular world-readable file.
2. Drop `:ro` on the mount so the entrypoint's `chown -R agent
   /configs` succeeds. The container is ephemeral; modifications to
   the host mktemp dir don't matter and the dir gets nuked right
   after the smoke run.

Reproduced + diagnosed against claude-code publish run 25202651546
which failed within a few seconds on Path('/configs/config.yaml').exists()
in molecule_runtime/config.py:298.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:49:26 -07:00
Hongming Wang
3c8f8fe48b
Merge pull request #28 from Molecule-AI/feat/publish-template-image-boot-smoke
feat(publish-template-image): add execute()-against-stub-deps boot smoke (#2275)
2026-04-30 21:44:30 -07:00
Hongming Wang
434d1782e6 feat(publish-template-image): add execute()-against-stub-deps boot smoke (#2275)
Adds a step between the existing import smoke and the GHCR push that
boots the just-built image with MOLECULE_SMOKE_MODE=1, which routes
molecule-runtime through the new smoke_mode.run_executor_smoke() —
invokes executor.execute(stub_ctx, stub_queue) once with a 10s timeout.

Healthy import tree → execution proceeds far enough to hit a network
boundary and times out (exit 0). Broken lazy import inside an
`async def execute(...)` body → ImportError/ModuleNotFoundError
(exit 1). The 2026-04-2x v0→v1 a2a-sdk migration shipped 5 such
regressions in templates that the existing static import smoke missed.
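
A minimal sketch of the gate's decision logic, assuming an async
execute() contract; the real smoke_mode.run_executor_smoke ships in
molecule-core and the stub arguments here are hypothetical stand-ins:

    import asyncio
    import sys

    async def run_executor_smoke(executor, stub_ctx, stub_queue,
                                 timeout: float = 10.0) -> int:
        try:
            # One real call into execute(); the stubs are inert stand-ins.
            await asyncio.wait_for(executor.execute(stub_ctx, stub_queue), timeout)
        except asyncio.TimeoutError:
            # Imports resolved and execution reached a network boundary.
            print('[smoke-mode] PASS: timed out past import-tree (imports healthy)')
            return 0
        except (ImportError, ModuleNotFoundError) as exc:
            print(f'[smoke-mode] FAIL: broken lazy import: {exc!r}', file=sys.stderr)
            return 1
        return 0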

Skip path: when the installed runtime predates 0.1.60 (pre-smoke_mode),
the step prints a warning + exits 0. Templates pinned to older runtimes
keep publishing without this gate flipping red; cascade-triggered
builds (which forward the just-published version as RUNTIME_VERSION)
get the gate automatically.

Belt-and-suspenders `timeout 60` wrapper so smoke_mode itself can't
wedge the runner past one minute per template.

After merge, bump v1 tag to point at the new main SHA (caller repos
pin to @v1; the change has no effect until the moving tag advances).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 21:41:35 -07:00
Hongming Wang
53f01d5b44
Merge pull request #27 from Molecule-AI/auto/p135-fork-pr-lockdown
ci: lock down validate-workspace-template against fork-PR untrusted code (P135)
2026-04-30 01:08:40 -07:00
Hongming Wang
d420b4a24f ci: lock down validate-workspace-template against fork-PR untrusted code (P135)
Splits the reusable validator into two jobs to keep external fork
PRs from running arbitrary template code on the runner.

Background

The reusable workflow runs three primitives that execute
template-supplied code:
  - pip install -r requirements.txt  (setup.py + post-install hooks)
  - importlib.exec_module(adapter)   (top-level Python in adapter.py)
  - docker build                     (RUN steps in Dockerfile)

Token scope is already minimal (contents: read), GitHub forced
fork-PR tokens read-only in 2021, and the workflow_call interface
doesn't accept secrets. So the actual exploit surface is "what can
a malicious actor do with arbitrary code execution on a GitHub-
hosted runner that has no useful credentials?" — answer: crypto-
mine, DNS-exfiltrate runner metadata, attempt lateral movement
within the runner's network. Annoying, not catastrophic, but a
real attack surface that this PR closes.

The fix

Two-job split:

  validate-static    Always runs, including external fork PRs.
                     File-content checks (secret scan, YAML parse,
                     AST inspection of adapter.py without import),
                     pip install only the validator's pyyaml dep
                     (not the template's requirements.txt). NO
                     third-party code execution.

  validate-runtime   Skipped when github.event.pull_request.head.
                     repo.fork == true. pip install requirements.txt
                     + adapter import + docker build. Internal PRs
                     and push events to internal branches still get
                     the full coverage.

The validator script gains a --static-only flag that skips
check_adapter_runtime_load() (the function that calls
exec_module). The validate-static job uses it; validate-runtime
uses the existing full mode.
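
A minimal sketch of an import-free static check, assuming a
hypothetical helper name; the textual BaseAdapter match is
deliberately weaker than the runtime-load check it complements:

    import ast
    from pathlib import Path

    def check_adapter_static(path: Path) -> list[str]:
        """Inspect adapter.py without importing it; no template code runs."""
        try:
            tree = ast.parse(path.read_text())
        except SyntaxError as exc:
            return [f'{path}: syntax error: {exc}']

        def names_baseadapter(base: ast.expr) -> bool:
            return ((isinstance(base, ast.Name) and base.id == 'BaseAdapter')
                    or (isinstance(base, ast.Attribute)
                        and base.attr == 'BaseAdapter'))

        if any(isinstance(n, ast.ClassDef) and any(map(names_baseadapter, n.bases))
               for n in ast.walk(tree)):
            return []
        return [f'{path}: no class textually inheriting from BaseAdapter']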

Trade-off

External contributors get static feedback only on their PR. If
their template metadata passes static checks but breaks runtime
loading, branch protection on staging/main blocks the merge once
runtime validation runs (post-merge or after an internal
contributor reposts). Fewer false-positive CI failures for honest
external contributors; same coverage at the merge-protected
boundary.

What this does NOT close

- Maintainer-approved external PRs that consciously execute
  third-party code. The maintainer must approve a workflow run
  via GitHub's first-time-contributor gate; that's a human
  decision, not a workflow-level gate.
- requirements.txt that pulls a malicious transitive dep from
  PyPI even on internal PRs. Mitigated by branch-protection +
  human review of PRs that touch requirements.txt.

Closes task #135.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:07:58 -07:00
Hongming Wang
fd60655089
Merge pull request #26 from Molecule-AI/auto/p133-readme-v1-pin
docs: pin reusable-workflow examples from @main to @v1 (P133)
2026-04-30 01:04:32 -07:00
Hongming Wang
8f041a9485 docs: pin reusable-workflow examples from @main to @v1 (P133)
The v1 tag exists in this repo but README + docs still showed
@main in the caller-pattern examples. Followers of the docs were
copy-pasting unstable @main pins. Fix: update all 6 example
references to @v1 across:

- README.md (4 examples)
- docs/template-contract.md (1 example)
- .github/workflows/auto-promote-staging-pr.yml header comment
  (1 example, just shipped in PR #25)

Operational note: v1 is meant to track the latest stable patch
within the v1 major. Cutting a new v1.X.Y means moving the v1 tag
forward; a breaking change gets a new v2 tag instead. Same
convention as actions/checkout@v4 etc.

Doesn't migrate any consumer repo. Consumer migration from @main
to @v1 is a per-repo follow-up; this PR ships the docs that
guide that migration.

Closes task #133.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:04:06 -07:00
Hongming Wang
6afeb47e5e
Merge pull request #25 from Molecule-AI/auto/p9-reusable-auto-promote
ci: extract PR-based auto-promote-staging into reusable workflow (P9)
2026-04-30 01:02:34 -07:00
Hongming Wang
e7c6798fba ci: extract PR-based auto-promote-staging into reusable workflow (P9)
Moves the canonical PR-based staging→main auto-promote flow into a
reusable workflow that protected-branch repos can call instead of
duplicating ~240 lines of YAML each.

Why two reusable variants in this repo:

  auto-promote-staging.yml           (existing — ff-only, direct push)
    For repos WITHOUT required-status-checks branch protection.
    Already used for molecule-ci, molecule-app, molecule-docs,
    molecule-monorepo. Cannot satisfy protected-branch rules
    requiring status checks "set by expected GitHub apps".

  auto-promote-staging-pr.yml        (THIS PR — PR-based)
    For repos WITH required-status-checks. Opens (or reuses) a
    staging→main PR, enables auto-merge, lets the merge queue land
    it. Required path for molecule-core + molecule-controlplane
    (per the 2026-04-28 incident where direct ff-only push was
    failing GH006 on protected refs).

Inputs:
  gates           — CSV of workflow filenames to require green
  target-branch   — promote target (default: main)
  source-branch   — promote source (default: staging)
  enabled-var     — repo variable name gating rollout
                    (default: AUTO_PROMOTE_ENABLED)
  merge-method    — merge|squash|rebase (default: merge — matches
                    user preference for merge commits over squash)
  force           — pass through caller's workflow_dispatch.force input

Caller pattern (kept minimal — see header comment in the workflow):

  on:
    workflow_run:
      workflows: [CI, ...]
      types: [completed]
    workflow_dispatch:
      inputs:
        force: ...
  permissions:
    contents: write
    pull-requests: write
  jobs:
    promote:
      uses: Molecule-AI/molecule-ci/.github/workflows/auto-promote-staging-pr.yml@main
      with:
        gates: "ci.yml,e2e-staging-canvas.yml,..."
        force: ${{ github.event.inputs.force == 'true' }}
      secrets: inherit

The caller's `on.workflow_run.workflows` (display names) MUST stay in
sync with the `gates` input (filenames). The reusable can't validate
this because GitHub Actions decouples display names from filenames;
this is the same coupling the original molecule-core workflow had.

Migration of the existing 242-line molecule-core workflow to this
reusable is a follow-up PR. Same pattern applies to
molecule-controlplane once it grows protected-branch
auto-promote (today CP uses the auto-sync-main-to-staging shape
inherited from #142).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 01:01:52 -07:00
Hongming Wang
e21371f40e
Merge pull request #24 from Molecule-AI/fix/validate-fetch-scripts-from-ci
fix(validate): fetch scripts from molecule-ci instead of vendored copy
2026-04-29 02:01:13 -07:00
Hongming Wang
56facc8a42 fix(validate): fetch validator scripts from molecule-ci instead of expecting them in caller
The validate-org-template.yml and validate-plugin.yml workflows
expected `.molecule-ci/scripts/` to be vendored INTO each calling
repo. That worked for the repos that copied the directory in, but
broke on the ones that didn't:

- molecule-ai-org-template-medo-smoke
- molecule-ai-org-template-molecule-worker-gemini
- molecule-ai-org-template-reno-stars
- molecule-ai-plugin-molecule-compliance
- molecule-ai-plugin-molecule-freeze-scope
- molecule-ai-plugin-molecule-prompt-watchdog

Surfaced when the secret-scan rollout PRs hit those repos and the
required validate check failed on missing
`.molecule-ci/scripts/requirements.txt`.

Mirror the same fix already in validate-workspace-template.yml: a
second `actions/checkout@v4` of molecule-ci into
`.molecule-ci-canonical/`, with script paths re-pointed accordingly.
Single source of truth — callers never need to vendor or sync.

Also adds `.molecule-ci-canonical` to the secret-scan SKIP_DIRS so
the side-checked-out tree doesn't get walked.

Callers can drop their vendored `.molecule-ci/scripts/` copies in a
follow-up cleanup. Both shapes work after this PR — the vendored
copy is harmless dead weight, not a conflict.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-29 01:56:25 -07:00
Hongming Wang
2e40916b57
fix(validator): handle abstract intermediates + class-aliasing + lock GITHUB_TOKEN scope (#21)
Independent post-merge review of #19 surfaced two more findings.
Both shipped here.

Q3 — abstract intermediates + multiple-concrete-classes.

  The class-discovery filter from O1 (#19) only excluded BaseAdapter
  itself. Two failure modes slipped through:

    (a) A locally-defined abstract intermediate
        `class FrameworkAdapter(BaseAdapter): @abstractmethod ...`
        passed the filter, falsely satisfying "at least one
        concrete subclass" while still being non-instantiable at
        workspace boot.

    (b) A template defining BOTH `class FrameworkAdapter(BaseAdapter)`
        AND `class ConcreteAdapter(FrameworkAdapter)` had both pass
        the filter, producing a silent ambiguity where the runtime's
        class-discovery picks one per its resolution rules — wrong
        class loaded after a future runtime refactor.

  Fixes:
    - Add `not inspect.isabstract(obj)` to the discovery filter so
      abstract intermediates are excluded.
    - Hard-error if `len(adapter_classes) > 1` listing both names so
      the contributor knows exactly which classes are competing.

  Three new tests pin the behaviors:
    - test_abstract_intermediate_alone_does_not_count
    - test_abstract_plus_concrete_passes_with_concrete_only
    - test_multiple_concrete_baseadapter_subclasses_errors

Identity-based deduplication.

  Caught against the real langgraph template during smoke-testing
  the Q3 fix: production adapters often do
  `Adapter = ConcreteAdapter` as a module-level alias for the
  runtime's discovery convention. `vars(mod)` returns BOTH bindings
  pointing at the same class object, so the new
  multiple-concrete-classes error fired falsely on every aliased
  template.

  Fix: deduplicate by `id(obj)` BEFORE counting, so the same class
  object under multiple bindings counts once. New regression test
  test_aliased_concrete_class_is_deduplicated pins this against
  any future filter regression.
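
A minimal sketch of the discovery filter with all three guards plus
the id()-dedup, assuming the module object and its import name from
the runtime-load check; the function name is hypothetical:

    import inspect
    from molecule_runtime.adapters.base import BaseAdapter

    def discover_adapter_classes(mod, module_name: str) -> list[type]:
        seen: set[int] = set()
        found: list[type] = []
        for obj in vars(mod).values():
            if not (isinstance(obj, type) and issubclass(obj, BaseAdapter)):
                continue
            if obj is BaseAdapter or inspect.isabstract(obj):
                continue           # Q3(a): abstract intermediates excluded
            if obj.__module__ != module_name:
                continue           # O1 (#19): bare re-exports don't count
            if id(obj) in seen:
                continue           # Adapter = ConcreteAdapter aliases count once
            seen.add(id(obj))
            found.append(obj)
        if len(found) > 1:
            raise SystemExit('ambiguous adapter classes: '
                             + ', '.join(c.__name__ for c in found))
        return found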

Existing tests updated to use fully-concrete BaseAdapter subclasses
(matching production templates) since the new abstract-filter
correctly rejects partial stubs that don't override every abstract
method BaseAdapter declares (5 methods: name, display_name,
description, setup, create_executor).

Q5 — GITHUB_TOKEN scope lockdown.

  validate-workspace-template.yml runs untrusted-by-design code from
  the calling template repo: pip post-install hooks, adapter.py
  imports, Dockerfile RUN steps. Each of those primitives executes
  with GITHUB_TOKEN in env. The workflow had no `permissions:`
  block, defaulting to whatever the calling repo grants — often
  contents: write.

  Add `permissions: contents: read` at the workflow level. Worst-
  case-with-token now drops to "read public repo state" — no write
  to issues, no push to branches, no comment-spam, no workflow
  re-trigger. Partial mitigation; the deeper `pull_request_target`
  discipline is bigger scope (tracked separately).

Verification:
  - 47/47 tests pass (was 43: +3 abstract/multi-concrete, +1 alias)
  - All 8 production templates pass the full updated validator
    end-to-end with 0 warnings / 0 errors

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:27:09 -07:00
Hongming Wang
30e094220a
chore: remove accidentally-committed __pycache__ + gitignore Python caches (#20)
Cleanup of #19's commit, which inadvertently included scripts/__pycache__/
.pyc files generated by running pytest locally during the review-
followup work. The repo's .gitignore had no Python-cache section at
all, so nothing prevented this — adding it now to make the same
mistake structurally impossible.

Files removed from tracking (still ignored locally going forward):
  - scripts/__pycache__/migrate-template.cpython-313.pyc
  - scripts/__pycache__/test_migrate_template.cpython-313-pytest-9.0.3.pyc
  - scripts/__pycache__/test_validate_workspace_template.cpython-313-pytest-9.0.3.pyc
  - scripts/__pycache__/validate-workspace-template.cpython-313.pyc

Gitignore additions cover the standard set:
  __pycache__/, *.pyc, *.pyo, *.pyd, .pytest_cache/, .mypy_cache/,
  .ruff_cache/

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:18:46 -07:00
Hongming Wang
f125d68910
fix(validator): address post-merge review findings on #17 + #18 (#19)
Independent code review of #17 (adapter runtime-load) and #18 (schema
versioning) surfaced four Required and three Optional findings worth
fixing before the patterns harden into the codebase.

Required:

  R1: Delete .molecule-ci/scripts/{validate-workspace-template,
      migrate-template}.py — dead-vendored mirror. The new validator
      workflow invokes .molecule-ci-canonical/scripts/ (the canonical
      clone), not .molecule-ci/scripts/. The mirror was the exact drift
      class #90 is supposed to eliminate: next contributor would edit
      one copy and silently diverge. Other workflows (validate-plugin,
      validate-org-template) still use the legacy path and keep their
      own scripts there — so removing OUR two files is asymmetric but
      correct, and the legacy path can phase out organically.

  R2: validate-workspace-template.yml's `cache-dependency-path` pointed
      at the validator's own deps file (just `pyyaml>=6.0`). Pip cache
      key never invalidated when the template added crewai/langgraph/
      etc. Repoint to the calling repo's `requirements.txt`, which is
      the file the heavy install actually uses one step later.

  R3: `_check_schema_v1` looped `SCHEMA_V1_REQUIRED_KEYS` and re-emitted
      "missing required key `template_schema_version`" — but the
      dispatcher already verified the field is present + int before
      reaching v1, so that branch was dead defensive code. Skip it
      explicitly with a comment, but keep the field in the constant for
      contract documentation + the unknown-keys filter.

  R4: `_template_adapter_under_validation` was a fixed sys.modules key,
      meaning back-to-back invocations in the same Python process
      shared the slot. Use a per-call-unique name keyed on the absolute
      path's hash. No observed bug today; defensive-only.
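
A minimal sketch of the R4 fix, assuming a hypothetical helper; the
commit specifies only "the absolute path's hash", so the algorithm
and prefix length here are illustrative:

    import hashlib
    from pathlib import Path

    def unique_module_name(adapter_path: Path) -> str:
        # Key on the absolute path's hash so back-to-back invocations
        # in one process never share a sys.modules slot.
        digest = hashlib.sha256(str(adapter_path.resolve()).encode()).hexdigest()
        return f'_template_adapter_{digest[:12]}'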

Optional:

  O1: Class-discovery filter now also requires `__module__ == module_name`.
      Without this, a `from molecule_runtime.adapters.base import
      AbstractCLIAdapter` re-export would count as a "real" adapter,
      masking the genuine "no concrete subclass" case the gate exists
      to catch. Cheap and forward-proofs against any future abstract
      intermediate the runtime might expose. Added a sibling test
      pinning the new behavior.

  O2: migrate-template.py's docstring claimed "uses ruamel.yaml when
      available" but the implementation only ever calls `yaml.safe_dump`.
      Replaced the lie with a clearer caveat block + a forward-pointer
      to ruamel-when-comments-detected as a future enhancement.

  O3: Reordered the workflow so the secret-scan step runs BEFORE
      `pip install -r requirements.txt`. Same threat surface as the
      Docker build smoke (which already runs first), but cheap defense-
      in-depth: a malicious template PR adding a malicious dep to
      requirements.txt now has its post-install hook execute AFTER the
      secret scanner has already inspected the diff.

Test changes:

  - test_adapter_with_no_baseadapter_subclass_errors updated for the
    new error message ("no concrete class inheriting from").
  - New test_only_imported_baseadapter_subclass_does_not_count pins
    the O1 __module__-filter behavior.
  - 43/43 tests pass (was 42/42 before the new test).
  - Real langgraph template still passes the full validator end-to-end.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:17:44 -07:00
Hongming Wang
84a104a146
feat(validator): schema-version dispatch + migrate-template.py framework (#18)
Closes the schema-versioning workstream of #90. Sets up the machinery
for "we will be updating a lot" (the user's framing) without forcing
the first real schema bump to discover semantics under deadline
pressure. Today every template is at v1; this PR adds the framework,
ships zero behavior change for v1 templates, and reserves v2+ for
when there's a concrete reason to bump.

Validator changes:

  - `KNOWN_SCHEMA_VERSIONS = {1}` — the set the validator currently
    accepts. Future bumps add to this set.
  - `DEPRECATED_SCHEMA_VERSIONS: set[int] = set()` — versions accepted
    with warning during a deprecation window.
  - Per-version contract: `_check_schema_v1(config)` enforces the v1
    REQUIRED_KEYS / OPTIONAL_KEYS / KNOWN_RUNTIMES contract — exactly
    what the previous monolithic check_config_yaml did.
  - Dispatch table: `SCHEMA_CHECKS = {1: _check_schema_v1}`. Versions
    that aren't in the table hard-error.

  - check_config_yaml() now: reads template_schema_version → emits
    deprecation warning if applicable → dispatches to the right
    SCHEMA_CHECKS entry → unknown versions hard-error with actionable
    instructions ("add a SCHEMA_V<N> block").

  - Schema versions are FROZEN once shipped: never edit a SCHEMA_V<N>
    constant in place. To bump, ADD v<N+1> alongside, deprecate v<N>,
    migrate consumers, drop v<N> next cycle. Header comment documents
    the discipline.
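
A minimal sketch of the dispatch shape, using the constant and
function names the commit lists (the v1 body is elided):

    import warnings

    KNOWN_SCHEMA_VERSIONS = {1}
    DEPRECATED_SCHEMA_VERSIONS: set[int] = set()

    def _check_schema_v1(config: dict) -> list[str]:
        # v1 contract: REQUIRED_KEYS / OPTIONAL_KEYS / KNOWN_RUNTIMES checks,
        # exactly what the previous monolithic check did. Elided here.
        return []

    SCHEMA_CHECKS = {1: _check_schema_v1}

    def check_config_yaml(config: dict) -> list[str]:
        version = config.get('template_schema_version')
        if not isinstance(version, int):
            # Short-circuit: without an int version we can't pick a contract.
            return ["missing or non-int required key 'template_schema_version'"]
        if version not in SCHEMA_CHECKS:
            return [f'unknown schema version {version}: add a SCHEMA_V{version} block']
        if version in DEPRECATED_SCHEMA_VERSIONS:
            warnings.warn(f'template_schema_version {version} is deprecated')
        return SCHEMA_CHECKS[version](config)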

New script `migrate-template.py`:

  - `MIGRATIONS: dict[int, Callable[[dict], dict]]` registry — each
    entry maps a SOURCE version to the function that produces the
    next version's dict. Empty today.
  - `migrate_config(config, from, to)` chains migrations sequentially.
    Forward-only (errors on backward), errors on missing intermediate
    steps (never silently skip), asserts every migration stamps its
    output's template_schema_version.
  - CLI: `migrate-template.py [--from N] [--to M] [--dry-run] DIR`.
    Defaults: --from = whatever config.yaml declares, --to = highest
    reachable from MIGRATIONS (currently 1, so a no-op).
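
A minimal sketch of the chaining logic, assuming parameter names
from_v/to_v in place of the commit's `from`/`to` (Python keywords):

    from typing import Callable

    # Each entry maps a SOURCE version to the function producing the next
    # version's dict. Empty today; the first real bump adds MIGRATIONS[1].
    MIGRATIONS: dict[int, Callable[[dict], dict]] = {}

    def migrate_config(config: dict, from_v: int, to_v: int) -> dict:
        if to_v < from_v:
            raise ValueError('forward-only: cannot migrate backward')
        for v in range(from_v, to_v):
            if v not in MIGRATIONS:
                raise ValueError(f'no migration step registered for v{v} -> v{v + 1}')
            config = MIGRATIONS[v](config)
            assert config.get('template_schema_version') == v + 1, \
                f'migration v{v} did not stamp its output version'
        return config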

Behavior change to the existing
test_missing_required_keys_errors test:

  Previously the validator emitted 3 "missing required key" errors
  when name/runtime/template_schema_version were all missing. Now it
  short-circuits on missing version with a single actionable error —
  listing downstream missing keys is noise on top of the real
  problem (no version means we can't pick a contract). The test was
  updated to pin the new behavior; a new sibling test
  (test_missing_required_keys_under_v1_dispatch_errors) pins that v1
  still lists name/runtime/etc. when present-with-v1.

Verification:

  - 42/42 tests pass (20 prior + 9 new schema-dispatch tests in
    test_validate_workspace_template.py + 17 new migrator tests in
    test_migrate_template.py).
  - Real langgraph template runs through the full updated validator
    end-to-end with 0 warnings / 0 errors.

This + #17 means #90 is done end-to-end:
  - Phase 2: validator green on all 8 templates as a required check (already shipped)
  - Phase 2.5: adapter.py runtime-load contract (#17)
  - Phase 3: schema versioning + migration framework (this PR)

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:07:04 -07:00
Hongming Wang
8309a55e6c
feat(validator): runtime-load check for adapter.py contract (#17)
Adds the third workstream of #90 (eliminate template repo drift): a
strong contract check that exercises adapter.py the same way the
runtime does at workspace boot. Without this, a template can have a
syntactically-valid Dockerfile + an adapter.py that ImportErrors at
runtime, build clean through Docker smoke, and crash on first user
prompt — exactly the human-error class #90 is meant to eliminate.

Existing checks ranked from weakest to strongest:

  1. check_adapter()         — text-grep for legacy `molecule_ai`
                                imports. Catches one specific footgun.
  2. Docker build smoke      — `docker build` succeeds. Doesn't RUN
                                the image, so adapter.py is never
                                imported. Misses every adapter-load
                                bug.
  3. (NEW) check_adapter_runtime_load — imports adapter.py via the
                                same `importlib.spec_from_file_location`
                                path the runtime uses, and asserts at
                                least one class inherits from
                                molecule_runtime.adapters.base.BaseAdapter.

Hard-error conditions:
  - adapter.py raises any exception during import (SyntaxError,
    ImportError, NameError, etc.). Same exception would crash the
    workspace at boot.
  - No class in the module inherits from BaseAdapter. The runtime's
    class-discovery silently falls through to the default langgraph
    executor in this case — exactly the silent-failure shape the
    contract is meant to catch.

Skip conditions:
  - No adapter.py exists. Templates without one inherit the default
    executor by design (policy, not drift).
  - molecule-ai-workspace-runtime not importable in the validator
    env. Warns loudly so the CI-config bug surfaces, but doesn't
    hard-fail (we'd be reporting "your adapter is broken" when the
    actual cause is missing infra).
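
A minimal sketch of the check as this PR shapes it (later hardened
in #19/#21), assuming the import path the commit names:

    import importlib.util
    from pathlib import Path

    def check_adapter_runtime_load(template_dir: Path) -> list[str]:
        adapter_path = template_dir / 'adapter.py'
        if not adapter_path.exists():
            return []   # no adapter.py: default executor by design, not drift
        try:
            from molecule_runtime.adapters.base import BaseAdapter
        except ImportError:
            print('::warning::molecule-ai-workspace-runtime not importable; '
                  'skipping runtime-load check')
            return []
        spec = importlib.util.spec_from_file_location('adapter', adapter_path)
        mod = importlib.util.module_from_spec(spec)
        try:
            spec.loader.exec_module(mod)   # same path the runtime uses at boot
        except Exception as exc:
            return [f'adapter.py raised during import: {exc!r}']
        if not any(isinstance(o, type) and issubclass(o, BaseAdapter)
                   and o is not BaseAdapter for o in vars(mod).values()):
            return ['no class in adapter.py inherits from BaseAdapter']
        return []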

Workflow update: validate-workspace-template.yml now installs the
template's requirements.txt before invoking the validator (or
falls back to installing molecule-ai-workspace-runtime alone if the
template has no requirements.txt). This satisfies the runtime-load
check's import dependencies the same way the workspace container
does at boot — `pip install -r requirements.txt`.

Verified locally:
  - 20/20 tests in test_validate_workspace_template.py pass
    (14 existing + 6 new).
  - Real langgraph template passes the full new validator including
    runtime-load (0 warnings, 0 errors).
  - Surveyed all 8 production templates' adapter.py shapes; every
    one already inherits from BaseAdapter, so this check turns green
    on first run with zero per-template fixups needed.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 12:02:33 -07:00
Hongming Wang
b24e11976a
docs: recommend @v1 over @main in reusable-workflow adoption snippets (#16)
Closes the README half of monorepo task #133. The v1 git tag now
exists at the current main HEAD (8b0fbac — includes the auto-promote
fail-loud fix from #15). Consumers should pin reusable-workflow refs
to @v1 so future breaking changes land on @v2 with @v1 staying
backward-compatible — same pattern as `actions/checkout@v4`.

This commit only updates the EXAMPLE adoption snippets in the
workflow headers. Existing consumers pinned at @main keep working
identically (the workflow content is unchanged); they migrate at
their own pace when next touching their CI. New consumers see @v1
as the recommended pin.

Touched:

  - auto-promote-branch.yml (also added a paragraph explaining the
    @v1 vs @main convention so future contributors don't reintroduce
    @main as the recommendation)
  - auto-promote-staging.yml (the snippet inside this file's header
    references auto-promote-branch.yml, also moved to @v1)
  - disable-auto-merge-on-push.yml
  - publish-template-image.yml

The validate-* workflows (validate-plugin.yml, validate-org-template.yml,
validate-workspace-template.yml) don't have adoption snippets in their
headers — adding canonical examples there is a separate scope and not
part of this PR.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:14:12 -07:00
Hongming Wang
8b0fbac78a
fix(auto-promote): fail loud on 403 instead of silently degrading (#15)
Independent code review caught a Critical issue inherited from the
pre-extraction workflow: the branch-protection API call falls through
to '{}' on any non-200, then the empty-GATES check treats this as
"no gates configured (or API inaccessible)" and sets ok=true. Combined
with --ff-only being ancestry-only (not test-status), a green-but-
flaky source branch could ff-promote red commits to the target with
zero CI enforcement.

The conflation of three response classes is the bug:

  200 with .contexts[] populated  → honor the gates (correct)
  200 with empty .contexts        → "no gates configured" → ok=true (correct)
  404 (no branch protection)      → "no gates configured" → ok=true (correct)
  403 (token lacks permission)    → silently treated like 404 (BUG)

Use `gh api -i` to capture the HTTP status line and discriminate:

  - 200 → extract body, proceed to gate-check loop
  - 404 → legitimate fallback to --ff-only safety, log notice
  - 403/401 → fail loud with a concrete fix ("add administration: read
    to your caller's permissions block")
  - any other → fail loud with the response prefix for debugging

Also:

  - Update the README in the workflow header to document the
    administration: read requirement.
  - Add administration: read to molecule-ci's own self-caller
    (auto-promote-staging.yml) so its behavior is preserved.

Verified locally against four real API responses:

  - molecule-core/staging        → HTTP 200, 8 gates → loop runs
  - molecule-ci/main             → HTTP 200, 0 gates → ok=true (notice)
  - hackathon org-template/main  → HTTP 200, 0 gates → ok=true (notice)
  - this-repo-does-not-exist     → HTTP 404 → legitimate fallback path

Closes a Critical from the post-merge review of #14.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 11:11:59 -07:00
236923196f
Merge pull request #13 from Molecule-AI/feat/strict-template-drift-check
feat(validate-workspace-template): strict drift gate + canonical-fetch workflow
2026-04-28 08:38:10 -07:00
55aa3ce1d3
Merge pull request #14 from Molecule-AI/feat/auto-promote-reusable
feat(auto-promote): extract as reusable workflow_call for org-wide adoption
2026-04-28 08:37:56 -07:00
Hongming Wang
9d67da3ef9 feat(auto-promote): extract as reusable workflow_call for org-wide adoption
Splits auto-promote-staging.yml into:

  - auto-promote-branch.yml — new reusable workflow with
    `on: workflow_call`. Inputs `from-branch` (default 'staging') and
    `to-branch` (default 'main'). Repo-agnostic: gates are read from
    the consuming repo's branch protection at run time, not hardcoded.

  - auto-promote-staging.yml — molecule-ci's own self-running flow,
    now a ~25-line wrapper that calls the reusable workflow with
    staging→main hardcoded. Trigger and behavior unchanged for
    molecule-ci itself.

Adoption pattern in any consumer repo:

    # .github/workflows/auto-promote.yml
    name: Auto-promote staging → main
    on:
      push:
        branches: [staging]
      workflow_dispatch:
    permissions:
      contents: write
      statuses: read
    jobs:
      promote:
        uses: Molecule-AI/molecule-ci/.github/workflows/auto-promote-branch.yml@main
        with:
          from-branch: staging
          to-branch: main

Excluded by policy: molecule-core + molecule-controlplane stay
manual per CEO directive 2026-04-24. Those repos do NOT adopt the
reusable workflow; the extraction adds no surface to repos that
don't call it.

Closes monorepo task #93.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 18:33:01 -07:00
Hongming Wang
73102cdaa9 feat(validate-workspace-template): strict drift gate + canonical-fetch workflow
P6 Phase 1: enforce the workspace-template contract via CI on every
template-repo push, eliminating the slow drift that produced 8
copies of a 28-line Dockerfile in different states of decay.

The previous validator (50 lines, soft warnings only) couldn't
catch the cache-trap pattern (Dockerfile missing ARG RUNTIME_VERSION)
that silently shipped the previous runtime wheel during cascade
publishes — observed five times in a row on 2026-04-27. Hardened
into structural checks that fail CI, not just warn:

  - Dockerfile must base on python:3.11-slim
  - Dockerfile must declare ARG RUNTIME_VERSION AND reference
    ${RUNTIME_VERSION} in a RUN block (the arg has to be in the
    layer's command line for docker to hash it into the cache key)
  - Dockerfile must create the agent uid-1000 user (Claude Code
    refuses --dangerously-skip-permissions as root for safety)
  - Dockerfile must end at molecule-runtime — directly via
    ENTRYPOINT or via a wrapper script that exec's it (claude-code
    has entrypoint.sh for gosu drop-priv; hermes has start.sh to
    boot the hermes-agent daemon first; both are allowed)
  - config.yaml must have name + runtime + integer
    template_schema_version. Quoted "1" fails — observed previously
    in a copy-pasted template that the YAML loader turned into str
  - requirements.txt must declare molecule-ai-workspace-runtime
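
A minimal sketch of the two cache-trap checks, assuming single-line
RUN commands; the real validator also covers the base-image, uid-1000,
and entrypoint rules:

    import re

    def check_dockerfile_cache_trap(text: str) -> list[str]:
        errors = []
        if not re.search(r'^ARG\s+RUNTIME_VERSION\b', text, re.MULTILINE):
            errors.append('Dockerfile missing ARG RUNTIME_VERSION (cache trap)')
        elif not re.search(r'^RUN\b.*\$\{RUNTIME_VERSION\}', text, re.MULTILINE):
            # The ARG has to appear in the RUN line itself so docker hashes
            # the value into that layer's cache key; a bare declaration
            # never invalidates the pip-install layer.
            errors.append('ARG RUNTIME_VERSION declared but not referenced '
                          'in any RUN block')
        return errors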

Also fixed: the original validator's warning telling adapter.py
NOT to import molecule_runtime was backwards — that's the
canonical package name post-#87. Now it warns on the legacy
molecule_ai prefix instead.

Reusable workflow change: instead of running
.molecule-ci/scripts/validate-workspace-template.py (a per-template
vendored copy that drifts as the validator evolves), the workflow
now checks out molecule-ci itself into .molecule-ci-canonical and
runs the canonical script from there. Single source of truth —
every template runs the SAME contract on every CI run. The legacy
.molecule-ci/scripts/ directories in each template repo can be
deleted in a Phase 2 cleanup PR.

14 unit tests pin the contract:
  - canonical template passes
  - claude-code-style custom entrypoint passes when the wrapper
    exec's molecule-runtime
  - 5 Dockerfile drift modes each error individually
  - 3 config.yaml drift modes each error/warn
  - requirements.txt missing-runtime errors
  - legacy molecule_ai import warns
  - regression cover: modern molecule_runtime import does NOT
    trigger the (deleted) backwards warning

All 8 production template repos pass the new contract today —
this PR locks in the current good state, it does not force any
template-repo edits.

Contract documented at docs/template-contract.md so the rules are
discoverable without reading the validator.
2026-04-27 14:50:55 -07:00
Hongming Wang
9c7f4f5542
feat(reusable): forward runtime_version as RUNTIME_VERSION build-arg (#12)
Closes the cascade cache trap that bit us 5x today. Each cascade
rebuild ran against the same Dockerfile + requirements.txt content,
producing the same docker layer cache key — so even though
publish-runtime had just shipped a new version, pip install hit the
cached layer with the OLD version.

Mechanism:
- Reusable workflow now accepts optional `runtime_version` input
- Forwarded as `--build-arg RUNTIME_VERSION=$VERSION` to docker build
- Templates that declare `ARG RUNTIME_VERSION` get cache-key
  invalidation per-version (different ARG value → different cache
  key → fresh pip install layer)
- Templates that don't declare the ARG silently ignore it (no
  breakage; phased rollout)

Pairs with molecule-core PR #2181 (PyPI propagation wait + path
filter expansion). Together: cascade waits until PyPI serves the
new version, then fires with the version, templates rebuild against
that exact version with cache invalidation. No more "I shipped
0.1.X but image installs 0.1.X-1."

Phase 2 (separate PRs in template repos): each template's caller
forwards `${{ github.event.client_payload.runtime_version }}` and
each Dockerfile declares `ARG RUNTIME_VERSION` near pip install.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 07:44:34 -07:00
Hongming Wang
6409d65106
docs: add disable-auto-merge-on-push to README (#11)
Documents the new reusable workflow shipped in PR #10:
- Caller pattern (~10 lines per consuming repo) under Usage
- Full description in "What each workflow validates" — explains the
  2026-04-27 motivation, the org-wide repo setting it pairs with,
  and the false-positive note for CI bot pushes

Companion to molecule-core CONTRIBUTING.md update (PR #2177) which
documents the contract from the developer's perspective. Both must
land for the safety guards to be discoverable from where teams read.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:46:40 -07:00
Hongming Wang
d5caaac219
feat(reusable): disable auto-merge when a new commit is pushed (#10)
Reusable workflow that consumers call from their pr-guards.yml on
pull_request:synchronize. When a new commit is pushed to an open PR
that has auto-merge enabled, this disables auto-merge and posts a
comment so the operator must explicitly re-engage after verifying.

Background: on 2026-04-27, PR #2174 in molecule-core auto-merged
with only the first commit because the second commit was pushed
AFTER the merge queue had locked the PR's SHA. The second commit
ended up orphaned on a merged-and-deleted branch (the wider
"automatically delete head branches" repo setting now blocks the
push entirely; this workflow catches the race window where the PR
is queued but not yet merged).

Defense in depth — if both fixes are active:
1. Repo setting "delete branch on merge" prevents pushes to a
   merged branch (post-merge orphan case).
2. This workflow catches in-queue races (push lands while the
   queue is processing) by force-disabling auto-merge so the
   operator must re-engage explicitly.

Together they cover the full lifecycle of "auto-merge enabled →
new commits arrive" without relying on operator discipline.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:38:57 -07:00
Hongming Wang
0335ec169c
feat(lint): read runtime module list from wheel manifest, not inline (#9)
Switches the bare-imports lint from an inline RUNTIME_MODULES list
to the _runtime_modules.json manifest emitted by molecule-core's
build_runtime_package.py. Eliminates the third place the runtime
module list lived — now the build script is the single source of
truth.

Tonight surfaced that the same closed list lived in three places
that drifted independently. The build script's TOP_LEVEL_MODULES
went stale on transcript_auth, the smoke-test step here had a
hardcoded mirror that would have drifted next time a top-level
module was added, and runtime-pin-compat tested transitively via
import molecule_runtime.main (which only catches breakage, not
drift). One source of truth fixes all three at once.

Implementation:
- pip download molecule-ai-workspace-runtime --no-deps to /tmp
- unzip _runtime_modules.json from the wheel
- merge top_level_modules + subpackages into the regex alternation
  (subpackages can be bare-imported too — `from lib.pre_stop`)
- on any fetch failure (network, missing manifest in older wheel),
  fall back to the inline list with a workflow warning so the lint
  still runs but the operator knows to investigate
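
A minimal sketch of the fetch-with-fallback path, assuming the
manifest keys the commit names (top_level_modules, subpackages); the
pip invocation details are illustrative:

    import json
    import re
    import subprocess
    import tempfile
    import zipfile
    from pathlib import Path

    def runtime_module_alternation(inline_fallback: list[str]) -> str:
        try:
            tmp = Path(tempfile.mkdtemp())
            subprocess.run(['pip', 'download', 'molecule-ai-workspace-runtime',
                            '--no-deps', '-d', str(tmp)], check=True)
            wheel = next(tmp.glob('*.whl'))
            with zipfile.ZipFile(wheel) as zf:
                member = next(n for n in zf.namelist()
                              if n.endswith('_runtime_modules.json'))
                manifest = json.loads(zf.read(member))
            modules = manifest['top_level_modules'] + manifest['subpackages']
        except Exception as exc:
            # Lint still runs; the operator is told to investigate.
            print(f'::warning::manifest fetch failed ({exc}); '
                  'falling back to inline module list')
            modules = inline_fallback
        return '|'.join(re.escape(m) for m in modules)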

Two consequences:
- Templates rebuilt against runtime ≥ the version that ships the
  manifest get the always-fresh list automatically.
- Templates rebuilt against the old wheel (pre-manifest) still get
  the working inline list — no regression.

Future cleanup (separate PR after a few release cycles): once all
template repos have rebuilt at least once with the manifest path,
the inline fallback can shrink to a panic message.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:19:23 -07:00
Hongming Wang
7315460ff5
feat(publish-template-image): bare-import lint + import-every-app-py smoke (#8)
Two new gates that would have prevented today's
post-#87 template-extraction bug parade:

1. **Bare-import lint** — fail-fast pre-build check that greps
   template *.py files for `from <runtime_module> import` (where
   <runtime_module> is in the closed list mirroring workspace/*.py
   basenames). When the runtime was bundled into workspace/, bare
   imports resolved against sibling files; in standalone template
   repos they explode at startup. Five separate templates shipped
   broken on 2026-04-27 because of this exact pattern (claude-code:
   plugins, executor_helpers, heartbeat, a2a_client, platform_auth;
   langgraph: agent, a2a_executor; deepagents: a2a_executor;
   gemini-cli: config, executor_helpers x2). The lint runs before
   docker login + buildx setup so a bad PR returns red in seconds.

2. **Import every /app/*.py at boot** (deeper smoke) — replaces
   `python -c "import adapter"` with a loop importing every Python
   module at /app/. The old single-import didn't traverse to
   sibling modules adapter.py imports lazily inside
   `create_executor()` (the executor.py family). That's why the
   hermes a2a-sdk migration bug and langgraph's bare a2a_executor
   import slipped through every prior gate even though the boot
   smoke "passed." Importing every module module-level forces all
   imports to resolve, including those in executor.py.
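
A minimal sketch of the boot-time loop, assuming /app is the import
root inside the image:

    import importlib
    import sys
    from pathlib import Path

    sys.path.insert(0, '/app')   # match the container's import root
    failures = []
    for path in sorted(Path('/app').glob('*.py')):
        try:
            # Importing each module directly (not just adapter) forces even
            # lazily-referenced siblings like executor.py to resolve now.
            importlib.import_module(path.stem)
        except Exception as exc:
            failures.append(f'{path.name}: {exc!r}')
    if failures:
        print('\n'.join(failures), file=sys.stderr)
        sys.exit(1)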

Both gates use the closed-list pattern (deliberate, easy to update,
no false-positives on legit third-party imports). The runtime module
list mirrors the equivalent in scripts/build_runtime_package.py;
both should be updated together when a new top-level workspace
module ships.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:33:10 -07:00
Hongming Wang
b6f43a1145
feat(publish-template-image): boot image and import adapter.py before pushing :latest (#7)
Today's incident: a template's adapter.py imported a symbol
(RuntimeCapabilities) from molecule_runtime that the published runtime
didn't yet export. The image built fine, the existing "smoke test"
inspected the entrypoint string and passed, and a broken :latest
shipped to GHCR. Every claude-code + hermes provision then hung in
"provisioning" status until the 10-min sweep marked them failed.

The old smoke test was named correctly but didn't actually exercise
anything — `docker inspect` doesn't catch ImportError. This change
splits the build/push step into three:

1. Build with `load: true, push: false` so the image lands on the
   runner's local docker.
2. Smoke test runs `docker run ... python -c "import adapter"` against
   the loaded image. This catches the version-skew class of bug
   (adapter.py imports a symbol the installed runtime doesn't export),
   plus syntax errors, missing files, and anything else that breaks
   import-time.
3. Push :latest + :sha-* only if the smoke test passes. The push step
   reuses the cached build, so it's fast.

Net cost: ~5s per publish (the docker run). Net benefit: broken images
can no longer poison :latest.

All 8 caller templates (claude-code, gemini-cli, hermes, langgraph,
crewai, autogen, deepagents, openclaw) inherit the gate automatically
since this is the reusable workflow they all call.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 02:12:13 -07:00
19 changed files with 2926 additions and 231 deletions

.gitea/actions/audit-force-merge/action.yml

@@ -0,0 +1,55 @@
name: 'Audit force-merge'
description: >-
  §SOP-6 force-merge audit. Detects PRs merged with required-status-checks
  not green at HEAD SHA and emits incident.force_merge JSON to runner
  stdout. Vector docker_logs source ships the line to Loki on
  molecule-canonical-obs (per reference_obs_stack_phase1).
# Why a composite action and not a reusable workflow:
# Gitea 1.22.6 does NOT support cross-repo `uses: org/repo/.gitea/
# workflows/X.yml@ref`. Cross-repo reusable workflows landed in
# go-gitea/gitea PR #32562 in Gitea 1.26.0 (Oct 2025). On 1.22.x the
# clone fails because act_runner mints a caller-scoped GITEA_TOKEN.
# Composite actions resolve via the actions-fetch path which works
# cross-repo on 1.22 against a public callee — that's us. Re-evaluate
# this choice when the operator host upgrades to Gitea ≥ 1.26.
inputs:
  gitea-token:
    description: >-
      PAT for sop-tier-bot (or equivalent read-only audit identity).
      Needs read:user,read:repository,read:issue scopes — admin scope
      is intentionally NOT required.
    required: true
  gitea-host:
    description: 'Gitea host'
    required: false
    default: 'git.moleculesai.app'
  repo:
    description: 'owner/name; typically ${{ github.repository }}'
    required: true
  pr-number:
    description: 'PR number; typically ${{ github.event.pull_request.number }}'
    required: true
  required-checks:
    description: >-
      Newline-separated required-status-check context names. Mirror
      of branch protection's status_check_contexts. Declared at the
      caller because /branch_protections requires admin scope which
      this audit identity intentionally does not hold (least-privilege).
      When the required-check set changes, update both branch
      protection AND this input.
    required: true
runs:
  using: composite
  steps:
    - name: Detect force-merge + emit audit event
      shell: bash
      env:
        GITEA_TOKEN: ${{ inputs.gitea-token }}
        GITEA_HOST: ${{ inputs.gitea-host }}
        REPO: ${{ inputs.repo }}
        PR_NUMBER: ${{ inputs.pr-number }}
        REQUIRED_CHECKS: ${{ inputs.required-checks }}
      run: bash "$GITHUB_ACTION_PATH/audit.sh"

View File

@ -0,0 +1,118 @@
#!/usr/bin/env bash
# audit-force-merge — detect a §SOP-6 force-merge on a closed PR, emit
# `incident.force_merge` to stdout as structured JSON.
#
# Invoked by the `audit-force-merge` composite action defined alongside
# this script (action.yml). Caller workflows fire on
# `pull_request_target: closed` and gate on `merged == true`. See
# action.yml for the supported inputs.
#
# Vector's docker_logs source picks up runner stdout; the JSON gets
# shipped to Loki on molecule-canonical-obs, indexable by event_type.
# Query example:
#
# {host="operator"} |= "event_type" |= "incident.force_merge" | json
#
# A force-merge is detected when a merged PR had at least one of the
# caller-declared required-status-check contexts in a state other than
# "success" at the PR HEAD. That's exactly what the Gitea
# force_merge:true API call lets through, so it's a faithful detector
# of the override path.
#
# Required env (set by the composite action via inputs):
# GITEA_TOKEN, GITEA_HOST, REPO, PR_NUMBER, REQUIRED_CHECKS
#
# REQUIRED_CHECKS is newline-separated context names. Declared by the
# caller (mirror of branch protection's status_check_contexts) rather
# than fetched from /branch_protections, which requires admin scope —
# the audit identity is intentionally read-only (least-privilege; see
# memory/feedback_least_privilege_via_workflow_env).
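#
# Local dry-run sketch (hypothetical values; substitute a real PAT
# and PR number):
#
#   GITEA_TOKEN=... GITEA_HOST=git.moleculesai.app \
#   REPO=molecule-ai/molecule-ci PR_NUMBER=5 \
#   REQUIRED_CHECKS='sop-tier-check / tier-check (pull_request)' \
#   bash audit.sh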
set -euo pipefail
: "${GITEA_TOKEN:?required}"
: "${GITEA_HOST:?required}"
: "${REPO:?required}"
: "${PR_NUMBER:?required}"
: "${REQUIRED_CHECKS:?required (newline-separated context names)}"
OWNER="${REPO%%/*}"
NAME="${REPO##*/}"
API="https://${GITEA_HOST}/api/v1"
AUTH="Authorization: token ${GITEA_TOKEN}"
# 1. Fetch the PR. If not merged, no-op.
PR=$(curl -sS -H "$AUTH" "${API}/repos/${OWNER}/${NAME}/pulls/${PR_NUMBER}")
MERGED=$(echo "$PR" | jq -r '.merged // false')
if [ "$MERGED" != "true" ]; then
  echo "::notice::PR #${PR_NUMBER} closed without merge — no audit emission."
  exit 0
fi

MERGE_SHA=$(echo "$PR" | jq -r '.merge_commit_sha // empty')
MERGED_BY=$(echo "$PR" | jq -r '.merged_by.login // "unknown"')
TITLE=$(echo "$PR" | jq -r '.title // ""')
BASE_BRANCH=$(echo "$PR" | jq -r '.base.ref // "main"')
HEAD_SHA=$(echo "$PR" | jq -r '.head.sha // empty')

if [ -z "$MERGE_SHA" ]; then
  echo "::warning::PR #${PR_NUMBER} merged=true but no merge_commit_sha — cannot evaluate force-merge."
  exit 0
fi

# 2. Required status checks declared in the workflow env.
REQUIRED="$REQUIRED_CHECKS"
if [ -z "${REQUIRED//[[:space:]]/}" ]; then
  echo "::notice::REQUIRED_CHECKS empty — force-merge not applicable."
  exit 0
fi

# 3. Status-check state at the PR HEAD (where checks ran). The merge
#    commit doesn't get its own checks; we evaluate the PR's last
#    commit, which is what branch protection compared against.
STATUS=$(curl -sS -H "$AUTH" \
  "${API}/repos/${OWNER}/${NAME}/commits/${HEAD_SHA}/status")
declare -A CHECK_STATE
while IFS=$'\t' read -r ctx state; do
  [ -n "$ctx" ] && CHECK_STATE[$ctx]="$state"
done < <(echo "$STATUS" | jq -r '.statuses // [] | .[] | "\(.context)\t\(.status)"')

# 4. For each required check, was it green at merge? YAML block scalars
#    (`|`) leave a trailing newline; skip blank/whitespace-only lines.
FAILED_CHECKS=()
while IFS= read -r req; do
  trimmed="${req#"${req%%[![:space:]]*}"}"          # ltrim
  trimmed="${trimmed%"${trimmed##*[![:space:]]}"}"  # rtrim
  [ -z "$trimmed" ] && continue
  state="${CHECK_STATE[$trimmed]:-missing}"
  if [ "$state" != "success" ]; then
    FAILED_CHECKS+=("${trimmed}=${state}")
  fi
done <<< "$REQUIRED"

if [ "${#FAILED_CHECKS[@]}" -eq 0 ]; then
  echo "::notice::PR #${PR_NUMBER} merged with all required checks green — not a force-merge."
  exit 0
fi

# 5. Emit structured audit event.
NOW=$(date -u +%Y-%m-%dT%H:%M:%SZ)
FAILED_JSON=$(printf '%s\n' "${FAILED_CHECKS[@]}" | jq -R . | jq -s .)

# Print as a single-line JSON so Vector's parse_json transform can pick
# it up cleanly from docker_logs.
jq -nc \
  --arg event_type "incident.force_merge" \
  --arg ts "$NOW" \
  --arg repo "$REPO" \
  --argjson pr "$PR_NUMBER" \
  --arg title "$TITLE" \
  --arg base "$BASE_BRANCH" \
  --arg merged_by "$MERGED_BY" \
  --arg merge_sha "$MERGE_SHA" \
  --argjson failed_checks "$FAILED_JSON" \
  '{event_type: $event_type, ts: $ts, repo: $repo, pr: $pr, title: $title,
    base_branch: $base, merged_by: $merged_by, merge_sha: $merge_sha,
    failed_checks: $failed_checks}'

echo "::warning::FORCE-MERGE detected on PR #${PR_NUMBER} by ${MERGED_BY}: ${#FAILED_CHECKS[@]} required check(s) not green at merge time."

View File

@ -0,0 +1,219 @@
name: Auto-promote branch (reusable)
# Reusable version of the auto-promote-staging workflow that lived
# directly in molecule-ci. Any repo with a `from-branch` (typically
# `staging`) → `to-branch` (typically `main`) flow can call this
# workflow to fast-forward `to-branch` whenever `from-branch` is
# strictly ahead AND all configured required-status-checks on the
# `from-branch` HEAD are green.
#
# Adoption pattern in a consumer repo:
#
#   # .github/workflows/auto-promote.yml
#   name: Auto-promote staging → main
#   on:
#     push:
#       branches: [staging]
#     workflow_dispatch:
#   permissions:
#     contents: write        # push the fast-forward to to-branch
#     statuses: read         # read commit status checks
#     administration: read   # read branch protection (REQUIRED — see below)
#   jobs:
#     promote:
#       uses: molecule-ai/molecule-ci/.github/workflows/auto-promote-branch.yml@v1
#       with:
#         from-branch: staging
#         to-branch: main
#
# Repo-agnostic by design — gates are read from the consuming repo's
# branch protection at run time, not hardcoded here.
#
# `@v1` is a moving tag pointing at the latest 1.x release of
# molecule-ci's reusable workflows (GitHub Actions convention, same
# as `actions/checkout@v4`). Breaking changes get a new `@v2` tag
# and the old `@v1` keeps working for existing consumers. Pinning to
# `@main` is also accepted for forward-compat preview but is
# unstable — any change merged here rolls out instantly to consumers
# without a release boundary.
#
# `administration: read` is REQUIRED. Without it, the branch-protection
# API returns 403 and the workflow refuses to fast-forward (fail loud),
# rather than silently degrading to bare --ff-only enforcement (which
# checks ancestry only, not test status — a green-but-flaky branch
# would ff-promote red commits). If you intentionally want no-gate
# enforcement, leave from-branch unprotected — a 404 from the API is
# treated as "no gates configured" and falls back to --ff-only safety.
#
# Excluded-by-policy repos (molecule-core + molecule-controlplane per
# CEO directive 2026-04-24) simply do not adopt this workflow; the
# reusable shape adds no surface area to repos that don't call it.
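#
# Quick local probe of the endpoint the gate step calls (a sketch;
# OWNER/REPO are placeholders and the probing token needs
# administration read on the repo):
#
#   gh api -i repos/OWNER/REPO/branches/staging/protection/required_status_checks | head -1
#   # → "HTTP/2.0 200 OK" when gates are configured; 404 unprotected; 403 missing scope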
on:
  workflow_call:
    inputs:
      from-branch:
        description: "Source branch with green CI"
        required: false
        default: staging
        type: string
      to-branch:
        description: "Target branch to fast-forward"
        required: false
        default: main
        type: string

permissions:
  contents: write
  statuses: read

jobs:
  promote:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Check required gates (if configured) on source HEAD
        id: gates
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          REPO: ${{ github.repository }}
          HEAD_SHA: ${{ github.sha }}
          FROM_BRANCH: ${{ inputs.from-branch }}
        shell: bash
        run: |
          set -euo pipefail
          # Read required gates from branch protection. Three response
          # classes, distinguished by HTTP status:
          #
          #   200 — branch protection is configured. Honor the gates.
          #   404 — branch is not protected. Legitimate "no gates";
          #         fall back to --ff-only as the sole safety net.
          #   403 — caller's GITHUB_TOKEN can't read branch protection.
          #         FAIL LOUD. The previous behavior conflated this
          #         with 404 ("api inaccessible") and silently degraded
          #         to bare --ff-only — which is ancestry-only, not
          #         test-status. A green-but-flaky branch would
          #         ff-promote red commits to the target. The fix:
          #         require the caller to add `administration: read`
          #         to its permissions block, or explicitly accept the
          #         no-gates posture by removing branch protection on
          #         the source branch.
          #
          # `gh api` exits 0 only on 2xx; non-zero on anything else.
          # We use --include to capture the HTTP status to discriminate.
          if PROTECTION_RESP=$(gh api -i "repos/${REPO}/branches/${FROM_BRANCH}/protection/required_status_checks" 2>&1); then
            HTTP_STATUS=200
          else
            HTTP_STATUS=$(echo "$PROTECTION_RESP" | grep -oE '^HTTP/[12](\.[01])? [0-9]{3}' | awk '{print $2}' | head -1)
            HTTP_STATUS=${HTTP_STATUS:-unknown}
          fi
          case "$HTTP_STATUS" in
            200)
              # Strip headers from gh -i output to get just the body.
              GATES_JSON=$(echo "$PROTECTION_RESP" | awk 'p{print} /^[[:space:]]*$/ && !p {p=1}')
              ;;
            404)
              echo "::notice::No branch protection on '${FROM_BRANCH}' — relying on --ff-only safety."
              echo "ok=true" >> "$GITHUB_OUTPUT"
              exit 0
              ;;
            403|401)
              echo "::error::Cannot read branch protection on '${FROM_BRANCH}' (HTTP ${HTTP_STATUS})."
              echo "::error::Caller's GITHUB_TOKEN lacks 'administration: read' permission."
              echo "::error::Refusing to fast-forward without explicit gate enforcement —"
              echo "::error::a silent fallback to --ff-only here would let green-but-flaky"
              echo "::error::branches promote red commits."
              echo "::error::"
              echo "::error::Fix: add to the caller workflow's permissions block:"
              echo "::error::  permissions:"
              echo "::error::    contents: write"
              echo "::error::    statuses: read"
              echo "::error::    administration: read"
              echo "::error::"
              echo "::error::Or, if you intentionally want no-gate enforcement, remove"
              echo "::error::branch protection on '${FROM_BRANCH}' so the API returns 404."
              exit 1
              ;;
            *)
              echo "::error::Unexpected HTTP status '${HTTP_STATUS}' from branch-protection API."
              echo "::error::Response (first 5 lines):"
              echo "$PROTECTION_RESP" | head -5 | sed 's/^/::error:: /'
              exit 1
              ;;
          esac
          GATES=$(echo "${GATES_JSON}" | jq -r '.contexts[]?' 2>/dev/null || true)
          if [ -z "$GATES" ]; then
            echo "::notice::Branch protection on '${FROM_BRANCH}' has zero required-status-checks contexts — relying on --ff-only safety."
            echo "ok=true" >> "$GITHUB_OUTPUT"
            exit 0
          fi
          echo "Required gates on '${FROM_BRANCH}':"
          echo "${GATES}" | sed 's/^/ - /'
          ALL_GREEN=true
          while IFS= read -r gate; do
            [ -z "$gate" ] && continue
            conclusion=$(gh api "repos/${REPO}/commits/${HEAD_SHA}/check-runs" \
              --jq "[.check_runs[] | select(.name == \"${gate}\")] | sort_by(.completed_at) | last.conclusion" \
              2>/dev/null || echo "")
            if [ -z "$conclusion" ] || [ "$conclusion" = "null" ]; then
              conclusion=$(gh api "repos/${REPO}/commits/${HEAD_SHA}/status" \
                --jq "[.statuses[] | select(.context == \"${gate}\")] | sort_by(.updated_at) | last.state" \
                2>/dev/null || echo "")
            fi
            if [ "$conclusion" != "success" ] && [ "$conclusion" != "SUCCESS" ]; then
              echo "::warning::Gate '${gate}' is '${conclusion:-missing}' on ${HEAD_SHA} — skipping promote."
              ALL_GREEN=false
            else
              echo " ✓ ${gate}: success"
            fi
          done <<< "$GATES"
          echo "ok=${ALL_GREEN}" >> "$GITHUB_OUTPUT"
      - name: Fast-forward target branch to source HEAD
        if: steps.gates.outputs.ok == 'true'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          FROM_BRANCH: ${{ inputs.from-branch }}
          TO_BRANCH: ${{ inputs.to-branch }}
        shell: bash
        run: |
          set -euo pipefail
          git config user.email "actions@github.com"
          git config user.name "github-actions[bot]"
          # Source branch is what's checked out (workflow fires on push to
          # source). Can't fetch into it. Fetch target into a local target.
          git fetch origin "${TO_BRANCH}"
          git checkout -B "${TO_BRANCH}" "origin/${TO_BRANCH}"
          # Check if target is already at or ahead of source.
          if git merge-base --is-ancestor "origin/${FROM_BRANCH}" "${TO_BRANCH}" 2>/dev/null; then
            echo "${TO_BRANCH} already contains ${FROM_BRANCH}; nothing to promote."
            exit 0
          fi
          # --ff-only refuses if target has independent commits not on
          # source (divergence — hotfix direct to target). Human resolves.
          if ! git merge --ff-only "origin/${FROM_BRANCH}" 2>&1; then
            echo "::warning::${TO_BRANCH} has diverged from ${FROM_BRANCH} — refusing fast-forward. Resolve manually (likely a direct-to-${TO_BRANCH} commit exists that ${FROM_BRANCH} doesn't have)."
            exit 0
          fi
          git push origin "${TO_BRANCH}"
          echo "::notice::Promoted: ${TO_BRANCH} is now at $(git rev-parse --short HEAD)"

View File

@ -0,0 +1,262 @@
name: Auto-promote staging → main (PR-based, reusable)
# Reusable PR-based auto-promote for repos whose `main` branch has
# protection rules that require status checks "set by the expected
# GitHub apps" — direct `git push` from a workflow can't satisfy
# that, only PR merges through the merge queue can.
#
# Distinct from the simpler ff-only auto-promote in this same repo
# (auto-promote-staging.yml): that one does `git merge --ff-only` +
# direct push and only works on repos WITHOUT required-status-checks.
# This reusable workflow is for the protected-branch case.
#
# Call from each repo's .github/workflows/ via a thin wrapper:
#
#   name: Auto-promote staging → main
#   on:
#     workflow_run:
#       workflows: [CI, E2E Staging Canvas, ...]
#       types: [completed]
#     workflow_dispatch:
#       inputs:
#         force:
#           description: "Force promote (manual override)"
#           required: false
#           default: "false"
#   permissions:
#     contents: write
#     pull-requests: write
#   jobs:
#     promote:
#       uses: molecule-ai/molecule-ci/.github/workflows/auto-promote-staging-pr.yml@v1
#       with:
#         gates: "ci.yml,e2e-staging-canvas.yml,e2e-api.yml,codeql.yml"
#         force: ${{ github.event.inputs.force == 'true' }}
#       secrets: inherit
#
# IMPORTANT: the caller MUST keep the `on.workflow_run.workflows`
# display-name list in sync with the `gates` input (which uses
# workflow filenames). The reusable workflow can't validate this —
# display names and filenames are decoupled in GitHub Actions.
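#
# One way to eyeball the display-name ↔ filename mapping before
# editing either list (a sketch; assumes a gh version whose
# `workflow list` supports --json):
#
#   gh workflow list --json name,path \
#     --jq '.[] | "\(.name)\t\(.path | sub(".*/"; ""))"'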
#
# Required repo settings (one-time, in the CALLER repo):
#
# Settings → Actions → General → Workflow permissions
# → ✅ Allow GitHub Actions to create and approve pull requests
#
# Without it, every workflow run fails with:
#
# pull request create failed: GraphQL: GitHub Actions is not
# permitted to create or approve pull requests (createPullRequest)
#
# Toggle: caller repo variable AUTO_PROMOTE_ENABLED=true. Override
# via the `enabled-var` input if a different name is needed.
# When the variable is unset, the workflow logs what it would have
# done but doesn't open the PR — useful for dry-running the gate
# logic without surfacing a noisy PR while staging CI is still flaky.
on:
  workflow_call:
    inputs:
      gates:
        description: >-
          Comma-separated list of workflow FILENAMES (not display
          names) that must be conclusion=success on the staging head
          SHA before promote fires. Example:
          "ci.yml,e2e-staging-canvas.yml,codeql.yml". File paths are
          used (not display names) because gh run list with display
          names is ambiguous when two workflows share a name (observed
          2026-04-28 with codeql.yml + GitHub UI's Code-quality default
          setup both surfacing as "CodeQL").
        required: true
        type: string
      target-branch:
        description: "Target branch to promote TO (default: main)"
        required: false
        type: string
        default: main
      source-branch:
        description: "Source branch to promote FROM (default: staging)"
        required: false
        type: string
        default: staging
      enabled-var:
        description: >-
          Repo variable name that gates this workflow. Set this
          variable to "true" in the caller repo's Settings →
          Variables → Actions to enable. Defaults to
          AUTO_PROMOTE_ENABLED.
        required: false
        type: string
        default: AUTO_PROMOTE_ENABLED
      merge-method:
        description: >-
          Merge method for `gh pr merge --auto`. One of merge|squash|
          rebase. Defaults to "merge" (matches user preference for
          merge commits over squash).
        required: false
        type: string
        default: merge
      force:
        description: >-
          Skip the AUTO_PROMOTE_ENABLED variable check. Pass true
          when the caller's workflow_dispatch input is force=true.
          Default false.
        required: false
        type: boolean
        default: false

jobs:
  check-all-gates-green:
    # Only consider promotions for the source branch's push events.
    # PR runs into the source branch don't promote. workflow_dispatch
    # passes through unconditionally.
    if: >
      (github.event_name == 'workflow_run' &&
       github.event.workflow_run.head_branch == inputs.source-branch &&
       github.event.workflow_run.event == 'push')
      || github.event_name == 'workflow_dispatch'
    runs-on: ubuntu-latest
    outputs:
      all_green: ${{ steps.gates.outputs.all_green }}
      head_sha: ${{ steps.gates.outputs.head_sha }}
    steps:
      - name: Check all required gates on this SHA
        id: gates
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          HEAD_SHA: ${{ github.event.workflow_run.head_sha || github.sha }}
          REPO: ${{ github.repository }}
          GATES_CSV: ${{ inputs.gates }}
          SOURCE_BRANCH: ${{ inputs.source-branch }}
        run: |
          set -euo pipefail
          # Split the comma-separated gates input. Trim whitespace per
          # entry so callers can format readably (e.g. "ci.yml, e2e.yml").
          IFS=',' read -ra GATES <<< "$GATES_CSV"
          echo "head_sha=${HEAD_SHA}" >> "$GITHUB_OUTPUT"
          echo "Checking gates on SHA ${HEAD_SHA}"
          ALL_GREEN=true
          for gate_raw in "${GATES[@]}"; do
            gate="${gate_raw#"${gate_raw%%[![:space:]]*}"}"  # ltrim
            gate="${gate%"${gate##*[![:space:]]}"}"          # rtrim
            if [ -z "$gate" ]; then
              continue
            fi
            # Query the most recent run of this workflow on this SHA.
            # event=push to avoid picking up PR runs. branch filter
            # guards against someone dispatching the gate on a non-
            # source branch at the same SHA.
            RESULT=$(gh run list \
              --repo "$REPO" \
              --workflow "$gate" \
              --branch "$SOURCE_BRANCH" \
              --event push \
              --commit "$HEAD_SHA" \
              --limit 1 \
              --json status,conclusion \
              --jq '.[0] | "\(.status)/\(.conclusion // "none")"' \
              2>/dev/null || echo "missing/none")
            echo "  $gate → $RESULT"
            # Only completed/success counts. Anything else aborts.
            if [ "$RESULT" != "completed/success" ]; then
              ALL_GREEN=false
            fi
          done
          echo "all_green=${ALL_GREEN}" >> "$GITHUB_OUTPUT"
          if [ "$ALL_GREEN" != "true" ]; then
            echo "::notice::auto-promote: not all gates are green on ${HEAD_SHA} — staying on current ${{ inputs.target-branch }}"
          fi
  promote:
    needs: check-all-gates-green
    if: needs.check-all-gates-green.outputs.all_green == 'true'
    runs-on: ubuntu-latest
    steps:
      - name: Check rollout gate
        env:
          ENABLED_VAR_NAME: ${{ inputs.enabled-var }}
          ENABLED_VAR_VALUE: ${{ vars[inputs.enabled-var] }}
          FORCE: ${{ inputs.force }}
        run: |
          set -eu
          # Caller repo controls rollout via the named variable.
          # Default name is AUTO_PROMOTE_ENABLED; callers can override.
          if [ "${ENABLED_VAR_VALUE:-}" != "true" ] && [ "${FORCE:-false}" != "true" ]; then
            {
              echo "## ⏸ Auto-promote disabled"
              echo
              echo "Repo variable \`${ENABLED_VAR_NAME}\` is not set to \`true\`."
              echo "All gates are green on ${{ inputs.source-branch }}; would have opened a promote PR to \`${{ inputs.target-branch }}\`."
              echo
              echo "To enable: Settings → Secrets and variables → Actions → Variables → \`${ENABLED_VAR_NAME}=true\`."
              echo "To test once manually: workflow_dispatch with \`force=true\`."
            } >> "$GITHUB_STEP_SUMMARY"
            echo "::notice::auto-promote disabled — dry run only"
            exit 0
          fi
      - name: Open (or reuse) ${{ inputs.source-branch }} → ${{ inputs.target-branch }} promote PR + enable auto-merge
        if: ${{ vars[inputs.enabled-var] == 'true' || inputs.force == true }}
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          REPO: ${{ github.repository }}
          TARGET_SHA: ${{ needs.check-all-gates-green.outputs.head_sha }}
          SOURCE_BRANCH: ${{ inputs.source-branch }}
          TARGET_BRANCH: ${{ inputs.target-branch }}
          MERGE_METHOD: ${{ inputs.merge-method }}
          GATES_CSV: ${{ inputs.gates }}
        run: |
          set -euo pipefail
          # Look for an existing open promote PR (idempotent on re-run).
          # The PR's head IS the source branch — the whole point is
          # "advance target to source's tip", so we don't need a per-SHA
          # branch like auto-sync-main-to-staging.yml uses.
          PR_NUM=$(gh pr list --repo "$REPO" \
            --base "$TARGET_BRANCH" --head "$SOURCE_BRANCH" --state open \
            --json number --jq '.[0].number // ""')
          if [ -z "$PR_NUM" ]; then
            TITLE="${SOURCE_BRANCH} → ${TARGET_BRANCH}: auto-promote ${TARGET_SHA:0:7}"
            BODY_FILE=$(mktemp)
            cat > "$BODY_FILE" <<EOFBODY
          Automated promotion of \`${SOURCE_BRANCH}\` (\`${TARGET_SHA:0:8}\`) to \`${TARGET_BRANCH}\`. Required gates green at this SHA: ${GATES_CSV}.
          This PR is auto-generated by a thin caller of \`molecule-ai/molecule-ci/.github/workflows/auto-promote-staging-pr.yml\` whenever every required gate completes green on the same source-branch SHA. It exists because protected branches require status checks "set by the expected GitHub apps" — direct \`git push\` from a workflow can't satisfy that, only PR merges through the queue can.
          Merge queue lands this; no human action needed unless gates fail.
          EOFBODY
            PR_URL=$(gh pr create --repo "$REPO" \
              --base "$TARGET_BRANCH" --head "$SOURCE_BRANCH" \
              --title "$TITLE" \
              --body-file "$BODY_FILE")
            PR_NUM=$(echo "$PR_URL" | grep -oE '[0-9]+$' | tail -1)
            rm -f "$BODY_FILE"
            echo "::notice::Opened PR #${PR_NUM}"
          else
            echo "::notice::Re-using existing promote PR #${PR_NUM}"
          fi
          # Enable auto-merge — the merge queue picks it up once
          # required gates are green on the merge_group ref.
          if ! gh pr merge "$PR_NUM" --repo "$REPO" --auto --"$MERGE_METHOD" 2>&1; then
            echo "::warning::Failed to enable auto-merge on PR #${PR_NUM} — operator may need to merge manually."
          fi
          {
            echo "## ✅ Auto-promote PR opened"
            echo
            echo "- Source: \`${SOURCE_BRANCH}\` at \`${TARGET_SHA:0:8}\`"
            echo "- Target: \`${TARGET_BRANCH}\`"
            echo "- PR: #${PR_NUM}"
            echo
            echo "Merge queue lands the PR once required gates are green; no human action needed unless gates fail."
          } >> "$GITHUB_STEP_SUMMARY"

View File

@ -1,24 +1,14 @@
name: Auto-promote staging → main
# Fast-forwards `main` to `staging` when staging is strictly ahead (main
# is an ancestor). Eliminates the manual sync-PR round for non-critical
# repos.
# molecule-ci's own auto-promote: thin wrapper over the reusable
# `auto-promote-branch.yml` workflow factored out for org-wide reuse.
# Other repos consume the same reusable workflow via:
#
# Gate handling:
# - If the repo has required_status_checks configured AND the API
# returns them, all must be SUCCESS on the staging HEAD commit.
# - If no gates are configured (or the API 403s on a private free-tier
# repo), `--ff-only` is the sole safety. It refuses if main has
# independent commits staging doesn't contain.
# uses: molecule-ai/molecule-ci/.github/workflows/auto-promote-branch.yml@v1
#
# Excluded by policy: molecule-core + molecule-controlplane. Those two
# stay manual per CEO directive 2026-04-24.
#
# Safety:
# - Only fires on push to staging (PRs into staging don't promote)
# - `--ff-only` refuses if main has diverged (hotfix landed directly)
# - Promote commit goes through GITHUB_TOKEN; shows up in git log as
# a deliberate act
# Excluded by policy: molecule-core + molecule-controlplane stay
# manual per CEO directive 2026-04-24. Those repos do NOT call the
# reusable workflow.
on:
  push:
@ -26,94 +16,14 @@ on:
  workflow_dispatch:
permissions:
  contents: write
  statuses: read
  contents: write       # push the fast-forward to main
  statuses: read        # read commit status checks
  administration: read  # read branch protection (required by the
                        # reusable workflow — see its header for why)
jobs:
  promote:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0
          token: ${{ secrets.GITHUB_TOKEN }}
      - name: Check required gates (if configured) on staging HEAD
        id: gates
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          REPO: ${{ github.repository }}
          HEAD_SHA: ${{ github.sha }}
        shell: bash
        run: |
          set -euo pipefail
          # Try to read required gates from branch protection. Free-tier
          # private repos may 403; handle that gracefully.
          GATES_JSON=$(gh api "repos/${REPO}/branches/staging/protection/required_status_checks" 2>/dev/null || echo '{}')
          GATES=$(echo "${GATES_JSON}" | jq -r '.contexts[]?' 2>/dev/null || true)
          if [ -z "$GATES" ]; then
            echo "No required gates configured (or API inaccessible). Relying on --ff-only safety."
            echo "ok=true" >> "$GITHUB_OUTPUT"
            exit 0
          fi
          echo "Required gates on staging:"
          echo "${GATES}" | sed 's/^/ - /'
          ALL_GREEN=true
          while IFS= read -r gate; do
            [ -z "$gate" ] && continue
            conclusion=$(gh api "repos/${REPO}/commits/${HEAD_SHA}/check-runs" \
              --jq "[.check_runs[] | select(.name == \"${gate}\")] | sort_by(.completed_at) | last.conclusion" \
              2>/dev/null || echo "")
            if [ -z "$conclusion" ] || [ "$conclusion" = "null" ]; then
              conclusion=$(gh api "repos/${REPO}/commits/${HEAD_SHA}/status" \
                --jq "[.statuses[] | select(.context == \"${gate}\")] | sort_by(.updated_at) | last.state" \
                2>/dev/null || echo "")
            fi
            if [ "$conclusion" != "success" ] && [ "$conclusion" != "SUCCESS" ]; then
              echo "::warning::Gate '${gate}' is '${conclusion:-missing}' on ${HEAD_SHA} — skipping promote."
              ALL_GREEN=false
            else
              echo " ✓ ${gate}: success"
            fi
          done <<< "$GATES"
          echo "ok=${ALL_GREEN}" >> "$GITHUB_OUTPUT"
      - name: Fast-forward main to staging
        if: steps.gates.outputs.ok == 'true'
        env:
          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        shell: bash
        run: |
          set -euo pipefail
          git config user.email "actions@github.com"
          git config user.name "github-actions[bot]"
          # staging is the checked-out branch (workflow fires on push to
          # staging). Can't fetch into it. Fetch main into a local main.
          git fetch origin main
          git checkout -B main origin/main
          # Check if main is already at or ahead of origin/staging.
          if git merge-base --is-ancestor origin/staging main 2>/dev/null; then
            echo "main already contains staging; nothing to promote."
            exit 0
          fi
          # --ff-only refuses if main has independent commits not on
          # staging (divergence — hotfix direct to main). Human resolves.
          if ! git merge --ff-only origin/staging 2>&1; then
            echo "::warning::main has diverged from staging — refusing fast-forward. Resolve manually (likely a direct-to-main commit exists that staging doesn't have)."
            exit 0
          fi
          git push origin main
          echo "::notice::Promoted: main is now at $(git rev-parse --short HEAD)"
    uses: ./.github/workflows/auto-promote-branch.yml
    with:
      from-branch: staging
      to-branch: main

View File

@ -0,0 +1,53 @@
name: Disable auto-merge on push
# Reusable guard against the "I enabled auto-merge then pushed more
# commits" race. Background: on 2026-04-27, PR #2174 in molecule-core
# auto-merged with only the first commit because the second commit
# was pushed AFTER the merge queue had already locked the PR's SHA.
# The second commit ended up orphaned on a merged-and-deleted branch.
#
# Mechanism: on every `pull_request: synchronize` event (= new commit
# pushed to an open PR), check if auto-merge is enabled. If yes,
# disable it and post a comment. This forces the operator to
# re-engage `gh pr merge --auto` after the new push, with the
# re-engagement acting as the verification step.
#
# Call from each repo's .github/workflows/ via a thin wrapper:
#
#   name: pr-guards
#   on:
#     pull_request:
#       types: [synchronize]
#   permissions:
#     pull-requests: write
#   jobs:
#     disable-auto-merge-on-push:
#       uses: molecule-ai/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@v1
#
# False-positive behavior: if a CI bot pushes (e.g. dependency-update
# rebase, secret rotation), this also disables auto-merge for that
# PR. That's acceptable — the operator who originally enabled
# auto-merge gets notified and re-engages, which is exactly the
# verify-after-machine-edits behavior we want.
on:
  workflow_call:

jobs:
  guard:
    name: Disable auto-merge on push
    runs-on: ubuntu-latest
    if: github.event.pull_request.auto_merge != null
    permissions:
      pull-requests: write
    steps:
      - name: Disable auto-merge
        env:
          GH_TOKEN: ${{ github.token }}
          PR: ${{ github.event.pull_request.number }}
          REPO: ${{ github.repository }}
          NEW_SHA: ${{ github.event.pull_request.head.sha }}
        run: |
          set -eu
          gh pr merge "$PR" --disable-auto -R "$REPO" || true
          gh pr comment "$PR" -R "$REPO" --body "🔒 Auto-merge disabled — new commit (\`${NEW_SHA:0:7}\`) pushed after auto-merge was enabled. The merge queue locks SHAs at entry, so subsequent pushes can race. Verify the new commit and re-enable with \`gh pr merge --auto\`."

View File

@ -1,6 +1,6 @@
name: Publish Workspace Template Image
# Reusable workflow for every Molecule-AI/molecule-ai-workspace-template-*
# Reusable workflow for every molecule-ai/molecule-ai-workspace-template-*
# repo. Builds the template's Dockerfile on main and pushes to GHCR as
# `ghcr.io/molecule-ai/workspace-template-<runtime>:latest` (plus a
# per-commit `sha-<7>` tag). Auto-derives <runtime> from the caller repo
@ -17,7 +17,7 @@ name: Publish Workspace Template Image
#     packages: write
#   jobs:
#     publish:
#       uses: Molecule-AI/molecule-ci/.github/workflows/publish-template-image.yml@main
#       uses: molecule-ai/molecule-ci/.github/workflows/publish-template-image.yml@v1
#       secrets: inherit
#
# Runner choice (2026-04-22): ubuntu-latest
@ -40,6 +40,19 @@ on:
        required: false
        type: string
        default: ""
      runtime_version:
        description: >-
          molecule-ai-workspace-runtime version to install. Forwarded
          as RUNTIME_VERSION docker build-arg. When unset, the
          Dockerfile's requirements.txt pin is used. Cascade-triggered
          builds forward client_payload.runtime_version here so each
          rebuild has a unique build-arg → unique cache key →
          guaranteed fresh `pip install`. Solves the
          "cascade rebuilt but image still has old runtime" cache
          trap that bit us repeatedly on 2026-04-27.
        required: false
        type: string
        default: ""
    outputs:
      image:
        description: "Full image reference that was pushed (with :latest tag)"
@ -90,6 +103,64 @@ jobs:
echo "sha=${SHA}" >> "$GITHUB_OUTPUT"
echo "::notice::Publishing runtime='${RUNTIME}' → ${IMAGE}:latest + :sha-${SHA}"
- name: Lint — no bare imports of runtime modules
# Templates that bare-import a workspace/ runtime module
# (e.g. `from plugins import load_plugins` instead of
# `from molecule_runtime.plugins import load_plugins`) work in
# the monorepo's bundled-runtime layout but explode at startup
# with `ModuleNotFoundError` once the runtime is installed as a
# package. This bit claude-code (5 imports), langgraph,
# deepagents, and gemini-cli on 2026-04-27 — each one a
# separate workspace-stuck-in-provisioning incident.
#
# Source of truth: molecule_runtime/_runtime_modules.json
# inside the published wheel (emitted by
# scripts/build_runtime_package.py). Pulling the manifest
# from PyPI's latest wheel ensures the lint never drifts from
# the rewriter's actual closed list. If the manifest can't be
# fetched (older wheel, PyPI down, etc.), falls back to the
# inline list — known to be correct as of 2026-04-27 — so
# the lint never silently passes on a fetch failure.
#
# Fail-fast: this runs before docker login + buildx setup so
# a bad PR returns red in seconds, not minutes.
shell: bash
run: |
set -eu
# Fallback list — used only when the manifest fetch fails.
# Mirrors scripts/build_runtime_package.py:TOP_LEVEL_MODULES
# at the time this comment was written.
FALLBACK_MODULES='plugins|adapter_base|config|main|preflight|prompt|coordinator|consolidation|events|heartbeat|transcript_auth|runtime_wedge|watcher|skill_loader|policies|adapters|builtin_tools|executor_helpers|a2a_executor|a2a_client|a2a_tools|a2a_cli|a2a_mcp_server|agent|agents_md|initial_prompt|molecule_ai_status|platform_auth|shared_runtime'
RUNTIME_MODULES=""
mkdir -p /tmp/runtime-wheel
if pip download --quiet molecule-ai-workspace-runtime --no-deps -d /tmp/runtime-wheel 2>/dev/null; then
WHEEL=$(ls /tmp/runtime-wheel/*.whl 2>/dev/null | head -1)
if [ -n "$WHEEL" ]; then
# Pull both top_level + subpackage names; both can be bare-imported.
RUNTIME_MODULES=$(unzip -p "$WHEEL" molecule_runtime/_runtime_modules.json 2>/dev/null \
| python3 -c "import sys,json; m=json.load(sys.stdin); print('|'.join(sorted(set(m['top_level_modules']) | set(m['subpackages']))))" 2>/dev/null || echo "")
fi
fi
if [ -n "$RUNTIME_MODULES" ]; then
echo "::notice::lint module list pulled from molecule-ai-workspace-runtime wheel manifest"
else
RUNTIME_MODULES="$FALLBACK_MODULES"
echo "::warning::could not read _runtime_modules.json from PyPI wheel — using inline fallback list"
fi
# Match `from <module> import` at start of line OR after any whitespace
# (function-scope imports inside if/try blocks count too).
if HITS=$(grep -nE "^\s*from (${RUNTIME_MODULES}) import" *.py 2>/dev/null); then
echo "::error::Bare imports of runtime modules found — must use \`from molecule_runtime.<module> import\`"
echo "$HITS" | sed 's/^/ /'
echo "::error::Fix: prefix each match with 'molecule_runtime.' (e.g. 'from plugins' → 'from molecule_runtime.plugins')."
exit 1
fi
echo "::notice::✓ no bare imports of runtime modules in template *.py files"
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
@ -100,7 +171,213 @@ jobs:
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build & push template image to GHCR
      - name: Build template image (load for smoke test, do not push yet)
        # Build into the runner's local docker first so the smoke test can
        # actually boot the image. We push :latest + :sha-* only AFTER the
        # smoke test passes — this is the gate that prevents broken images
        # from poisoning :latest. Background: 2026-04-27 outage where the
        # template's adapter.py imported a symbol (RuntimeCapabilities)
        # that the published runtime didn't yet export. The old smoke
        # test only inspected the entrypoint string, so the broken image
        # shipped to GHCR and every workspace provision hung.
        uses: docker/build-push-action@v6
        with:
          context: .
          file: ./Dockerfile
          platforms: linux/amd64
          load: true
          push: false
          tags: ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          # RUNTIME_VERSION is empty by default. When the cascade fires
          # (or workflow_dispatch is invoked with a version), it's the
          # exact runtime version about to be installed. Forwarded as a
          # build-arg so Dockerfiles that declare `ARG RUNTIME_VERSION`
          # get cache-key invalidation per-version. Templates that
          # don't declare the ARG silently ignore it (no breakage).
          build-args: |
            RUNTIME_VERSION=${{ inputs.runtime_version }}
          labels: |
            org.opencontainers.image.source=https://github.com/${{ github.repository }}
            org.opencontainers.image.revision=${{ github.sha }}
            org.opencontainers.image.description=Molecule AI workspace template — ${{ steps.tags.outputs.runtime }} runtime
      - name: Smoke test — boot image and import every /app/*.py
        # The real boot test. Imports every Python module at /app/ inside
        # the image, which exercises:
        #   - adapter.py exists, no syntax errors, all module-level
        #     imports resolve against the pip-installed runtime version
        #     (catches version skew — symbol added to runtime but PyPI
        #     not yet republished, etc.)
        #   - executor.py / cli_executor.py / claude_sdk_executor.py /
        #     etc. — sibling modules adapter.py imports lazily inside
        #     create_executor(). Plain `import adapter` doesn't catch
        #     bugs there because they're behind `def create_executor`.
        #     This bit hermes (a2a-sdk migration) and langgraph
        #     (LangGraphA2AExecutor bare import) on 2026-04-27.
        #   - cross-cutting: any bare `from <runtime_module>` (the lint
        #     above catches these statically; this catches them at
        #     resolution time too, plus any imports of third-party
        #     packages that the lint can't reason about).
        # We bypass the gosu/agent entrypoint with --entrypoint sh
        # because import smoke doesn't need workspace permissions.
        shell: bash
        env:
          IMAGE: ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
        run: |
          set -eu
          docker run --rm --entrypoint sh "${IMAGE}" -c '
            set -e
            cd /app
            for f in *.py; do
              [ "$f" = "__init__.py" ] && continue
              mod="${f%.py}"
              python3 -c "import $mod" || { echo "::error::failed to import $mod"; exit 1; }
              echo " ✓ $mod"
            done
          '
          echo "::notice::✓ ${IMAGE} all /app/*.py modules import cleanly against installed runtime"
      - name: Boot smoke — execute() against stub deps (#2275, task #131)
        # The static import smoke above only IMPORTs /app/*.py — lazy
        # imports buried inside `async def execute(...)` bodies (e.g.
        # `from a2a.types import FilePart`) NEVER evaluate at static-
        # import time. The 2026-04-2x v0→v1 a2a-sdk migration shipped 5
        # such regressions in templates that all looked fine at module-
        # load smoke (claude-code, langgraph, deepagents, gemini-cli,
        # hermes — every one a separate provisioning incident).
        #
        # This step boots the image with MOLECULE_SMOKE_MODE=1, which
        # routes molecule-runtime through smoke_mode.run_executor_smoke()
        # — invokes executor.execute(stub_ctx, stub_queue) once with a
        # short timeout. Healthy import tree → execution proceeds far
        # enough to hit a network boundary and times out (exit 0).
        # Broken lazy import → ImportError/ModuleNotFoundError from
        # inside the executor body (exit 1).
        #
        # Universal turn-smoke (task #131): run_executor_smoke also
        # consults runtime_wedge.is_wedged() at the end of every result
        # path and upgrades a provisional PASS to FAIL when an adapter
        # marked the runtime wedged. Catches PR-25-class regressions
        # (claude-agent-sdk init wedge from a malformed CLI argv) where
        # the SDK takes 60s to time out on `initialize()` — the outer
        # wait_for must outlast that handshake so the adapter's wedge
        # catch arm runs before the smoke gives up. That's why the
        # smoke timeout is 90s (NOT the original 10s) and the outer
        # `timeout` wrapper is 120s (NOT 60s). Lowering either back
        # makes this gate blind to init-wedge bugs again — confirm with
        # an injected wedge in test_smoke_mode.py before changing.
        #
        # Requires runtime >= 0.1.60 (the version that introduced
        # smoke_mode). Older runtimes silently no-op and would hang on
        # uvicorn, so we detect the module first and skip if absent —
        # this lets templates pinned to older runtimes continue to
        # publish without this gate flipping red, while every fresh
        # cascade-triggered build (which forwards the just-published
        # version as RUNTIME_VERSION) gets the gate automatically.
        #
        # Wrapped in `timeout` as a belt-and-suspenders safety net in
        # case smoke_mode itself wedges — runner shouldn't hang
        # indefinitely on a single template.
        shell: bash
        env:
          IMAGE: ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
        run: |
          set -eu
          HAS_SMOKE_MODE=$(docker run --rm --entrypoint sh "${IMAGE}" -c \
            'python3 -c "import molecule_runtime.smoke_mode" >/dev/null 2>&1 && echo yes || echo no')
          if [ "${HAS_SMOKE_MODE}" = "no" ]; then
            echo "::warning::installed runtime predates molecule-core#2275 (no molecule_runtime.smoke_mode); skipping boot smoke. Bump requirements.txt to molecule-ai-workspace-runtime>=0.1.60 to enable."
            exit 0
          fi
          if [ ! -f config.yaml ]; then
            echo "::error::config.yaml not found at repo root — boot smoke needs it to populate /configs. Templates without a config.yaml at root cannot be boot-smoked; either add one or skip this gate by setting an old runtime pin."
            exit 1
          fi
          # Mount the repo's own config.yaml at /configs so the runtime
          # can reach create_executor() — that's where the lazy imports
          # we want to test actually live. The image's entrypoint drops
          # priv from root to agent (uid 1000) before exec'ing
          # molecule-runtime, so /configs needs to be readable AND
          # traversable from uid 1000.
          #
          # Use `a+rX` (capital X — only adds x where it's already
          # executable, i.e. directories): mktemp -d creates the dir
          # with mode 700, so a bare `go+r` would leave the dir
          # un-traversable for agent and config.py would
          # PermissionError on `Path('/configs/config.yaml').exists()`.
          # Mount RW (not :ro) so the entrypoint's `chown -R agent
          # /configs` succeeds — its silent chown failure on a :ro
          # mount was the original symptom.
          SMOKE_CONFIG_DIR=$(mktemp -d)
          cp config.yaml "${SMOKE_CONFIG_DIR}/"
          chmod -R a+rX "${SMOKE_CONFIG_DIR}"
          # Stub credentials — adapters validate shape at create_executor
          # time but the smoke times out before any real call goes out.
          # Set the common ones so any adapter that early-validates a
          # specific key sees a non-empty value.
          #
          # PYTHONPATH=/app mirrors what the platform's provisioner
          # injects at workspace startup (workspace-server/internal/
          # provisioner/provisioner.go:563). Without it,
          # `importlib.import_module('adapter')` in the runtime's
          # preflight check fails with ModuleNotFoundError because
          # molecule-runtime is a console_scripts entry point —
          # sys.path[0] is /usr/local/bin, NOT /app. The existing
          # static import smoke step above doesn't hit this because
          # `python3 -c "import $mod"` adds cwd to sys.path; only the
          # entry-point invocation needs PYTHONPATH.
          set +e
          # MOLECULE_SMOKE_TIMEOUT_SECS=90 is calibrated to outlast
          # claude-agent-sdk's 60s initialize() handshake (see step
          # comment above + workspace/smoke_mode.py top docstring) so
          # adapter wedge catch arms run before run_executor_smoke
          # gives up. Outer `timeout 120` is the runner-level safety
          # net — slightly longer than the inner timeout so a hung
          # smoke_mode itself surfaces as exit 124 and gets a clear
          # error message instead of just `exit 1`.
          timeout 120 docker run --rm \
            -v "${SMOKE_CONFIG_DIR}:/configs" \
            -e WORKSPACE_ID=fake-smoke \
            -e PYTHONPATH=/app \
            -e MOLECULE_SMOKE_MODE=1 \
            -e MOLECULE_SMOKE_TIMEOUT_SECS=90 \
            -e CLAUDE_CODE_OAUTH_TOKEN=sk-fake-smoke-token \
            -e ANTHROPIC_API_KEY=sk-fake-smoke-key \
            -e GEMINI_API_KEY=fake-smoke-key \
            -e OPENAI_API_KEY=sk-fake-smoke-key \
            "${IMAGE}"
          rc=$?
          set -e
          # Cleanup is best-effort: the entrypoint chowns /configs to
          # uid 1000 (agent) inside the container, which propagates to
          # the host bind-mount, leaving the runner user unable to
          # remove the files. Fall back to `sudo rm` and ignore any
          # remaining failure — the runner is ephemeral, /tmp is
          # cleaned automatically post-job.
          rm -rf "${SMOKE_CONFIG_DIR}" 2>/dev/null \
            || sudo rm -rf "${SMOKE_CONFIG_DIR}" 2>/dev/null \
            || true
          if [ "${rc}" -eq 124 ]; then
            echo "::error::boot smoke wedged past 120s — smoke_mode itself failed to terminate (look for blocking calls before MOLECULE_SMOKE_TIMEOUT_SECS fires)"
            exit 1
          fi
          if [ "${rc}" -ne 0 ]; then
            echo "::error::boot smoke failed (exit ${rc}) — executor.execute() raised an import error OR an adapter marked runtime_wedge.is_wedged() (PR-25-class init wedge). Check the container log above for the offending lazy import or wedge reason."
            exit "${rc}"
          fi
          echo "::notice::✓ ${IMAGE} executor.execute() smoke passed (imports healthy, no runtime wedge)"
      - name: Push image to GHCR (post-smoke)
        # Now that the smoke test passed, push both tags. build-push-action
        # reuses the cached build from the load step above, so this is fast
        # — it's effectively a layer push, not a rebuild. Same build-args
        # passed for cache key consistency.
        uses: docker/build-push-action@v6
        with:
          context: .
@ -112,24 +389,9 @@ jobs:
            ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
          cache-from: type=gha
          cache-to: type=gha,mode=max
          build-args: |
            RUNTIME_VERSION=${{ inputs.runtime_version }}
          labels: |
            org.opencontainers.image.source=https://github.com/${{ github.repository }}
            org.opencontainers.image.revision=${{ github.sha }}
            org.opencontainers.image.description=Molecule AI workspace template — ${{ steps.tags.outputs.runtime }} runtime
      - name: Smoke test the pushed image
        # Pull the tag we just pushed and verify the entrypoint is set.
        # Catches "image pushed but binary missing" regressions without a
        # full end-to-end provision test. We don't `docker run` — most
        # templates need platform env (WORKSPACE_ID, PLATFORM_URL, etc.)
        # to actually boot, so inspection is the right layer here.
        shell: bash
        env:
          IMAGE: ${{ steps.tags.outputs.image }}:sha-${{ steps.tags.outputs.sha }}
        run: |
          set -eu
          docker pull "${IMAGE}"
          docker inspect "${IMAGE}" --format '{{.Config.Entrypoint}} {{.Config.Cmd}}' \
            | tee /dev/stderr \
            | grep -qE '.' || { echo "::error::Image has empty entrypoint+cmd"; exit 1; }
          echo "::notice::✓ ${IMAGE} pulled and entrypoint verified"

View File

@ -9,13 +9,23 @@ jobs:
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      # Canonical validator script lives in molecule-ci, fetched fresh on
      # every run. The previous setup expected `.molecule-ci/scripts/` to
      # be vendored INTO each org-template repo, which drifted across the
      # 5 org-template repos as the validator evolved. Single source of
      # truth eliminates that drift class entirely. Mirrors the same
      # pattern already used by validate-workspace-template.yml.
      # Direct git-clone — see validate-plugin.yml for the rationale.
      # Anonymous fetch of public molecule-ci, no actions/checkout idiosyncrasies.
      - name: Fetch molecule-ci canonical scripts
        run: git clone --depth 1 https://git.moleculesai.app/molecule-ai/molecule-ci.git .molecule-ci-canonical
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
          cache-dependency-path: .molecule-ci/scripts/requirements.txt
          cache-dependency-path: .molecule-ci-canonical/.molecule-ci/scripts/requirements.txt
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci/scripts/validate-org-template.py
      - run: python3 .molecule-ci-canonical/.molecule-ci/scripts/validate-org-template.py
      - name: Check for secrets
        run: |
          python3 - << 'PYEOF'
@ -32,7 +42,7 @@ jobs:
              re.compile(r'''ghp_[a-zA-Z0-9]{36,}'''),
              re.compile(r'''sk-ant-[a-zA-Z0-9]{50,}'''),
          ]
          SKIP_DIRS = {'.molecule-ci', '.git', 'node_modules', '__pycache__'}
          SKIP_DIRS = {'.molecule-ci', '.molecule-ci-canonical', '.git', 'node_modules', '__pycache__'}
          EXTENSIONS = {'.yaml', '.yml', '.md', '.py', '.sh'}
          def is_false_positive(line):

View File

@ -9,13 +9,32 @@ jobs:
    timeout-minutes: 10
    steps:
      - uses: actions/checkout@v4
      # Canonical validator script lives in molecule-ci, fetched fresh on
      # every run. The previous setup expected `.molecule-ci/scripts/` to
      # be vendored INTO each plugin repo, which drifted across the
      # 20+ plugin repos as the validator evolved. Single source of
      # truth eliminates that drift class entirely. Mirrors the same
      # pattern already used by validate-workspace-template.yml.
      # Direct git-clone instead of actions/checkout@v4 because:
      #   (a) actions/checkout@v4 sends Authorization: basic <github.token> by default,
      #       and Gitea 404s the cross-repo authenticated request (different from
      #       GitHub which falls back to anon-public-read).
      #   (b) Passing token: '' triggers actions/checkout's runtime "Input required
      #       and not supplied: token" error — the input is documented as
      #       required:false but the action's runtime calls getInput with
      #       required:true on its auth-helper path.
      # Anonymous git clone of public molecule-ci has neither problem.
      # See molecule-ci#1 (lowercase fix) + #2 (token:'' attempt) +
      # the post-merge CI run on plugin-molecule-careful-bash@663bf72.
      - name: Fetch molecule-ci canonical scripts
        run: git clone --depth 1 https://git.moleculesai.app/molecule-ai/molecule-ci.git .molecule-ci-canonical
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
          cache-dependency-path: .molecule-ci/scripts/requirements.txt
          cache-dependency-path: .molecule-ci-canonical/.molecule-ci/scripts/requirements.txt
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci/scripts/validate-plugin.py
      - run: python3 .molecule-ci-canonical/.molecule-ci/scripts/validate-plugin.py
      - name: Check for secrets
        run: |
          python3 - << 'PYEOF'
@ -32,7 +51,7 @@ jobs:
              re.compile(r'''ghp_[a-zA-Z0-9]{36,}'''),
              re.compile(r'''sk-ant-[a-zA-Z0-9]{50,}'''),
          ]
          SKIP_DIRS = {'.molecule-ci', '.git', 'node_modules', '__pycache__'}
          SKIP_DIRS = {'.molecule-ci', '.molecule-ci-canonical', '.git', 'node_modules', '__pycache__'}
          EXTENSIONS = {'.yaml', '.yml', '.md', '.py', '.sh'}
          def is_false_positive(line):

View File

@ -2,23 +2,66 @@ name: Validate Workspace Template
on:
  workflow_call:

# Defense-in-depth on the GITHUB_TOKEN scope. This workflow runs
# untrusted-by-design code from the calling template repo — pip
# installs the template's requirements.txt (post-install hooks),
# imports adapter.py, and `docker build`s the Dockerfile (RUN
# steps). Each of those primitives can execute arbitrary code with
# the token in env. Pinning `contents: read` means the worst a
# malicious template PR can do with the token is read public repo
# state — no write to issues, no push to branches, no comment-spam,
# no workflow re-trigger.
#
# Fork-PR lockdown (#135): the workflow splits into two jobs:
#
#   validate-static  — file-content checks only (secret scan, YAML
#                      parse, AST inspection of adapter.py without
#                      import). Always runs, including external fork
#                      PRs. Safe because no third-party code executes.
#
#   validate-runtime — pip install requirements.txt + import
#                      adapter.py + docker build. SKIPPED on fork
#                      PRs because each step is arbitrary code
#                      execution from the template repo's perspective.
#                      Internal PRs and post-merge runs still get
#                      the full coverage.
#
# What this prevents: a malicious external PR can no longer
# crypto-mine on the runner, DNS-exfiltrate runner metadata, or
# attempt to read GitHub-Actions internal env via a setup.py
# postinstall hook. They still get static feedback (secret scan
# is the most important security check anyway).
#
# What this does NOT prevent: malicious template metadata that
# passes static checks. The runtime job catches those once the PR
# merges (or an internal contributor reposts the change), at which
# point branch protection on staging/main blocks the merge if
# runtime validation fails.
permissions:
  contents: read
jobs:
  validate:
    name: Template validation
  validate-static:
    name: Template validation (static)
    runs-on: ubuntu-latest
    timeout-minutes: 15
    timeout-minutes: 5
    steps:
      # Calling template repo (Dockerfile + config.yaml + adapter.py).
      - uses: actions/checkout@v4
      # Canonical validator script lives in molecule-ci, fetched fresh on
      # every run. The previous setup expected `.molecule-ci/scripts/` to
      # be vendored INTO each template repo, which drifted across the 8
      # template repos as the validator evolved. Single source of truth
      # eliminates that drift class entirely — every template runs the
      # same canonical contract check on every CI run.
      # Direct git-clone — see validate-plugin.yml for the rationale.
      # Anonymous fetch of public molecule-ci, no actions/checkout idiosyncrasies.
      - name: Fetch molecule-ci canonical scripts
        run: git clone --depth 1 https://git.moleculesai.app/molecule-ai/molecule-ci.git .molecule-ci-canonical
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          cache: "pip"
          cache-dependency-path: .molecule-ci/scripts/requirements.txt
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci/scripts/validate-workspace-template.py
      - name: Docker build smoke test
        if: hashFiles('Dockerfile') != ''
        run: docker build -t template-test . --no-cache 2>&1 | tail -5 && echo "✓ Docker build succeeded"
      # Secret scan — the most important check. Always runs.
      - name: Check for secrets
        run: |
          python3 - << 'PYEOF'
@ -68,3 +111,100 @@ jobs:
          else:
              print("::notice::No secrets detected")
          PYEOF
      # Static-only validator — file existence checks, YAML parse,
      # AST inspection of adapter.py (no import). Doesn't execute
      # any third-party code; safe on fork PRs.
      - run: pip install pyyaml -q
      - run: python3 .molecule-ci-canonical/scripts/validate-workspace-template.py --static-only
  validate-runtime:
    name: Template validation (runtime)
    runs-on: ubuntu-latest
    timeout-minutes: 15
    needs: validate-static
    # Skip when the PR comes from a fork — those are external,
    # untrusted, and would let attackers run pip install / docker
    # build / adapter.py import on our runner. Internal PRs (head
    # repo == base repo, fork == false) and push events to internal
    # branches both keep full coverage.
    #
    # github.event.pull_request.head.repo.fork is null for non-PR
    # events (push, schedule, etc.) — defaults to running.
    if: github.event.pull_request.head.repo.fork != true
    steps:
      - uses: actions/checkout@v4
      # Direct git-clone — see validate-plugin.yml for the rationale.
      # Anonymous fetch of public molecule-ci, no actions/checkout idiosyncrasies.
      - name: Fetch molecule-ci canonical scripts
        run: git clone --depth 1 https://git.moleculesai.app/molecule-ai/molecule-ci.git .molecule-ci-canonical
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
          # Cache pip against the calling repo's own requirements.txt
          # (the file we install one step below). Pointing the cache key
          # at the validator's own deps was decorative — pyyaml never
          # changes, so the key never invalidated even when the template
          # added a heavy dep like crewai.
          cache: "pip"
          cache-dependency-path: requirements.txt
      - run: pip install pyyaml -q
      # Install the template's runtime dependencies so the validator's
      # `check_adapter_runtime_load()` can import adapter.py the same way
      # the workspace container does at boot. Without this, a
      # syntactically-valid adapter that ImportErrors on a missing
      # transitive dep would build clean and crash on first user prompt.
      # The fallback (no requirements.txt) installs the runtime alone so
      # BaseAdapter is at least importable for the class-discovery check.
      - if: hashFiles('requirements.txt') != ''
        run: pip install -q -r requirements.txt
      - if: hashFiles('requirements.txt') == ''
        run: pip install -q molecule-ai-workspace-runtime
      # Full validator — includes adapter.py import (exec_module).
      - run: python3 .molecule-ci-canonical/scripts/validate-workspace-template.py
      - name: Docker build smoke test
        if: hashFiles('Dockerfile') != ''
        run: docker build -t template-test . --no-cache 2>&1 | tail -5 && echo "✓ Docker build succeeded"
# Aggregator that emits a single `Template validation` check name —
# the caller's job (`validate:` in each template's ci.yml) plus this
# job's name produces `validate / Template validation`, which is what
# template-repo branch protection has historically required.
#
# Why it's needed: the workflow was refactored from one job into
# validate-static + validate-runtime (with matrix-suffixed display
# names) for fork-PR security. The matrix names never match the
# original required-check name, so PR auto-merge silently hung in
# BLOCKED forever on every template repo (caught while shipping
# fixes for the boot-smoke gate, openclaw#11 + hermes#29).
#
# `if: always()` so it reports out even when validate-static fails —
# without that, GitHub marks the aggregator as SKIPPED and branch
# protection still blocks because the required check never reports
# a final state.
#
# Fork-PR semantics: validate-runtime is intentionally skipped on
# fork PRs (security gate). Treat `skipped` as a pass for the
# aggregator on forks so static-only coverage doesn't make every
# external PR un-mergeable.
  template-validation:
    name: Template validation
    runs-on: ubuntu-latest
    needs: [validate-static, validate-runtime]
    if: always()
    timeout-minutes: 1
    steps:
      - name: Aggregate
        run: |
          static="${{ needs.validate-static.result }}"
          runtime="${{ needs.validate-runtime.result }}"
          echo "validate-static: $static"
          echo "validate-runtime: $runtime"
          if [ "$static" != "success" ]; then
            echo "::error::validate-static did not succeed: $static"
            exit 1
          fi
          if [ "$runtime" != "success" ] && [ "$runtime" != "skipped" ]; then
            echo "::error::validate-runtime did not succeed: $runtime"
            exit 1
          fi
          echo "::notice::Template validation aggregate passed (static=$static, runtime=$runtime)"

9
.gitignore vendored
View File

@ -19,3 +19,12 @@
# Workspace auth tokens
.auth-token
.auth_token
# Python bytecode + caches — never commit. Generated by every test run.
__pycache__/
*.pyc
*.pyo
*.pyd
.pytest_cache/
.mypy_cache/
.ruff_cache/

View File

@ -2,19 +2,47 @@
"""Validate a Molecule AI org template repo."""
import os, sys, yaml
# Support !include and other custom YAML tags used by org templates.
# These resolve at platform load time, not at validation time — we just
# need to parse past them without crashing.
# Support custom YAML tags used by org templates. Two shapes:
#
# - `!include teams/pm.yaml` → scalar string referencing another YAML
#   file in the same repo. Platform inlines at load time.
#
# - `!external\n  repo: ...\n  ref: ...\n  path: ...` → mapping
#   referencing a workspace tree to fetch from another repo. Platform
#   fetches into a content-addressable cache at load time
#   (internal#77 / molecule-core#105).
#
# Both shapes resolve at platform load time, not at validation time.
# The validator treats them as opaque references — it does NOT chase
# them down. We mark each parsed value with a sentinel subtype so the
# `validate_workspace` walk knows to skip them rather than tripping
# the "missing 'name'" branch.
class IncludeRef(str):
"""`!include path/to.yaml` — opaque reference, skipped by validator."""
class ExternalRef(dict):
"""`!external` mapping — opaque reference, skipped by validator."""
class PermissiveLoader(yaml.SafeLoader):
pass
def _include_constructor(loader, node):
return IncludeRef(loader.construct_scalar(node))
def _external_constructor(loader, node):
return ExternalRef(loader.construct_mapping(node))
def _generic_constructor(loader, tag_suffix, node):
# Fallback for unknown tags. Preserve the parsed shape so legacy
# docs that lean on tags we have not modeled yet still parse.
if isinstance(node, yaml.MappingNode):
return loader.construct_mapping(node)
if isinstance(node, yaml.SequenceNode):
return loader.construct_sequence(node)
return loader.construct_scalar(node)
PermissiveLoader.add_constructor("!include", _include_constructor)
PermissiveLoader.add_constructor("!external", _external_constructor)
PermissiveLoader.add_multi_constructor("!", _generic_constructor)
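
# Illustration (input shapes assumed, not taken from a real org.yaml):
# with this loader, `yaml.load("- !include teams/pm.yaml",
# Loader=PermissiveLoader)` yields [IncludeRef('teams/pm.yaml')], and an
# `!external` mapping node comes back as an ExternalRef dict. These
# sentinel-typed values are what the validate_workspace walk below
# recognises and skips.
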
errors = []
@@ -33,7 +61,13 @@ if not org.get("workspaces") and not org.get("defaults"):
    errors.append("org.yaml must have at least 'workspaces' or 'defaults'")

def validate_workspace(ws, path=""):
    # !include tags resolve to strings at parse time; skip non-dicts
    # `!include path/to.yaml` parses as IncludeRef (str subclass).
    # `!external {repo, ref, path}` parses as ExternalRef (dict subclass).
    # Both are opaque references — skip without chasing.
    if isinstance(ws, (IncludeRef, ExternalRef)):
        return []
    # Legacy unknown-tag scalars (handled by _generic_constructor) stay
    # as plain strings; they are not workspace dicts either.
    if not isinstance(ws, dict):
        return []
    ws_errors = []
ws_errors = []
@@ -59,6 +93,11 @@ if errors:

def count_ws(nodes):
    c = 0
    for n in nodes:
        # Skip opaque references — we do not know how many workspaces
        # they expand to without resolving them, and resolution is the
        # platform's job, not the validator's.
        if isinstance(n, (IncludeRef, ExternalRef)):
            continue
        if not isinstance(n, dict):
            continue
        c += 1
@@ -66,4 +105,4 @@ def count_ws(nodes):
    return c

total = count_ws(org.get("workspaces", []))
print(f"✓ org.yaml valid: {org['name']} ({total} workspaces)")
print(f"✓ org.yaml valid: {org['name']} ({total} direct workspaces; external refs not counted)")


@@ -1,47 +0,0 @@
#!/usr/bin/env python3
"""Validate a Molecule AI workspace template repo."""
import os, sys, yaml

errors = []

if not os.path.isfile("config.yaml"):
    print("::error::config.yaml not found at repo root")
    sys.exit(1)

with open("config.yaml") as f:
    config = yaml.safe_load(f)

if not config.get("name"):
    errors.append("Missing required field: name")
if not config.get("runtime"):
    errors.append("Missing required field: runtime")

known = {"langgraph", "claude-code", "crewai", "autogen", "deepagents", "hermes", "gemini-cli", "openclaw"}
runtime = config.get("runtime", "")
if runtime and runtime not in known:
    print(f"::warning::Runtime '{runtime}' is not in the known set. OK for custom runtimes.")

# Check for legacy imports
if os.path.isfile("adapter.py"):
    with open("adapter.py") as f:
        content = f.read()
    if "molecule_runtime" in content:
        print("::warning::adapter.py imports 'molecule_runtime' — legacy import, use 'molecule_ai' or platform SDK")

# Check for missing molecule-ai-workspace-runtime dependency hint
if os.path.isfile("Dockerfile"):
    with open("Dockerfile") as f:
        content = f.read()
    if "molecule-ai-workspace-runtime" not in content:
        print("::warning::Dockerfile does not reference 'molecule-ai-workspace-runtime' — may need base runtime package")

sv = config.get("template_schema_version")
if sv is None:
    errors.append("Missing template_schema_version (add: template_schema_version: 1)")

if errors:
    for e in errors:
        print(f"::error::{e}")
    sys.exit(1)

print(f"✓ config.yaml valid: {config['name']} (runtime: {config.get('runtime')})")


@@ -12,7 +12,7 @@ name: CI
on: [push, pull_request]
jobs:
  validate:
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-plugin.yml@main
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-plugin.yml@v1
```
### Workspace template repos (`molecule-ai-workspace-template-*`)
@@ -23,7 +23,7 @@ name: CI
on: [push, pull_request]
jobs:
  validate:
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@main
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@v1
```
### Org template repos (`molecule-ai-org-template-*`)
@@ -34,9 +34,28 @@ name: CI
on: [push, pull_request]
jobs:
  validate:
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-org-template.yml@main
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-org-template.yml@v1
```

### Any repo with auto-merge enabled

PR-time guards (currently: disable auto-merge on follow-up push). Consume from a thin caller:

```yaml
# .github/workflows/pr-guards.yml
name: pr-guards
on:
  pull_request:
    types: [synchronize]
permissions:
  pull-requests: write
jobs:
  disable-auto-merge-on-push:
    uses: Molecule-AI/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@v1
```
When the team lands more PR-time guards in this repo, add them as additional jobs in the same caller — keeps each consuming repo's footprint to one file.
## What each workflow validates
### validate-plugin
@@ -74,6 +93,21 @@ jobs:
| `template_schema_version` present | Warning | Missing version contract |
| No committed secrets | Error | Leaked API keys |
### disable-auto-merge-on-push
PR-time safety guard. When `pull_request:synchronize` fires (= a new commit pushed to an open PR) and auto-merge is already enabled, this workflow disables auto-merge and posts a comment requiring the operator to re-engage explicitly.
**Why it exists:** on 2026-04-27, molecule-core PR #2174 auto-merged with only its first commit because the second commit was pushed AFTER the merge queue had locked the PR's SHA. The second commit ended up orphaned on a merged-and-deleted branch.
**Pairs with the org-wide repo setting** "Automatically delete head branches" (already enabled on all 10 Molecule-AI repos). Defense in depth:
1. Repo setting blocks pushes to a merged-and-deleted branch (catches the post-merge orphan case).
2. This workflow catches the in-queue race (push during queue processing) by force-disabling auto-merge.
Together they cover the full lifecycle of "auto-merge enabled → new commits arrive" without operator discipline.
**False-positive note:** if a CI bot pushes (dependency update, secret rotation), this also disables auto-merge. That's intentional — the operator who originally enabled auto-merge gets notified and re-engages, which is exactly the verify-after-machine-edits behavior we want.
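
For orientation, the guard's core logic amounts to roughly the following (a sketch, assuming the `gh` CLI on the runner and the `pull-requests: write` token from the caller; the canonical implementation is `disable-auto-merge-on-push.yml` in this repo):

```yaml
steps:
  - name: Disable auto-merge if enabled
    env:
      GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
      PR: ${{ github.event.pull_request.number }}
    run: |
      # Only act when auto-merge is actually enabled on the PR.
      enabled=$(gh pr view "$PR" --repo "$GITHUB_REPOSITORY" \
        --json autoMergeRequest --jq '.autoMergeRequest != null')
      if [ "$enabled" = "true" ]; then
        gh pr merge "$PR" --repo "$GITHUB_REPOSITORY" --disable-auto
        gh pr comment "$PR" --repo "$GITHUB_REPOSITORY" \
          --body "Auto-merge disabled: new commits were pushed after it was enabled. Re-enable explicitly after review."
      fi
```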
## License
Business Source License 1.1 — © Molecule AI.

docs/template-contract.md

@@ -0,0 +1,67 @@
# Workspace Template Contract
Hard rules every `molecule-ai-workspace-template-*` repo must satisfy. Enforced by `scripts/validate-workspace-template.py` on every CI run via the reusable `validate-workspace-template.yml` workflow.
The contract exists because the 8 template repos were extracted from a single monolithic Dockerfile pre-#87, and have drifted as each was edited piecemeal since. Without this gate, a 28-line cascade-friendly Dockerfile in one repo silently regresses to a 25-line non-cache-friendly one in another, and the next runtime publish ships the previous wheel from a stale layer (cache trap observed five times in a row on 2026-04-27).
## Dockerfile
| Rule | Why |
|---|---|
| `FROM python:3.11-slim` | Single base everywhere — keeps apt + pip behaviour identical and lets us reason about CVE patches on one base. |
| `ARG RUNTIME_VERSION=` declared | The arg invalidates the pip-install layer's cache key whenever the cascade publishes a new wheel. Without it the cache hit replays the previous runtime. |
| `${RUNTIME_VERSION}` referenced in a `RUN` | Just declaring the ARG isn't enough — it has to be in the layer's command line so docker hashes it. Pattern: `if [ -n "${RUNTIME_VERSION}" ]; then pip install --no-cache-dir --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; fi` |
| `RUN useradd -u 1000 -m -s /bin/bash agent` | The runtime drops to uid 1000 before exec'ing the SDK. Claude Code refuses `--dangerously-skip-permissions` as root for safety. The `/workspace` volume is also chown'd to 1000 by the platform provisioner. |
| `ENTRYPOINT ["molecule-runtime"]` *or* a wrapper script that exec's `molecule-runtime` | Single entrypoint means the platform's container-restart contract is uniform across templates. Wrapper scripts are allowed (claude-code has `entrypoint.sh` for gosu drop-priv; hermes has `start.sh` to boot the hermes-agent daemon first). |
| `molecule-ai-workspace-runtime` listed in `requirements.txt` (or installed in the Dockerfile directly) | The runtime wheel is the contract — without it the container has no A2A server, no heartbeat, no MCP bridge. |
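
Put together, a minimal Dockerfile satisfying every row above looks roughly like this (WORKDIR and the pip layering are illustrative, not mandated; the required pieces are the base image, the ARG, the `${RUNTIME_VERSION}` RUN reference, the `agent` user, and the entrypoint):

```dockerfile
FROM python:3.11-slim

# Cache-buster: the cascade passes a new value on each runtime publish.
ARG RUNTIME_VERSION=

RUN useradd -u 1000 -m -s /bin/bash agent

WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt && \
    if [ -n "${RUNTIME_VERSION}" ]; then \
      pip install --no-cache-dir --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; \
    fi

ENTRYPOINT ["molecule-runtime"]
```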
## config.yaml

| Required key | Type | Notes |
|---|---|---|
| `name` | str | Human-readable; appears on the canvas card. |
| `runtime` | str | Must be one of: `langgraph`, `claude-code`, `crewai`, `autogen`, `deepagents`, `hermes`, `gemini-cli`, `openclaw`. Custom runtimes warn but are allowed. |
| `template_schema_version` | int | Currently `1`. Bump when adding a key that changes how the platform consumes config.yaml. **Must be int**, not string — a quoted `"1"` will fail validation. |

| Optional key | Notes |
|---|---|
| `description` | Free text, surfaces on canvas. |
| `version`, `tier` | int, controls platform-side rollout gating. |
| `model`, `models` | Either a single model id or a list of model ids the agent may use. |
| `runtime_config` | Nested block of runtime-specific settings (used by claude-code, gemini-cli, hermes). |
| `env`, `skills`, `tools`, `a2a`, `delegation`, `prompt_files`, `bridge`, `governance` | Optional feature blocks. Add new keys to `OPTIONAL_KEYS` in the validator when introducing them. |

Unknown top-level keys produce a warning (not an error) so accidental drift is visible without blocking.
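
A minimal config.yaml that satisfies the v1 contract (values illustrative):

```yaml
name: my-template            # required, human-readable
runtime: claude-code         # required; unknown runtimes warn only
template_schema_version: 1   # required int; a quoted "1" fails
description: Example workspace template
tier: 1
```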
## adapter.py
Optional. When present, `adapter.py` should:
- Import `BaseAdapter` from `molecule_runtime.adapter_base`.
- Override `setup()` and `create_executor()` for the runtime's specific entry point.
The pre-#87 import path (`molecule_ai`) produces a warning if it appears.
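
A skeletal adapter.py in that shape (a sketch only; class name and method bodies are placeholders, and the import path follows the bullet above):

```python
from molecule_runtime.adapter_base import BaseAdapter

class MyRuntimeAdapter(BaseAdapter):
    def setup(self, config):
        # runtime-specific initialisation (credentials, env, caches)
        ...

    def create_executor(self, config):
        # return the entry point the runtime will drive
        ...
```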
## requirements.txt
Must declare `molecule-ai-workspace-runtime` (with a version pin or floor).
## CI
Every template repo's `.github/workflows/ci.yml` should be a one-liner that calls the canonical reusable workflow:
```yaml
name: CI
on: [push, pull_request]
jobs:
  validate:
    uses: Molecule-AI/molecule-ci/.github/workflows/validate-workspace-template.yml@v1
```
The reusable workflow checks out `molecule-ci` itself (into `.molecule-ci-canonical`) and runs the canonical `validate-workspace-template.py` from there — so no per-repo vendoring of the script is needed. The legacy `.molecule-ci/scripts/` directory in each template repo is being phased out.
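
Sketched out, the relevant steps inside the reusable workflow look something like this (checkout action version and step names are assumptions, not part of the contract):

```yaml
steps:
  - uses: actions/checkout@v4          # the template repo under test
  - uses: actions/checkout@v4          # canonical validator source
    with:
      repository: Molecule-AI/molecule-ci
      path: .molecule-ci-canonical
  - run: python3 .molecule-ci-canonical/scripts/validate-workspace-template.py
```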
## Adding a new runtime
1. Add the runtime name to `KNOWN_RUNTIMES` in `scripts/validate-workspace-template.py`.
2. Add the runtime + image ref to `RuntimeImages` in `molecule-core/workspace-server/internal/provisioner/provisioner.go`.
3. Stand up the `molecule-ai-workspace-template-<runtime>` repo from the existing template-of-templates pattern (issue #105 covers this).
4. Confirm CI green on the new repo before opening it for general use.

scripts/migrate-template.py

@@ -0,0 +1,224 @@
#!/usr/bin/env python3
"""Migrate a workspace template's config.yaml across schema versions.

Companion to validate-workspace-template.py. Whenever the validator
adds a new schema version, this script gets a corresponding entry in
MIGRATIONS so each consumer template can mechanically upgrade rather
than every maintainer figuring out the field changes by hand.

Discipline (matches the validator's header):

1. Validator gets a SCHEMA_V<N+1> block + KNOWN_SCHEMA_VERSIONS bump.
2. This script gets `MIGRATIONS[N]` defined: a function that takes
   a v<N> dict and returns a v<N+1> dict. Pure, deterministic, no
   I/O; that way migrations compose: v1 → v2 → v3 just chains them.
3. Each migration is FROZEN once shipped. If a v2 migration needs
   fixing post-ship, ship it as v3 with the corrective migration.
4. Consumers run this script (one PR per template repo) before the
   deprecation window for v<N> closes.

Usage:

    # Migrate the template in cwd from its current version to the latest
    python3 scripts/migrate-template.py .

    # Migrate to a specific version (bounded; useful when a deprecation
    # window is closing and you want to skip ahead)
    python3 scripts/migrate-template.py --to 3 .

    # Force the source version (override config.yaml's declared version)
    python3 scripts/migrate-template.py --from 1 --to 2 .

    # Dry-run: print the migrated YAML without writing
    python3 scripts/migrate-template.py --dry-run .

YAML round-trip caveats:

- PyYAML's safe_dump is used for output. Comments + anchor/alias
  forms in the consumer's config.yaml are NOT preserved across
  migrations; the migrated file is a clean re-emit. Templates
  rarely have inline comments in config.yaml; on the rare occasion
  they do, the maintainer needs to re-add them after migration.
- Keys are sorted alphabetically on output. This trades a one-time
  re-ordering diff (reviewable) for stable diffs across future
  migrations.
- Migrations should ONLY mutate keys they're explicitly versioning;
  leave everything else alone so a consumer template's
  customizations survive.

A future enhancement could detect comments in the original file and
opt into ruamel.yaml for round-trip-preserving emission. Not done
today; flag in the migrator's stderr if comments are detected so the
maintainer knows what they're losing.
"""
from __future__ import annotations

import argparse
import sys
from copy import deepcopy
from pathlib import Path
from typing import Callable

import yaml

# ──────────────────────────────────────────── migrations registry
# Each entry maps a SOURCE version to the function that produces the
# next version's dict. Currently empty — no v2 yet. The first time a
# real schema bump lands, MIGRATIONS[1] gets defined alongside the
# validator's SCHEMA_V2 block.
MIGRATIONS: dict[int, Callable[[dict], dict]] = {}
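
# Example of the shape a real entry will take (hypothetical: no v2
# schema exists yet, and the `model` → `models` rename is invented
# purely for illustration):
#
#     def _v1_to_v2(config: dict) -> dict:
#         out = dict(config)
#         if "model" in out:
#             out["models"] = [out.pop("model")]
#         out["template_schema_version"] = 2
#         return out
#
#     MIGRATIONS[1] = _v1_to_v2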
# ──────────────────────────────────────────── version detection
def _detect_current_version(config: dict) -> int:
    sv = config.get("template_schema_version")
    if sv is None:
        sys.exit(
            "error: config.yaml has no `template_schema_version`. "
            "Add it (likely 1 for legacy templates) before migrating."
        )
    if not isinstance(sv, int):
        sys.exit(
            f"error: template_schema_version must be int, got "
            f"{type(sv).__name__}={sv!r}."
        )
    return sv

def _latest_known_version() -> int:
    """Maximum version reachable by chaining MIGRATIONS from any
    starting point. With an empty registry, this is 1 (the floor:
    every existing template is at v1)."""
    if not MIGRATIONS:
        return 1
    return max(MIGRATIONS.keys()) + 1
# ──────────────────────────────────────────── core
def migrate_config(config: dict, from_version: int, to_version: int) -> dict:
    """Apply migrations sequentially from `from_version` to `to_version`.

    Returns a NEW dict; does not mutate the input.

    Errors loudly when there's no migration registered for an
    intermediate step: forward-only, never silently skip a hop. If the
    user asks for a backward migration, error too: schema versions
    are append-only and we don't ship downgrades."""
    if to_version < from_version:
        sys.exit(
            f"error: cannot migrate backward (from v{from_version} to "
            f"v{to_version}). Schema versions are append-only — file a "
            f"new bug + ship a forward migration instead."
        )
    current = from_version
    out = deepcopy(config)
    while current < to_version:
        step = MIGRATIONS.get(current)
        if step is None:
            sys.exit(
                f"error: no migration registered for v{current} → "
                f"v{current + 1}. Either add it to MIGRATIONS in "
                f"scripts/migrate-template.py or pick a different --to."
            )
        out = step(out)
        # Every migration MUST stamp the new version on its output —
        # this assertion catches a class of bugs where a migration
        # forgets to bump template_schema_version.
        if out.get("template_schema_version") != current + 1:
            sys.exit(
                f"error: MIGRATIONS[{current}] did not stamp "
                f"template_schema_version={current + 1} on its output. "
                f"This is a bug in the migration function itself."
            )
        current += 1
    return out
def _read_yaml(path: Path) -> dict:
    with open(path) as f:
        data = yaml.safe_load(f)
    if not isinstance(data, dict):
        sys.exit(f"error: {path} root is not a mapping (got {type(data).__name__})")
    return data

def _write_yaml(path: Path, data: dict) -> None:
    # Sort keys for stable diffs across migrations. This matches what
    # `yaml.safe_dump` does when we write — consumer repos with
    # custom orderings will see their config.yaml re-ordered, which is
    # one of those round-trip lossy tradeoffs that's worth accepting:
    # the migration moment is rare and the diff is reviewable.
    with open(path, "w") as f:
        yaml.safe_dump(data, f, sort_keys=True, default_flow_style=False)
# ──────────────────────────────────────────── CLI
def main(argv: list[str] | None = None) -> int:
    parser = argparse.ArgumentParser(
        description="Migrate a workspace template's config.yaml across schema versions."
    )
    parser.add_argument(
        "template_dir",
        type=Path,
        help="Path to the template repo root (must contain config.yaml).",
    )
    parser.add_argument(
        "--from",
        dest="from_version",
        type=int,
        default=None,
        help="Source schema version (defaults to whatever config.yaml declares).",
    )
    parser.add_argument(
        "--to",
        dest="to_version",
        type=int,
        default=None,
        help="Target schema version (defaults to the highest reachable from MIGRATIONS).",
    )
    parser.add_argument(
        "--dry-run",
        action="store_true",
        help="Print the migrated YAML to stdout without modifying the file.",
    )
    args = parser.parse_args(argv)

    config_path = args.template_dir / "config.yaml"
    if not config_path.is_file():
        sys.exit(f"error: {config_path} does not exist")
    config = _read_yaml(config_path)

    from_version = args.from_version
    if from_version is None:
        from_version = _detect_current_version(config)
    to_version = args.to_version
    if to_version is None:
        to_version = _latest_known_version()

    if from_version == to_version:
        print(
            f"nothing to do: config.yaml is already at v{from_version}.",
            file=sys.stderr,
        )
        return 0

    migrated = migrate_config(config, from_version, to_version)

    if args.dry_run:
        yaml.safe_dump(migrated, sys.stdout, sort_keys=True, default_flow_style=False)
        return 0

    _write_yaml(config_path, migrated)
    print(
        f"✓ migrated {config_path} from v{from_version} → v{to_version}",
        file=sys.stderr,
    )
    return 0

if __name__ == "__main__":
    sys.exit(main())


@@ -0,0 +1,242 @@
"""Tests for migrate-template.py — pin the migration framework's
behavior so the FIRST real schema bump (the one that proves the system
end-to-end) doesn't have to discover semantics under deadline pressure.

The MIGRATIONS registry is empty today (we have only v1), so most
tests register a synthetic migration scoped to the test, exercise the
machinery, and unregister at teardown. This way the framework's
contract is locked in even before any real migration ships.
"""
from __future__ import annotations

import importlib.util
from pathlib import Path

import pytest

MIGRATOR_PATH = Path(__file__).resolve().parent / "migrate-template.py"

def _load_migrator():
    """Load migrate-template.py by path (its filename has a hyphen so
    we can't `import migrate-template` directly)."""
    spec = importlib.util.spec_from_file_location("migrator", MIGRATOR_PATH)
    assert spec is not None and spec.loader is not None
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod

@pytest.fixture
def migrator():
    """Fresh migrator module per test. Registry is global module
    state; tests that register synthetic migrations must clean up."""
    mod = _load_migrator()
    # Snapshot + restore MIGRATIONS so accidentally-leaked entries
    # from one test don't poison the next.
    snapshot = dict(mod.MIGRATIONS)
    yield mod
    mod.MIGRATIONS.clear()
    mod.MIGRATIONS.update(snapshot)

def _v1_template_config() -> dict:
    return {
        "name": "test-template",
        "runtime": "claude-code",
        "template_schema_version": 1,
        "description": "fixture",
        "tier": 1,
    }
# ─────────────────────────────────────── version detection
def test_detect_current_version_from_config(migrator):
    config = _v1_template_config()
    assert migrator._detect_current_version(config) == 1

def test_detect_missing_version_exits(migrator):
    config = {"name": "t", "runtime": "claude-code"}
    with pytest.raises(SystemExit) as exc:
        migrator._detect_current_version(config)
    assert "no `template_schema_version`" in str(exc.value)

def test_detect_non_int_version_exits(migrator):
    config = {"name": "t", "runtime": "claude-code", "template_schema_version": "1"}
    with pytest.raises(SystemExit) as exc:
        migrator._detect_current_version(config)
    assert "must be int" in str(exc.value)

# ─────────────────────────────────────── latest-version reachability
def test_latest_with_empty_registry_is_v1(migrator):
    """Floor case: every existing template is v1 even when no
    migrations are registered. Latest reachable = v1, so a no-op
    migration is the only valid action."""
    migrator.MIGRATIONS.clear()
    assert migrator._latest_known_version() == 1

def test_latest_with_one_migration_is_v2(migrator):
    """Adding a v1 → v2 migration moves the ceiling to v2. This is
    what happens the first time a real schema bump ships."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2}
    assert migrator._latest_known_version() == 2

def test_latest_chains_through_multiple_migrations(migrator):
    """Multi-step ceiling: v1 → v2 → v3 chain produces ceiling=3."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2}
    migrator.MIGRATIONS[2] = lambda c: {**c, "template_schema_version": 3}
    assert migrator._latest_known_version() == 3
# ─────────────────────────────────────── migrate_config core
def test_migrate_no_op_when_versions_match(migrator):
    """from == to → no migration step runs. Should not require any
    MIGRATIONS entry to be defined."""
    migrator.MIGRATIONS.clear()
    original = _v1_template_config()
    out = migrator.migrate_config(original, 1, 1)
    assert out == original
    assert out is not original  # deep-copied, not aliased

def test_migrate_one_step_applies_function(migrator):
    """v1 → v2 with a registered migration produces the expected
    output and stamps the new version."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2, "added_in_v2": True}
    out = migrator.migrate_config(_v1_template_config(), 1, 2)
    assert out["template_schema_version"] == 2
    assert out["added_in_v2"] is True
    # Pre-existing keys preserved.
    assert out["name"] == "test-template"

def test_migrate_chains_v1_to_v3(migrator):
    """Two-step migration: v1 → v2 → v3. Each step applies in order."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2, "from_v1": True}
    migrator.MIGRATIONS[2] = lambda c: {**c, "template_schema_version": 3, "from_v2": True}
    out = migrator.migrate_config(_v1_template_config(), 1, 3)
    assert out["template_schema_version"] == 3
    assert out["from_v1"] is True
    assert out["from_v2"] is True

def test_migrate_missing_step_exits(migrator):
    """If MIGRATIONS lacks the v<current> step, fail loud rather than
    silently skip the version. Forward-only, never silent skip."""
    migrator.MIGRATIONS.clear()
    # No MIGRATIONS[1] registered.
    with pytest.raises(SystemExit) as exc:
        migrator.migrate_config(_v1_template_config(), 1, 2)
    assert "no migration registered for v1 → v2" in str(exc.value)

def test_migrate_backward_exits(migrator):
    """Schema versions are append-only. Asking for v2 → v1 must
    error, not silently downgrade."""
    migrator.MIGRATIONS.clear()
    config = {**_v1_template_config(), "template_schema_version": 2}
    with pytest.raises(SystemExit) as exc:
        migrator.migrate_config(config, 2, 1)
    assert "cannot migrate backward" in str(exc.value)

def test_migration_must_stamp_new_version(migrator):
    """A migration function that forgets to bump
    `template_schema_version` is a bug; catch it at apply time so
    the framework can never produce an inconsistent output."""
    migrator.MIGRATIONS.clear()
    # Buggy migration: doesn't update the version field.
    migrator.MIGRATIONS[1] = lambda c: {**c, "added_in_v2": True}
    with pytest.raises(SystemExit) as exc:
        migrator.migrate_config(_v1_template_config(), 1, 2)
    assert "did not stamp template_schema_version=2" in str(exc.value)

def test_migrate_does_not_mutate_input(migrator):
    """migrate_config returns a fresh dict; the caller's input is
    untouched. Pin this so a shared-state migration can't accidentally
    poison the caller's view of the original template."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2}
    original = _v1_template_config()
    snapshot = dict(original)
    _ = migrator.migrate_config(original, 1, 2)
    assert original == snapshot
# ─────────────────────────────────────── CLI smoke
def test_cli_writes_migrated_yaml(migrator, tmp_path):
    """End-to-end: --to migrates the file in place and exits 0."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2, "added": "v2-marker"}
    cfg = tmp_path / "config.yaml"
    cfg.write_text(
        "name: t\n"
        "runtime: claude-code\n"
        "template_schema_version: 1\n"
    )
    rc = migrator.main([str(tmp_path), "--to", "2"])
    assert rc == 0
    written = cfg.read_text()
    assert "template_schema_version: 2" in written
    assert "added: v2-marker" in written

def test_cli_dry_run_does_not_modify_file(migrator, tmp_path, capsys):
    """--dry-run prints the migrated YAML to stdout but leaves the
    on-disk file untouched."""
    migrator.MIGRATIONS.clear()
    migrator.MIGRATIONS[1] = lambda c: {**c, "template_schema_version": 2}
    cfg = tmp_path / "config.yaml"
    cfg.write_text(
        "name: t\n"
        "runtime: claude-code\n"
        "template_schema_version: 1\n"
    )
    original_disk = cfg.read_text()
    rc = migrator.main([str(tmp_path), "--to", "2", "--dry-run"])
    assert rc == 0
    assert cfg.read_text() == original_disk  # untouched
    captured = capsys.readouterr()
    assert "template_schema_version: 2" in captured.out

def test_cli_no_op_when_already_at_target(migrator, tmp_path, capsys):
    """If the template is already at the target version, exit 0
    without modifying the file. Not an error; common when running
    the migration script defensively in CI."""
    migrator.MIGRATIONS.clear()
    cfg = tmp_path / "config.yaml"
    cfg.write_text(
        "name: t\n"
        "runtime: claude-code\n"
        "template_schema_version: 1\n"
    )
    original = cfg.read_text()
    rc = migrator.main([str(tmp_path), "--to", "1"])
    assert rc == 0
    assert cfg.read_text() == original

def test_cli_missing_config_exits(migrator, tmp_path):
    """If the target dir has no config.yaml, error clearly rather
    than try to apply migrations to nothing."""
    with pytest.raises(SystemExit) as exc:
        migrator.main([str(tmp_path), "--to", "2"])
    assert "config.yaml" in str(exc.value) and "does not exist" in str(exc.value)


@@ -0,0 +1,686 @@
"""Tests for validate-workspace-template.py — pin the drift contract.

Each test materialises a tiny template directory in a tmpdir, runs the
validator's check functions in-process, and asserts on the captured
ERRORS / WARNINGS lists. The 8 template repos in the wild are the
ground-truth integration test (CI runs this validator against each on
push), but those repos can change at any time. These tests pin the
contract itself so a refactor of the validator can't silently weaken
it.

Important: the validator was chosen to be import-safe (no top-level
side effects), so the test patches the cwd via os.chdir into tmpdirs.
The module's ERRORS/WARNINGS lists are reset at the start of each
test via _reset_validator_state().
"""
from __future__ import annotations

import importlib.util
import os
from pathlib import Path

import pytest

VALIDATOR_PATH = Path(__file__).resolve().parent / "validate-workspace-template.py"

def _load_validator():
    """Load the validator module by path (its filename has a hyphen so
    we can't `import validate-workspace-template` directly)."""
    spec = importlib.util.spec_from_file_location("validator", VALIDATOR_PATH)
    assert spec is not None and spec.loader is not None
    mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(mod)
    return mod

@pytest.fixture
def validator(monkeypatch):
    """Fresh validator module per test, cwd pinned to tmpdir below."""
    mod = _load_validator()
    mod.ERRORS.clear()
    mod.WARNINGS.clear()
    return mod

def _good_dockerfile() -> str:
    """Canonical Dockerfile that should pass every check."""
    return (
        "FROM python:3.11-slim\n"
        "ARG RUNTIME_VERSION=\n"
        "RUN useradd -u 1000 -m -s /bin/bash agent\n"
        "WORKDIR /app\n"
        "COPY requirements.txt .\n"
        "RUN pip install -r requirements.txt && \\\n"
        '    if [ -n "${RUNTIME_VERSION}" ]; then \\\n'
        '      pip install --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; \\\n'
        "    fi\n"
        'ENTRYPOINT ["molecule-runtime"]\n'
    )

def _good_config_yaml() -> str:
    return (
        "name: test-template\n"
        "runtime: claude-code\n"
        "template_schema_version: 1\n"
        "description: A test template\n"
        "tier: 1\n"
    )

def _good_requirements_txt() -> str:
    return "molecule-ai-workspace-runtime>=0.1.0\n"

def _materialise(tmp_path: Path, dockerfile: str | None = None,
                 config_yaml: str | None = None,
                 requirements: str | None = None,
                 adapter_py: str | None = None) -> None:
    if dockerfile is not None:
        (tmp_path / "Dockerfile").write_text(dockerfile)
    if config_yaml is not None:
        (tmp_path / "config.yaml").write_text(config_yaml)
    if requirements is not None:
        (tmp_path / "requirements.txt").write_text(requirements)
    if adapter_py is not None:
        (tmp_path / "adapter.py").write_text(adapter_py)
# ───────────────────────────────────────────────────────── happy paths
def test_canonical_template_passes(validator, tmp_path, monkeypatch):
    _materialise(
        tmp_path,
        dockerfile=_good_dockerfile(),
        config_yaml=_good_config_yaml(),
        requirements=_good_requirements_txt(),
    )
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    validator.check_config_yaml()
    validator.check_requirements()
    validator.check_adapter()
    assert validator.ERRORS == [], validator.ERRORS

def test_custom_entrypoint_script_passes_when_it_execs_runtime(validator, tmp_path, monkeypatch):
    """claude-code style: ENTRYPOINT [/entrypoint.sh] + entrypoint.sh
    that exec's molecule-runtime at the end. Must pass."""
    df = (
        "FROM python:3.11-slim\n"
        "ARG RUNTIME_VERSION=\n"
        "RUN useradd -u 1000 -m -s /bin/bash agent\n"
        "COPY requirements.txt .\n"
        "RUN pip install -r requirements.txt && \\\n"
        '    if [ -n "${RUNTIME_VERSION}" ]; then \\\n'
        '      pip install --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; \\\n'
        "    fi\n"
        "COPY entrypoint.sh /entrypoint.sh\n"
        'ENTRYPOINT ["/entrypoint.sh"]\n'
    )
    ep = (
        "#!/bin/sh\n"
        "set -e\n"
        "# drop privileges then exec the runtime\n"
        'exec gosu agent molecule-runtime "$@"\n'
    )
    _materialise(
        tmp_path,
        dockerfile=df,
        config_yaml=_good_config_yaml(),
        requirements=_good_requirements_txt(),
    )
    (tmp_path / "entrypoint.sh").write_text(ep)
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert validator.ERRORS == [], validator.ERRORS
# ───────────────────────────────────────────────────────── Dockerfile drift
def test_wrong_base_image_errors(validator, tmp_path, monkeypatch):
    df = _good_dockerfile().replace("python:3.11-slim", "python:3.10-alpine")
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("FROM python:3.11-slim" in e for e in validator.ERRORS)

def test_missing_arg_runtime_version_errors(validator, tmp_path, monkeypatch):
    """Without ARG RUNTIME_VERSION, the cascade rebuild silently ships
    the previous runtime: the cache trap that bit us 5x on 2026-04-27."""
    df = _good_dockerfile().replace("ARG RUNTIME_VERSION=\n", "")
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("ARG RUNTIME_VERSION" in e for e in validator.ERRORS)

def test_missing_runtime_version_in_run_block_errors(validator, tmp_path, monkeypatch):
    """ARG declared but NEVER referenced in a RUN — same cache-trap,
    different shape. Pin both."""
    df = (
        "FROM python:3.11-slim\n"
        "ARG RUNTIME_VERSION=\n"
        "RUN useradd -u 1000 -m -s /bin/bash agent\n"
        "RUN pip install molecule-ai-workspace-runtime\n"
        'ENTRYPOINT ["molecule-runtime"]\n'
    )
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("RUNTIME_VERSION" in e and "RUN block" in e for e in validator.ERRORS)

def test_missing_agent_user_errors(validator, tmp_path, monkeypatch):
    df = _good_dockerfile().replace("RUN useradd -u 1000 -m -s /bin/bash agent\n", "")
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("agent" in e for e in validator.ERRORS)

def test_missing_entrypoint_errors(validator, tmp_path, monkeypatch):
    df = _good_dockerfile().replace('ENTRYPOINT ["molecule-runtime"]\n', "")
    _materialise(tmp_path, dockerfile=df, config_yaml=_good_config_yaml(),
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_dockerfile()
    assert any("molecule-runtime" in e and ("ENTRYPOINT" in e or "entrypoint" in e)
               for e in validator.ERRORS)
# ───────────────────────────────────────────────────────── config.yaml drift
def test_missing_required_keys_errors(validator, tmp_path, monkeypatch):
    """A config without template_schema_version short-circuits with a
    SINGLE actionable error: listing 'also name and runtime are
    missing' would be noise on top of the real problem (no version
    means the validator can't pick a schema contract to enforce).
    Once the version is present, the v1 dispatch will list the other
    missing keys (next test pins that)."""
    cfg = "description: only description, no name/runtime/version\n"
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    missing_msgs = [e for e in validator.ERRORS if "missing required key" in e]
    # Exactly one error: the missing version. v1 dispatch is skipped
    # because we can't choose a contract without a version.
    assert len(missing_msgs) == 1, missing_msgs
    assert "template_schema_version" in missing_msgs[0]

def test_missing_required_keys_under_v1_dispatch_errors(validator, tmp_path, monkeypatch):
    """When `template_schema_version: 1` IS present but other required
    keys are missing, the v1 dispatch fires and lists them. Pins that
    the v1 contract still enforces name + runtime."""
    cfg = (
        "template_schema_version: 1\n"
        "description: only the version + description\n"
    )
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    missing_msgs = [e for e in validator.ERRORS if "missing required key" in e]
    keys = {e.split("`")[1] for e in missing_msgs}
    assert "name" in keys, missing_msgs
    assert "runtime" in keys, missing_msgs

def test_string_template_schema_version_errors(validator, tmp_path, monkeypatch):
    cfg = (
        "name: t\n"
        "runtime: claude-code\n"
        'template_schema_version: "1"\n'  # str, not int
    )
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    assert any("template_schema_version must be int" in e for e in validator.ERRORS)

def test_unknown_runtime_warns_not_errors(validator, tmp_path, monkeypatch):
    cfg = _good_config_yaml().replace("claude-code", "my-experimental-runtime")
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    assert any("not in known set" in w for w in validator.WARNINGS)
    assert validator.ERRORS == []  # custom runtimes are allowed

def test_unknown_top_level_keys_warn(validator, tmp_path, monkeypatch):
    cfg = _good_config_yaml() + "weird_drift_key: something\n"
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    assert any("unknown top-level keys" in w and "weird_drift_key" in w
               for w in validator.WARNINGS)
# ───────────────────────────────────────────────────────── requirements.txt
def test_missing_runtime_in_requirements_errors(validator, tmp_path, monkeypatch):
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=_good_config_yaml(),
                 requirements="fastapi\n")
    monkeypatch.chdir(tmp_path)
    validator.check_requirements()
    assert any("molecule-ai-workspace-runtime" in e for e in validator.ERRORS)

# ───────────────────────────────────────────────────────── adapter.py
def test_legacy_molecule_ai_import_warns(validator, tmp_path, monkeypatch):
    """Pre-#87 package was named differently. Catch any laggards."""
    adapter = "from molecule_ai.adapter_base import BaseAdapter\n"
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter()
    assert any("molecule_ai" in w for w in validator.WARNINGS)

def test_modern_molecule_runtime_import_does_not_warn(validator, tmp_path, monkeypatch):
    """Regression cover: the original validator's warning ('don't import
    molecule_runtime') was BACKWARDS — that's the canonical name now.
    Pin that the new validator does NOT emit a false positive."""
    adapter = "from molecule_runtime.adapter_base import BaseAdapter\n"
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter()
    legacy_warnings = [w for w in validator.WARNINGS if "molecule_ai" in w]
    assert legacy_warnings == [], legacy_warnings
# ──────────────────── adapter.py runtime-load (strong contract)
#
# These tests pin the contract that adapter.py must be importable AND
# define at least one BaseAdapter subclass — the same path the runtime
# uses at workspace boot. Skipped when molecule-ai-workspace-runtime
# isn't installed in the test environment (the validator's CI workflow
# guarantees it via `pip install -r requirements.txt` before invoking
# the validator; local pytest can run with or without it).
def _has_runtime_installed() -> bool:
    """True if molecule-ai-workspace-runtime is importable. Used to skip
    the runtime-load tests when running pytest locally without the
    runtime in the venv."""
    try:
        import molecule_runtime.adapters.base  # noqa: F401, PLC0415
        return True
    except ImportError:
        return False

_RUNTIME_AVAILABLE = _has_runtime_installed()
_skip_no_runtime = pytest.mark.skipif(
    not _RUNTIME_AVAILABLE,
    reason="molecule-ai-workspace-runtime not installed in test env",
)

def test_no_adapter_skips_runtime_load_silently(validator, tmp_path, monkeypatch):
    """No adapter.py = use default langgraph executor from the wheel.
    That's policy, not drift, so runtime-load check should not fire."""
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    # No ERRORS, no runtime-load WARNINGS specifically.
    runtime_load_warnings = [
        w for w in validator.WARNINGS if "runtime-load check" in w
    ]
    assert validator.ERRORS == [], validator.ERRORS
    assert runtime_load_warnings == [], runtime_load_warnings

def _good_adapter_py() -> str:
    """A fully concrete BaseAdapter subclass — overrides every
    abstract method BaseAdapter declares. Mirrors the shape of all 8
    production templates so tests of the runtime-load check exercise
    the same path the real templates do."""
    return (
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "\n"
        "class MyAdapter(BaseAdapter):\n"
        "    @staticmethod\n"
        "    def name(): return 'test-adapter'\n"
        "    @staticmethod\n"
        "    def display_name(): return 'Test'\n"
        "    @staticmethod\n"
        "    def description(): return 'fixture adapter'\n"
        "    def setup(self, config): pass\n"
        "    def create_executor(self, config): return None\n"
    )
@_skip_no_runtime
def test_valid_baseadapter_subclass_passes(validator, tmp_path, monkeypatch):
    """The happy path: adapter.py defines a fully concrete class
    inheriting from BaseAdapter. All 8 production templates match
    this shape."""
    _materialise(tmp_path, adapter_py=_good_adapter_py())
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert validator.ERRORS == [], validator.ERRORS

@_skip_no_runtime
def test_adapter_with_no_baseadapter_subclass_errors(validator, tmp_path, monkeypatch):
    """The most insidious silent-failure mode: adapter.py imports
    cleanly, defines classes, but NONE inherit from BaseAdapter. The
    runtime's class-discovery would silently skip this file and fall
    through to the default executor; the workspace would 'work' but
    with the wrong runtime. Must hard-error."""
    adapter = (
        "class JustSomePlainClass:\n"
        "    def run(self): pass\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any(
        "no concrete class inheriting from" in e and "BaseAdapter" in e
        for e in validator.ERRORS
    ), validator.ERRORS
@_skip_no_runtime
def test_abstract_intermediate_alone_does_not_count(validator, tmp_path, monkeypatch):
    """A locally-defined abstract subclass (e.g., a framework-level
    intermediate that templates extend) must not satisfy the contract
    on its own. The runtime needs a CONCRETE class to instantiate;
    accepting an abstract one would let workspace boot fail at
    instantiation time instead of validator time."""
    adapter = (
        "from abc import abstractmethod\n"
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "\n"
        "class FrameworkAdapter(BaseAdapter):\n"
        "    @abstractmethod\n"
        "    def my_abstract_method(self): ...\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any(
        "no concrete class inheriting from" in e
        for e in validator.ERRORS
    ), validator.ERRORS

@_skip_no_runtime
def test_abstract_plus_concrete_passes_with_concrete_only(validator, tmp_path, monkeypatch):
    """The legitimate factoring pattern: define an abstract framework-
    level intermediate, then a concrete leaf. Only the concrete leaf
    counts toward the "at least one" requirement; the framework
    intermediate is filtered out by `inspect.isabstract`."""
    adapter = (
        "from abc import abstractmethod\n"
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "\n"
        "class FrameworkAdapter(BaseAdapter):\n"
        "    @abstractmethod\n"
        "    def framework_specific_hook(self): ...\n"
        "\n"
        "class ConcreteAdapter(FrameworkAdapter):\n"
        "    def framework_specific_hook(self): pass\n"
        "    @staticmethod\n"
        "    def name(): return 'concrete'\n"
        "    @staticmethod\n"
        "    def display_name(): return 'Concrete'\n"
        "    @staticmethod\n"
        "    def description(): return 'leaf'\n"
        "    def setup(self, config): pass\n"
        "    def create_executor(self, config): return None\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert validator.ERRORS == [], validator.ERRORS
@_skip_no_runtime
def test_multiple_concrete_baseadapter_subclasses_errors(validator, tmp_path, monkeypatch):
    """Two concrete BaseAdapter subclasses in the same file is a
    silent ambiguity: the runtime's class-discovery picks one per
    its own resolution rules, so the WRONG class might be loaded
    after a future runtime refactor. Force the maintainer to either
    mark intermediates abstract or split into separate modules."""
    adapter = (
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "\n"
        "class FirstConcreteAdapter(BaseAdapter):\n"
        "    @staticmethod\n"
        "    def name(): return 'first'\n"
        "    @staticmethod\n"
        "    def display_name(): return 'First'\n"
        "    @staticmethod\n"
        "    def description(): return 'first'\n"
        "    def setup(self, config): pass\n"
        "    def create_executor(self, config): return None\n"
        "\n"
        "class SecondConcreteAdapter(BaseAdapter):\n"
        "    @staticmethod\n"
        "    def name(): return 'second'\n"
        "    @staticmethod\n"
        "    def display_name(): return 'Second'\n"
        "    @staticmethod\n"
        "    def description(): return 'second'\n"
        "    def setup(self, config): pass\n"
        "    def create_executor(self, config): return None\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    multi_errors = [
        e for e in validator.ERRORS
        if "multiple concrete BaseAdapter subclasses" in e
    ]
    assert len(multi_errors) == 1, validator.ERRORS
    # Both names should appear in the error so the operator knows
    # exactly which classes are competing.
    assert "FirstConcreteAdapter" in multi_errors[0]
    assert "SecondConcreteAdapter" in multi_errors[0]

@_skip_no_runtime
def test_aliased_concrete_class_is_deduplicated(validator, tmp_path, monkeypatch):
    """Production templates often do `Adapter = ConcreteAdapter` as a
    module-level alias for the runtime's class-discovery convention.
    `vars(mod)` returns BOTH bindings pointing at the same class
    object; without identity-based dedup, the multi-concrete-class
    error fires falsely (regression caught against the real langgraph
    template during the Q3 fix). Pin that aliased templates pass."""
    adapter = _good_adapter_py() + "\nAdapter = MyAdapter\n"
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert validator.ERRORS == [], validator.ERRORS
@_skip_no_runtime
def test_only_imported_baseadapter_subclass_does_not_count(validator, tmp_path, monkeypatch):
    """Re-exported imports do not satisfy the contract. If the only
    BaseAdapter subclass in adapter.py is something `from
    molecule_runtime.adapters.base import BaseAdapter` re-exports (or
    a future abstract intermediate), the runtime's class-discovery
    would correctly skip it and the validator must too. Without
    this check, an `__module__`-filter regression would mask the
    'no concrete subclass' case the gate exists to catch.
    """
    adapter = (
        # This file imports BaseAdapter but never SUBCLASSES it.
        # `BaseAdapter` itself is in vars(mod) but it's already
        # filtered by `obj is not BaseAdapter`. The new __module__
        # filter ensures no third-party class slipping in via import
        # is counted either.
        "from molecule_runtime.adapters.base import BaseAdapter  # noqa: F401\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any(
        "no concrete class inheriting from" in e
        for e in validator.ERRORS
    ), validator.ERRORS

@_skip_no_runtime
def test_adapter_with_syntax_error_errors(validator, tmp_path, monkeypatch):
    """SyntaxError at import is the same failure mode that crashes
    workspace boot. Catch it here."""
    adapter = "this is not valid python at all\n"
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any("failed to import" in e for e in validator.ERRORS), validator.ERRORS

@_skip_no_runtime
def test_adapter_with_import_error_errors(validator, tmp_path, monkeypatch):
    """ImportError during adapter.py exec — same failure mode as
    workspace boot. The error message should point the contributor at
    requirements.txt as the right fix."""
    adapter = (
        "import this_package_definitely_does_not_exist_0xdeadbeef\n"
        "from molecule_runtime.adapters.base import BaseAdapter\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    validator.check_adapter_runtime_load()
    assert any(
        "failed to import" in e and "ModuleNotFoundError" in e
        for e in validator.ERRORS
    ), validator.ERRORS
# ─────────────────────────────────────── schema-version dispatch
#
# Pin the contract that the validator routes to per-version checks
# based on `template_schema_version`, that unknown versions hard-fail,
# and that deprecated versions warn but pass.
def test_v1_is_in_known_schema_versions(validator):
    """Document the floor: v1 is always understood. Future bumps add
    versions; v1 stays accepted (or deprecated) but the validator
    never silently drops it."""
    assert 1 in validator.KNOWN_SCHEMA_VERSIONS or 1 in validator.DEPRECATED_SCHEMA_VERSIONS

def test_unknown_schema_version_errors(validator, tmp_path, monkeypatch):
    """A template declaring template_schema_version=999 must hard-fail;
    silently allowing it would let drift land disguised as a
    'future' version."""
    cfg = (
        "name: t\n"
        "runtime: claude-code\n"
        "template_schema_version: 999\n"
    )
    _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                 requirements=_good_requirements_txt())
    monkeypatch.chdir(tmp_path)
    validator.check_config_yaml()
    assert any("template_schema_version=999 is unknown" in e
               for e in validator.ERRORS), validator.ERRORS
def test_deprecated_schema_version_warns_but_passes(validator, tmp_path, monkeypatch):
    """During a deprecation window, v<N-1> templates still validate
    (so the consumer can keep merging unrelated PRs while migrating)
    but the warning surfaces the migration command."""
    # Inject a fake deprecated version for the duration of this test —
    # we don't have a real deprecated version yet (only v1 exists).
    validator.KNOWN_SCHEMA_VERSIONS.add(2)
    validator.DEPRECATED_SCHEMA_VERSIONS.add(1)
    validator.SCHEMA_CHECKS[2] = lambda config: None  # accept-all stub for v2
    try:
        cfg = (
            "name: t\n"
            "runtime: claude-code\n"
            "template_schema_version: 1\n"
        )
        _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                     requirements=_good_requirements_txt())
        monkeypatch.chdir(tmp_path)
        validator.check_config_yaml()
        # No errors — deprecation is warning-only.
        assert validator.ERRORS == [], validator.ERRORS
        assert any(
            "template_schema_version=1 is deprecated" in w
            and "migrate-template.py" in w
            for w in validator.WARNINGS
        ), validator.WARNINGS
    finally:
        validator.KNOWN_SCHEMA_VERSIONS.discard(2)
        validator.DEPRECATED_SCHEMA_VERSIONS.discard(1)
        validator.SCHEMA_CHECKS.pop(2, None)

def test_per_version_dispatch_calls_correct_check(validator, tmp_path, monkeypatch):
    """Pin that SCHEMA_CHECKS[N] is the function called when a template
    declares template_schema_version=N. Without this, the dispatch could
    fire the wrong contract on a multi-version codebase."""
    called: list[int] = []
    validator.KNOWN_SCHEMA_VERSIONS.add(7)
    validator.SCHEMA_CHECKS[7] = lambda config: called.append(7)
    try:
        cfg = (
            "name: t\n"
            "runtime: claude-code\n"
            "template_schema_version: 7\n"
        )
        _materialise(tmp_path, dockerfile=_good_dockerfile(), config_yaml=cfg,
                     requirements=_good_requirements_txt())
        monkeypatch.chdir(tmp_path)
        validator.check_config_yaml()
        assert called == [7], f"v7 dispatch was not invoked; called={called}"
    finally:
        validator.KNOWN_SCHEMA_VERSIONS.discard(7)
        validator.SCHEMA_CHECKS.pop(7, None)
def test_runtime_not_installed_warns_not_errors(validator, tmp_path, monkeypatch):
    """If the validator runs in an env without molecule-ai-workspace-runtime,
    we WARN (loud) but don't error — hard-erroring would say 'your adapter
    is broken' when the actual issue is the CI infra. Mock the import to
    simulate this regardless of what's installed locally."""
    adapter = (
        "from molecule_runtime.adapters.base import BaseAdapter\n"
        "class A(BaseAdapter): pass\n"
    )
    _materialise(tmp_path, adapter_py=adapter)
    monkeypatch.chdir(tmp_path)
    # Force the runtime import to fail by hiding the module.
    import sys
    saved = {k: sys.modules.pop(k) for k in list(sys.modules)
             if k.startswith("molecule_runtime")}
    saved_meta = sys.meta_path[:]

    class _Block:
        def find_spec(self, name, path=None, target=None):
            if name == "molecule_runtime" or name.startswith("molecule_runtime."):
                raise ImportError(f"blocked for test: {name}")
            return None

    sys.meta_path.insert(0, _Block())
    try:
        validator.check_adapter_runtime_load()
    finally:
        sys.meta_path[:] = saved_meta
        sys.modules.update(saved)
    assert validator.ERRORS == [], validator.ERRORS
    assert any(
        "skipping runtime-load check" in w
        for w in validator.WARNINGS
    ), validator.WARNINGS


@@ -1,47 +1,440 @@
#!/usr/bin/env python3
"""Prototype of the beefed-up validate-workspace-template.py.

Run from a template repo's root. Surfaces hard structural drift in
Dockerfile + config.yaml + requirements.txt against the canonical
contract. Replaces the existing soft-warnings-only validator at
molecule-ci/scripts/validate-workspace-template.py.
"""
import os, re, sys

import yaml

ERRORS: list[str] = []
WARNINGS: list[str] = []

def err(msg: str) -> None:
    ERRORS.append(msg)

def warn(msg: str) -> None:
    WARNINGS.append(msg)

# ───────────────────────────────────────────────────────────── Dockerfile
def check_dockerfile() -> None:
    if not os.path.isfile("Dockerfile"):
        warn("no Dockerfile — skipping container drift checks (library-only template?)")
        return
    df = open("Dockerfile").read()

    if not re.search(r"^FROM python:3\.11-slim\b", df, re.MULTILINE):
        err("Dockerfile: must base on `FROM python:3.11-slim` — see contract doc")
if not re.search(r"^ARG RUNTIME_VERSION", df, re.MULTILINE):
err(
"Dockerfile: missing `ARG RUNTIME_VERSION=`. "
"This arg invalidates the pip-install cache when the cascade "
"publishes a new wheel; without it, the cascade silently ships "
"the previous runtime (cache trap observed 2026-04-27, 5x in a row)."
)
if "molecule-ai-workspace-runtime" not in df and not (
os.path.isfile("requirements.txt")
and "molecule-ai-workspace-runtime" in open("requirements.txt").read()
):
err("Dockerfile + requirements.txt: must install `molecule-ai-workspace-runtime`")
if "${RUNTIME_VERSION}" not in df and "$RUNTIME_VERSION" not in df:
err(
"Dockerfile: must reference `${RUNTIME_VERSION}` in a pip install RUN block. "
'Pattern: `if [ -n "${RUNTIME_VERSION}" ]; then '
'pip install --no-cache-dir --upgrade "molecule-ai-workspace-runtime==${RUNTIME_VERSION}"; fi`'
)
if not re.search(r"useradd[^\n]*\bagent\b", df):
err(
"Dockerfile: must create the `agent` user "
"(`RUN useradd -u 1000 -m -s /bin/bash agent`). "
"Runtime drops to uid 1000; without it, claude-code refuses "
"`--dangerously-skip-permissions` for safety."
)
has_direct_entrypoint = bool(
re.search(r'(ENTRYPOINT|CMD)\s*\[?\s*"?molecule-runtime"?', df)
)
has_custom_entrypoint = bool(
re.search(r'ENTRYPOINT\s*\[?\s*"?(/?[\w./-]*entrypoint\.sh|/?[\w./-]*start\.sh)', df)
)
if not has_direct_entrypoint and not has_custom_entrypoint:
err(
"Dockerfile: must end at `molecule-runtime` "
"(`ENTRYPOINT [\"molecule-runtime\"]` or via custom "
"entrypoint.sh / start.sh that exec's molecule-runtime)"
)
if has_custom_entrypoint:
m = re.search(r'ENTRYPOINT\s*\[?\s*"?(/?[\w./-]+)', df)
if m:
ep_in_image = m.group(1).lstrip("/")
ep_local = os.path.basename(ep_in_image)
if os.path.isfile(ep_local):
if "molecule-runtime" not in open(ep_local).read():
err(
f"Dockerfile uses ENTRYPOINT [{ep_in_image}] but "
f"{ep_local} does not exec `molecule-runtime`"
)
else:
warn(
f"Dockerfile points ENTRYPOINT at {ep_in_image} but "
f"{ep_local} not found in repo root — verify it's COPYed in"
)
# ───────────────────────────────────────────────────────────── config.yaml
KNOWN_RUNTIMES = {
    "langgraph",
    "claude-code",
    "crewai",
    "autogen",
    "deepagents",
    "hermes",
    "gemini-cli",
    "openclaw",
}
# ──────────────────────────────────────────── schema versioning
#
# `template_schema_version: int` in each template's config.yaml selects
# which contract this validator enforces. Versions are FROZEN once
# shipped — never edit a SCHEMA_V* constant in place. To bump:
#
# 1. Add `SCHEMA_V<N+1>_REQUIRED_KEYS` / `SCHEMA_V<N+1>_OPTIONAL_KEYS`
# describing the new contract.
# 2. Add `_check_schema_v<N+1>(config)` that enforces it.
# 3. Add the entry to SCHEMA_CHECKS below.
# 4. Move version N from KNOWN_SCHEMA_VERSIONS to
# DEPRECATED_SCHEMA_VERSIONS so existing v<N> templates warn but
# still pass — buys a deprecation window.
# 5. Ship a corresponding migration in scripts/migrate-template.py's
# MIGRATIONS table (key = N, value = callable that produces the
# v<N+1> dict from a v<N> dict).
# 6. Run migrate-template.py on each consumer template repo as a PR.
# 7. After all consumers migrate, drop version N from
# DEPRECATED_SCHEMA_VERSIONS in a follow-up PR.
#
# This discipline means a schema version always has exactly one valid
# enforcement function, never "branch on minor variants" — the whole
# point of versioning is to avoid that drift.
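#
# Concretely, a hypothetical v1 → v2 bump (illustrative only; no v2 is
# planned here) would add to this file:
#
#     SCHEMA_V2_REQUIRED_KEYS = [...]
#     def _check_schema_v2(config: dict) -> None: ...
#     SCHEMA_CHECKS = {1: _check_schema_v1, 2: _check_schema_v2}
#
# and to scripts/migrate-template.py (helper name illustrative; the
# table contract is as described in step 5):
#
#     def _v1_to_v2(cfg: dict) -> dict:
#         out = dict(cfg)
#         out["template_schema_version"] = 2
#         return out
#
#     MIGRATIONS = {1: _v1_to_v2}  # key N: callable producing the v<N+1> dict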
KNOWN_SCHEMA_VERSIONS: set[int] = {1}
DEPRECATED_SCHEMA_VERSIONS: set[int] = set()
# `template_schema_version` is part of the v1 contract and listed
# here for documentation, but the top-level `check_config_yaml`
# already verifies it's present and is an int before dispatching
# here — `_check_schema_v1` does NOT re-check it (would be dead
# defensive code). The key DOES need to appear in the union of
# required + optional so it isn't flagged as unknown drift in the
# `unknown top-level keys` warning at the end of `_check_schema_v1`.
SCHEMA_V1_REQUIRED_KEYS = ["name", "runtime", "template_schema_version"]
SCHEMA_V1_OPTIONAL_KEYS = [
    "description",
    "version",
    "tier",
    "model",
    "models",
    "runtime_config",
    "env",
    "skills",
    "tools",
    "a2a",
    "delegation",
    "prompt_files",
    "bridge",
    "governance",
]


def _check_schema_v1(config: dict) -> None:
    """v1 contract — the keys frozen as of monorepo task #90's Phase 2.

    Currently every production template runs this version. Do NOT edit
    in place; add v2 instead and migrate consumers (see header)."""
    for key in SCHEMA_V1_REQUIRED_KEYS:
        if key == "template_schema_version":
            # Already verified present + int by the dispatcher; skip
            # to avoid emitting a duplicate or contradictory error.
            continue
        if key not in config:
            err(f"config.yaml: missing required key `{key}`")
    runtime = config.get("runtime")
    if runtime and runtime not in KNOWN_RUNTIMES:
        warn(
            f"config.yaml: runtime `{runtime}` not in known set "
            f"{sorted(KNOWN_RUNTIMES)} — OK for custom runtimes; "
            f"if canonical, add it to KNOWN_RUNTIMES in validate-workspace-template.py"
        )
    unknown = set(config.keys()) - set(SCHEMA_V1_REQUIRED_KEYS) - set(SCHEMA_V1_OPTIONAL_KEYS)
    if unknown:
        warn(
            f"config.yaml: unknown top-level keys {sorted(unknown)} "
            f"may be drift. If intentional, add them to SCHEMA_V1_OPTIONAL_KEYS."
        )

SCHEMA_CHECKS = {
    1: _check_schema_v1,
}


def check_config_yaml() -> dict | None:
    """Validate config.yaml; returns the parsed mapping on a clean run
    (so main() can print the summary line), else None."""
    if not os.path.isfile("config.yaml"):
        err("config.yaml: missing at repo root")
        return None
    with open("config.yaml") as f:
        try:
            config = yaml.safe_load(f)
        except yaml.YAMLError as e:
            err(f"config.yaml: invalid YAML — {e}")
            return None
    if not isinstance(config, dict):
        err(f"config.yaml: root must be a mapping, got {type(config).__name__}")
        return None
    # Schema-version dispatch. Validate the version field shape first
    # so error messages are actionable.
    sv = config.get("template_schema_version")
    if sv is None:
        err("config.yaml: missing required key `template_schema_version`")
        # Can't dispatch without a version. Don't fall through to v1
        # checks — that would mask the missing-version error.
        return None
    if not isinstance(sv, int):
        err(
            f"config.yaml: template_schema_version must be int, "
            f"got {type(sv).__name__}={sv!r}"
        )
        return None
    if sv in DEPRECATED_SCHEMA_VERSIONS:
        latest = max(KNOWN_SCHEMA_VERSIONS)
        warn(
            f"config.yaml: template_schema_version={sv} is deprecated; "
            f"migrate to v{latest} via "
            f"`python3 scripts/migrate-template.py --to {latest} .`. "
            f"Support for v{sv} will be removed in a future cycle."
        )
    elif sv not in KNOWN_SCHEMA_VERSIONS:
        valid = sorted(KNOWN_SCHEMA_VERSIONS | DEPRECATED_SCHEMA_VERSIONS)
        err(
            f"config.yaml: template_schema_version={sv} is unknown — "
            f"this validator understands {valid}. Either bump the "
            f"validator (add a SCHEMA_V{sv} block) or correct the version."
        )
        return None
    SCHEMA_CHECKS[sv](config)
    return config
# ───────────────────────────────────────────────────────────── requirements.txt
def check_requirements() -> None:
    if not os.path.isfile("requirements.txt"):
        warn("no requirements.txt — Dockerfile must install runtime by other means")
        return
    reqs = open("requirements.txt").read()
    if "molecule-ai-workspace-runtime" not in reqs:
        err("requirements.txt: must declare `molecule-ai-workspace-runtime` as a dependency")
# ───────────────────────────────────────────────────────────── adapter.py
def check_adapter() -> None:
    """Static-text adapter checks. Fast — no imports."""
    if not os.path.isfile("adapter.py"):
        warn("no adapter.py — runtime will use the default langgraph executor from the wheel")
        return
    content = open("adapter.py").read()
    # The original validator's warning ("don't import molecule_runtime") was
    # backwards — that's the canonical package name. The previous check shipped
    # for ~2 weeks producing false-positive warnings. Removed.
    if re.search(r"\bfrom molecule_ai\b|\bimport molecule_ai\b", content):
        warn(
            "adapter.py imports `molecule_ai` — that's a pre-#87 package name; "
            "use `molecule_runtime`"
        )


def check_adapter_runtime_load() -> None:
    """Strong adapter contract: import adapter.py the same way the runtime
    does at workspace boot, and assert at least one class in it inherits
    from molecule_runtime.adapters.base.BaseAdapter.

    The Docker build smoke test in validate-workspace-template.yml builds
    the image but doesn't RUN it — adapter.py is only imported at
    container startup. So a template with a syntactically-valid Dockerfile
    + a broken adapter.py (wrong base class, ImportError on a missing
    framework dep, typo) builds clean and fails on first user prompt.
    This check exercises the same class-resolution path the runtime uses,
    so a passing validator means a passing workspace boot for the
    adapter-load step.

    Skip conditions:
      - No adapter.py exists. Templates without one inherit the default
        langgraph executor from the wheel (intentional, not drift).
      - molecule-ai-workspace-runtime not importable in the validator
        environment. That's a CI-config bug — the workflow that runs
        this validator must `pip install molecule-ai-workspace-runtime`
        first. Warn loudly so the misconfiguration surfaces, but don't
        hard-fail (we'd be saying "your adapter is broken" when the
        actual cause is missing infra). The `pip install -r
        requirements.txt` step in validate-workspace-template.yml
        normally satisfies this transitively.

    Hard-error conditions:
      - adapter.py raises any exception during import. The same
        exception would crash workspace boot.
      - No class in the module inherits from BaseAdapter. The runtime's
        adapter-discovery would silently fall through to the default
        executor, ignoring this file. That is exactly the kind of
        human-error mode this contract is supposed to eliminate.
    """
if not os.path.isfile("adapter.py"):
return # check_adapter() already warned; don't double-warn
try:
from molecule_runtime.adapters.base import BaseAdapter # noqa: PLC0415
except ImportError:
warn(
"adapter.py: skipping runtime-load check — "
"`molecule-ai-workspace-runtime` not installed in the validator "
"environment. The CI workflow that invokes this script must "
"`pip install molecule-ai-workspace-runtime` (or `pip install "
"-r requirements.txt`) first; otherwise this critical check is "
"silently bypassed."
)
return
# Load adapter.py as a module under a per-call-unique name so it
# doesn't collide with any installed `adapter` package OR with a
# previous invocation in the same Python process. The id() of the
# cwd-anchored absolute path is sufficient — we just need
# different invocations to land on different sys.modules keys so
# one invocation's lingering references can't bleed into the
# next's adapter discovery.
import importlib.util # noqa: PLC0415
import sys # noqa: PLC0415
abs_path = os.path.abspath("adapter.py")
module_name = f"_template_adapter_under_validation_{abs(hash(abs_path)):x}"
spec = importlib.util.spec_from_file_location(module_name, "adapter.py")
if spec is None or spec.loader is None:
err("adapter.py: cannot construct an import spec — file may be unreadable")
return
mod = importlib.util.module_from_spec(spec)
sys.modules[module_name] = mod # required so dataclass / pydantic refs resolve
try:
spec.loader.exec_module(mod)
except Exception as e:
err(
f"adapter.py: failed to import — `{type(e).__name__}: {e}`. "
f"This is the same failure mode that crashes workspace boot at "
f"runtime; the cure is to fix the adapter, not skip this check. "
f"If the import fails because a transitive dep isn't installed in "
f"this CI env, add it to the template's requirements.txt — that's "
f"what the workspace container does, and the validator job "
f"installs requirements.txt before running this check."
)
sys.modules.pop(module_name, None)
return
    # Class discovery: only count CONCRETE classes DEFINED in
    # adapter.py, not re-exported imports and not abstract
    # intermediates. Three filter axes:
    #
    #   1. `__module__ == module_name` — defined HERE, not imported
    #      from molecule_runtime or a third-party framework.
    #   2. `obj is not BaseAdapter` — BaseAdapter itself doesn't count.
    #   3. `not inspect.isabstract(obj)` — abstract intermediates
    #      defined locally don't count. Catches the
    #      `class Framework(BaseAdapter): pass` + `class Concrete(Framework):`
    #      pattern where vars(mod) has BOTH and we'd otherwise count
    #      both as "real" adapters.
    import inspect  # noqa: PLC0415

    # Deduplicate by class identity. Many production adapters do
    # `Adapter = ConcreteAdapter` as a module-level alias for the
    # runtime's discovery — `vars(mod)` returns both bindings
    # (`Adapter` AND `ConcreteAdapter`) pointing at the same class
    # object. Without dedup, the multiple-concrete-subclasses
    # error fires falsely on every aliased template.
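    # The aliasing shape this dedup tolerates, for reference (names are
    # illustrative):
    #
    #     class LangGraphAdapter(BaseAdapter): ...
    #     Adapter = LangGraphAdapter  # second binding, same class object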
    adapter_classes = list({
        id(obj): obj
        for obj in vars(mod).values()
        if isinstance(obj, type)
        and obj is not BaseAdapter
        and issubclass(obj, BaseAdapter)
        and getattr(obj, "__module__", None) == module_name
        and not inspect.isabstract(obj)
    }.values())
    sys.modules.pop(module_name, None)
    if not adapter_classes:
        err(
            "adapter.py: no concrete class inheriting from "
            "`molecule_runtime.adapters.base.BaseAdapter` defined "
            "in this file. The runtime resolves the adapter via "
            "class discovery on adapter.py's own definitions — "
            "imports of base classes from molecule_runtime do not "
            "count, and abstract intermediates do not count. "
            "Without a concrete subclass DEFINED here, workspace "
            "boot falls through to the default langgraph executor "
            "and ignores this file silently. If that's intentional, "
            "delete adapter.py."
        )
        return
    if len(adapter_classes) > 1:
        names = sorted(c.__name__ for c in adapter_classes)
        err(
            f"adapter.py: multiple concrete BaseAdapter subclasses "
            f"defined: {names}. The runtime's class-discovery picks "
            f"one per its own resolution rules (typically last-defined "
            f"or first-by-iteration), so shipping more than one is a "
            f"silent ambiguity — the wrong class might be loaded after "
            f"a future runtime refactor. Either keep exactly one "
            f"concrete subclass + mark the others abstract via "
            f"`abc.ABC` / abstract methods, or move them to separate "
            f"importable modules."
        )
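    # The abstract-intermediate fix that error suggests, sketched with
    # illustrative names (assumes BaseAdapter's metaclass is compatible
    # with abc.ABCMeta). inspect.isabstract() is True for Framework while
    # its abstract method is unimplemented, so only Concrete is counted:
    #
    #     import abc
    #
    #     class Framework(BaseAdapter, abc.ABC):
    #         @abc.abstractmethod
    #         def _build(self): ...
    #
    #     class Concrete(Framework):
    #         def _build(self): ...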


def main() -> None:
    # --static-only skips check_adapter_runtime_load(), which calls
    # importlib's exec_module() on the template's adapter.py. That's
    # untrusted code execution — fine on internal PRs and post-merge,
    # unsafe on external fork PRs (#135). Static checks (file presence,
    # YAML parse, regex/AST inspection) stay enabled in static mode.
    static_only = "--static-only" in sys.argv
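    # Wiring sketch for the fork-safe job (the step shape is an
    # assumption, not a quote of the real workflow yml):
    #
    #     - name: Validate template (static checks only)
    #       run: python3 validate-workspace-template.py --static-only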
    check_dockerfile()
    config = check_config_yaml()
    check_requirements()
    check_adapter()
    if not static_only:
        check_adapter_runtime_load()
    else:
        print("::notice::skipping adapter.py import check (--static-only mode)")
    for w in WARNINGS:
        print(f"::warning::{w}")
    for e in ERRORS:
        print(f"::error::{e}")
    if ERRORS:
        # Exit only after flushing every error; exiting inside the print
        # loop would surface just the first failure per run.
        sys.exit(1)
    suffix = " [static-only]" if static_only else ""
    print(f"✓ Template validation passed ({len(WARNINGS)} warning(s)){suffix}")
    if config:
        print(f"✓ config.yaml valid: {config['name']} (runtime: {config.get('runtime')})")


if __name__ == "__main__":
    main()