Commit Graph

26 Commits

Author SHA1 Message Date
Hongming Wang
d5caaac219
feat(reusable): disable auto-merge when a new commit is pushed (#10)
Reusable workflow that consumers call from their pr-guards.yml on
pull_request:synchronize. When a new commit is pushed to an open PR
that has auto-merge enabled, this disables auto-merge and posts a
comment so the operator must explicitly re-engage after verifying.

Background: on 2026-04-27, PR #2174 in molecule-core auto-merged
with only the first commit because the second commit was pushed
AFTER the merge queue had locked the PR's SHA. The second commit
ended up orphaned on a merged-and-deleted branch (the wider
"automatically delete head branches" repo setting now blocks the
push entirely; this workflow catches the race window where the PR
is queued but not yet merged).

Defense in depth — if both fixes are active:
1. Repo setting "delete branch on merge" prevents pushes to a
   merged branch (post-merge orphan case).
2. This workflow catches in-queue races (push lands while the
   queue is processing) by force-disabling auto-merge so the
   operator must re-engage explicitly.

Together they cover the full lifecycle of "auto-merge enabled →
new commits arrive" without relying on operator discipline.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:38:57 -07:00
Hongming Wang
0335ec169c
feat(lint): read runtime module list from wheel manifest, not inline (#9)
Switches the bare-imports lint from an inline RUNTIME_MODULES list
to the _runtime_modules.json manifest emitted by molecule-core's
build_runtime_package.py. Eliminates the third place the runtime
module list lived — now the build script is the single source of
truth.

Tonight surfaced that the same closed list lived in three places
that drifted independently. The build script's TOP_LEVEL_MODULES
went stale on transcript_auth, the smoke-test step here had a
hardcoded mirror that would have drifted next time a top-level
module was added, and runtime-pin-compat tested transitively via
import molecule_runtime.main (which only catches breakage, not
drift). One source of truth fixes all three at once.

Implementation:
- pip download molecule-ai-workspace-runtime --no-deps to /tmp
- unzip _runtime_modules.json from the wheel
- merge top_level_modules + subpackages into the regex alternation
  (subpackages can be bare-imported too — `from lib.pre_stop`)
- on any fetch failure (network, missing manifest in older wheel),
  fall back to the inline list with a workflow warning so the lint
  still runs but the operator knows to investigate

Two consequences:
- Templates rebuilt against runtime ≥ the version that ships the
  manifest get the always-fresh list automatically.
- Templates rebuilt against the old wheel (pre-manifest) still get
  the working inline list — no regression.

Future cleanup (separate PR after a few release cycles): once all
template repos have rebuilt at least once with the manifest path,
the inline fallback can shrink to a panic message.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 06:19:23 -07:00
Hongming Wang
7315460ff5
feat(publish-template-image): bare-import lint + import-every-app-py smoke (#8)
Two new gates that would have prevented today's
post-#87 template-extraction bug parade:

1. **Bare-import lint** — fail-fast pre-build check that grep's
   template *.py files for `from <runtime_module> import` (where
   <runtime_module> is in the closed list mirroring workspace/*.py
   basenames). When the runtime was bundled into workspace/, bare
   imports resolved against sibling files; in standalone template
   repos they explode at startup. Five separate templates shipped
   broken on 2026-04-27 because of this exact pattern (claude-code:
   plugins, executor_helpers, heartbeat, a2a_client, platform_auth;
   langgraph: agent, a2a_executor; deepagents: a2a_executor;
   gemini-cli: config, executor_helpers x2). The lint runs before
   docker login + buildx setup so a bad PR returns red in seconds.

2. **Import every /app/*.py at boot** (deeper smoke) — replaces
   `python -c "import adapter"` with a loop importing every Python
   module at /app/. The old single-import didn't traverse to
   sibling modules adapter.py imports lazily inside
   `create_executor()` (the executor.py family). That's why the
   hermes a2a-sdk migration bug and langgraph's bare a2a_executor
   import slipped through every prior gate even though the boot
   smoke "passed." Importing every module module-level forces all
   imports to resolve, including those in executor.py.

Both gates use the closed-list pattern (deliberate, easy to update,
no false-positives on legit third-party imports). The runtime module
list mirrors the equivalent in scripts/build_runtime_package.py;
both should be updated together when a new top-level workspace
module ships.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 05:33:10 -07:00
Hongming Wang
b6f43a1145
feat(publish-template-image): boot image and import adapter.py before pushing :latest (#7)
Today's incident: a template's adapter.py imported a symbol
(RuntimeCapabilities) from molecule_runtime that the published runtime
didn't yet export. The image built fine, the existing "smoke test"
inspected the entrypoint string and passed, and a broken :latest
shipped to GHCR. Every claude-code + hermes provision then hung in
"provisioning" status until the 10-min sweep marked them failed.

The old smoke test was named correctly but didn't actually exercise
anything — `docker inspect` doesn't catch ImportError. This change
splits the build/push step into three:

1. Build with `load: true, push: false` so the image lands on the
   runner's local docker.
2. Smoke test runs `docker run ... python -c "import adapter"` against
   the loaded image. This catches the version-skew class of bug
   (adapter.py imports a symbol the installed runtime doesn't export),
   plus syntax errors, missing files, and anything else that breaks
   import-time.
3. Push :latest + :sha-* only if the smoke test passes. The push step
   reuses the cached build, so it's fast.

Net cost: ~5s per publish (the docker run). Net benefit: broken images
can no longer poison :latest.

All 8 caller templates (claude-code, gemini-cli, hermes, langgraph,
crewai, autogen, deepagents, openclaw) inherit the gate automatically
since this is the reusable workflow they all call.

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-27 02:12:13 -07:00
Hongming Wang
d4e5540ed1 fix(ci): don't fetch into checked-out staging 2026-04-24 08:10:10 -07:00
Hongming Wang
3e9d1eb838 fix(ci): relax auto-promote — no-gates mode + already-ahead no-op 2026-04-24 08:06:23 -07:00
Hongming Wang
b415d91330
Merge pull request #6 from Molecule-AI/chore/add-auto-promote-staging
chore(ci): add auto-promote-staging workflow
2026-04-24 07:45:10 -07:00
Hongming Wang
47fcac2501 chore(ci): add auto-promote-staging workflow 2026-04-24 07:44:36 -07:00
Hongming Wang
d57b07dc2b
Merge pull request #5 from Molecule-AI/perf/publish-template-ubuntu-latest
perf: publish-template-image runs on ubuntu-latest (public caller repos)
2026-04-22 12:50:15 -07:00
Hongming Wang
09f0fbf9b0 perf(publish-template-image): move to ubuntu-latest for public-repo callers
All 8 template repos are public → GHA-hosted minutes are free, so
there's no cost incentive to stay on the self-hosted Mac mini. The
only reason we started there was to avoid GHA rate limits (memory
feedback_selfhosted_runner); that concern doesn't apply here because:

- Linux/amd64 builds go native on ubuntu-latest (no QEMU emulation
  from arm64 → amd64), so builds run ~2-3x faster.
- docker/login-action@v3 + GITHUB_TOKEN handles GHCR auth cleanly,
  no Keychain gymnastics needed.
- No queue wait when the Mac mini is busy publishing canvas/platform
  or running e2e.

Concretely this change:
- runs-on: [self-hosted, macos, arm64] → ubuntu-latest
- Drops the hand-rolled `auths` config step (macOS Keychain
  workaround) in favour of `docker/login-action@v3`.
- Drops `docker/setup-qemu-action` (unnecessary for a linux/amd64
  target on an amd64 runner).
- Uses setup-buildx@v3 to match the login-action major version.

Self-hosted Mac mini remains the runner for private-repo workflows
(follow-up PRs will migrate other public-repo workflows in
molecule-core).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:49:02 -07:00
Hongming Wang
b9f98a052f
Merge pull request #4 from Molecule-AI/feat/publish-template-image-workflow
feat(ci): reusable publish-template-image workflow
2026-04-22 12:10:59 -07:00
Hongming Wang
61bebd95cd feat(ci): reusable publish-template-image workflow for all template repos
Each Molecule-AI/molecule-ai-workspace-template-* repo currently has
no way to publish its Docker image. Tenants build locally via
workspace/rebuild-runtime-images.sh after a manual clone — which
means "merge template PR" and "template live on tenants" are two
separate manual steps per tenant.

This workflow is the `publish` half of that pipeline. Called from
each template repo via `uses: Molecule-AI/molecule-ci/.github/
workflows/publish-template-image.yml@main`, it:

- Derives runtime name from the caller repo (strip
  `molecule-ai-workspace-template-` prefix) so per-repo wrappers
  stay one-line.
- Builds linux/amd64 (self-hosted macOS arm64 runner + QEMU) and
  pushes to `ghcr.io/molecule-ai/workspace-template-<runtime>:latest`
  plus `:sha-<7>` for per-commit pinning.
- Uses the Keychain-avoiding GHCR auth pattern from canvas' publish
  workflow — osxkeychain write fails under the locked launchd keychain
  on the Mac mini runner; writing auths map directly works.
- Smoke-tests the pushed image by pulling and inspecting entrypoint.

Follow-up (not in this PR):
- Each template repo gets a ~10-line caller workflow.
- Monorepo provisioner.RuntimeImages map switches from bare
  `workspace-template:<runtime>` (local-only) to
  `ghcr.io/molecule-ai/workspace-template-<runtime>:latest`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:10:37 -07:00
e538690198 fix(CI): embed Python secrets scanner inline in workflow, drop nested checkout 2026-04-21 11:13:28 +00:00
d67dd489b5 fix(CI): default molecule-ci checkout to main branch (not calling repo sha) 2026-04-21 11:11:13 +00:00
3c76e0b3b9 fix(CI): fetch molecule-ci scripts before running, add second checkout step 2026-04-21 11:10:41 +00:00
487056d7dd ci: add .molecule-ci/scripts/ directory with all CI scripts 2026-04-21 11:09:16 +00:00
molecule-ai[bot]
c9344eabeb
Merge pull request #3 from Molecule-AI/fix/secrets-check-python
fix(CI): replace grep secrets check with Python scanner
2026-04-21 11:06:30 +00:00
b96821c885 fix(CI): replace grep secrets check with Python scanner
The grep-based secrets check matched literal credential patterns in
documentation (e.g., "sk-ant-..." in CLAUDE.md examples), causing
false-positive CI failures.

Replace with a Python script that:
- Skips .molecule-ci/ directory entirely
- Uses context-aware matching (requires quotes or assignment context)
- Filters out documentation examples with "..." or <example> markers
- Handles all three reusable workflows (plugin, workspace-template, org-template)
2026-04-21 11:04:51 +00:00
molecule-ai[bot]
425e94d516
Merge pull request #2 from Molecule-AI/ci-improvements
ci: streamline workflows, add timeouts, and cache pip
2026-04-20 23:13:37 +00:00
Molecule AI Platform Engineer
a89b14a76c ci: streamline workflows, add timeouts, and cache pip
- Remove redundant nested checkout of molecule-ci in workflow_call jobs
- Add timeout-minutes to prevent hung jobs (plugin: 10m, workspace: 15m)
- Add pip cache using requirements.txt
- Add missing SKILL.md heading check in validate-plugin
- Add legacy import and runtime dependency warnings in workspace validation

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-20 04:54:35 +00:00
Hongming Wang
e2ffbae82b
Merge pull request #1 from Molecule-AI/chore/credentials-gitignore
chore: gitignore credentials
2026-04-16 09:24:48 -07:00
rabbitblood
90ce5aa8ee chore: gitignore credentials for molecule-ci
Adds standard credential gitignore (.env / *.pem / .secrets/ / .auth_token).
Per-CEO directive 2026-04-16: every plugin and template repo should
gitignore credentials so self-hosters can't accidentally commit real
tokens to public repos.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-16 09:18:35 -07:00
Hongming Wang
33202cb334 fix: skip non-dict entries from !include in org validator 2026-04-16 04:52:45 -07:00
Hongming Wang
647fa1f774 fix: handle !include YAML tags in org template validator 2026-04-16 04:51:34 -07:00
Hongming Wang
6533e6eeac fix: use standalone Python scripts instead of heredocs in workflows
Heredocs in GitHub Actions YAML were being echoed as script text
instead of executed. Moving validation logic to scripts/ and running
via 'python3 .molecule-ci/scripts/validate-*.py' after checking out
the molecule-ci repo at .molecule-ci/ path.
2026-04-16 04:49:28 -07:00
Hongming Wang
f035b6e108 feat: reusable CI workflows for plugin, workspace template, and org template validation
Three reusable GitHub Actions workflows:
- validate-plugin.yml: plugin.yaml schema, content check, secrets scan
- validate-workspace-template.yml: config.yaml, adapter, Dockerfile build, secrets
- validate-org-template.yml: org.yaml hierarchy, files_dir references, secrets

Usage: `uses: Molecule-AI/molecule-ci/.github/workflows/validate-*.yml@main`
2026-04-16 04:42:16 -07:00