Supply-chain hardening for the CI pipeline. 23 workflow files
modified, 59 mutable-tag refs replaced with commit SHAs.
The risk
Every `uses:` reference in .github/workflows/*.yml pointed at a
mutable tag (e.g., `actions/checkout@v4`). A maintainer of an
action, or an attacker holding a compromised maintainer account, can
repoint that tag to malicious code, and our pipelines silently pull
it on the next run. The tj-actions/changed-files compromise of March
2025 is the canonical example: using a compromised maintainer token,
the attacker repointed the action's version tags to a payload that
dumped CI secrets into workflow logs. Repos pinned to a known-good
SHA were unaffected.
The fix
Replace each `@v<N>` with `@<commit-sha> # v<N>`. The trailing
comment preserves human readability ("ah, this is v4"); the SHA
makes the reference immutable.
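The rewrite rule above can be sketched in a few lines of Python. This is an illustrative sketch, not the tooling used for this PR: `pin_line` and the `sha_for` mapping are hypothetical names, and the mapping is assumed to hold SHAs that a human has already resolved and reviewed (the function never contacts GitHub). It drops any trailing comment already on a rewritten line.

```python
import re

# Matches `uses: owner/repo[/path]@ref`, optionally preceded by a YAML list dash.
USES_RE = re.compile(
    r"^(?P<prefix>\s*(?:-\s+)?uses:\s*)(?P<action>[^@\s#]+)@(?P<ref>[^\s#]+)"
)
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")


def pin_line(line: str, sha_for: dict[str, str]) -> str:
    """Rewrite one workflow line from `action@tag` to `action@<sha> # tag`.

    `sha_for` maps "action@tag" to a reviewed 40-character commit SHA.
    Lines that are not `uses:` references, are already SHA-pinned, or
    have no entry in `sha_for` are returned unchanged.
    """
    match = USES_RE.match(line)
    if match is None or FULL_SHA_RE.match(match["ref"]):
        return line
    sha = sha_for.get(f"{match['action']}@{match['ref']}")
    if sha is None:
        return line  # unknown tag: leave it for a human to resolve
    return f"{match['prefix']}{match['action']}@{sha} # {match['ref']}"


def pin_workflow(text: str, sha_for: dict[str, str]) -> str:
    """Apply pin_line to every line of a workflow file's text."""
    return "\n".join(pin_line(line, sha_for) for line in text.splitlines())
```

Leaving unmapped tags untouched is deliberate in this sketch: a missing entry should surface in review rather than be guessed.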
Actions covered (13 distinct repositories; github/codeql-action counted once):
actions/{checkout,setup-go,setup-python,setup-node,upload-artifact,github-script}
docker/{login-action,setup-buildx-action,build-push-action}
github/codeql-action/{init,autobuild,analyze}
dorny/paths-filter
imjasonh/setup-crane
pnpm/action-setup (already pinned in molecule-app, listed here for completeness)
Excluded:
Molecule-AI/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@main
is an internal org reusable workflow. We control its repo, so the
threat model differs from that of third-party actions, and internal
reusables conventionally track @main rather than a SHA.
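A check along the following lines could keep new mutable refs from creeping back in while honoring the exclusion above. It is a minimal sketch under stated assumptions: `find_mutable_refs` and `TRUSTED_OWNERS` are hypothetical names, and matching `uses:` lines with a regex (rather than parsing the YAML) is a simplification.

```python
import re

# Matches `uses: owner/repo[/path]@ref`, optionally preceded by a YAML list dash.
USES_RE = re.compile(r"^\s*(?:-\s+)?uses:\s*(?P<action>[^@\s#]+)@(?P<ref>[^\s#]+)")
FULL_SHA_RE = re.compile(r"^[0-9a-f]{40}$")

# Assumption: owner prefixes whose refs may stay on a branch (internal reusables).
TRUSTED_OWNERS = ("Molecule-AI/",)


def find_mutable_refs(workflow_text, trusted_owners=TRUSTED_OWNERS):
    """Return (line_number, reference) for each ref not pinned to a full SHA.

    References whose action path starts with a trusted owner prefix are
    skipped, since their threat model differs from third-party actions.
    """
    findings = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        match = USES_RE.match(line)
        if match is None:
            continue
        action, ref = match["action"], match["ref"]
        if action.startswith(tuple(trusted_owners)):
            continue
        if not FULL_SHA_RE.match(ref):
            findings.append((number, f"{action}@{ref}"))
    return findings
```

Run against each file under .github/workflows/, a non-empty result would be a reason to fail the check.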
The maintenance cost
SHA pinning means upstream fixes require manual SHA bumps. Without
automation, pinned SHAs go stale. So this PR also enables Dependabot
across four ecosystems:
- github-actions (workflows)
- gomod (workspace-server)
- npm (canvas)
- pip (workspace runtime requirements)
Weekly cadence. A tag-repoint attack plays out in the minutes between
repoint and pull, so no update cadence helps with zero-days; SHA
pinning is the defense there. The point of the weekly bumps is to
pull in routine upstream fixes without operator effort.
Aligns with user-stated principle: "long-term, robust, fully-
automated, eliminate human error."
Companion PR: Molecule-AI/molecule-controlplane#308 (same pattern,
smaller surface).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
92 lines
4.2 KiB
YAML
name: Runtime Pin Compatibility

# CI gate that prevents the 5-hour staging outage from 2026-04-24 from
# recurring (controlplane#253). The original failure mode:
#   1. molecule-ai-workspace-runtime 0.1.13 declared `a2a-sdk<1.0` in its
#      requires_dist metadata (incorrect — it actually imports
#      a2a.server.routes which only exists in a2a-sdk 1.0+)
#   2. `pip install molecule-ai-workspace-runtime` resolved cleanly
#   3. `from molecule_runtime.main import main_sync` raised ImportError
#   4. Every tenant workspace crashed; the canary tenant caught it but
#      only after 5 hours of degraded staging
#
# This workflow installs the CURRENTLY PUBLISHED runtime from PyPI on
# top of `workspace/requirements.txt` and smoke-imports. Catches:
#   - Upstream PyPI yanks
#   - Bad re-releases of molecule-ai-workspace-runtime
#   - Already-shipped wheels that stop importing because a transitive
#     dep moved underneath
#
# This is the "PyPI artifact health" half of pin compatibility. The
# companion workflow `runtime-prbuild-compat.yml` covers the
# "PR-introduced breakage" half by building the wheel from THIS PR's
# workspace/ source. Splitting the two means each gets a narrow
# `paths:` filter — the pypi-latest job no longer fires on doc-only
# workspace/ edits whose content can't change what's currently on PyPI.

on:
  push:
    branches: [main, staging]
    paths:
      # Narrow filter: pypi-latest is sensitive only to changes that
      # affect what we're INSTALLING (requirements.txt) or WHAT THE
      # CHECK ITSELF DOES (this workflow file). Edits to workspace/
      # source code don't change what's on PyPI right now, so they
      # don't change this gate's verdict.
      - 'workspace/requirements.txt'
      - '.github/workflows/runtime-pin-compat.yml'
  pull_request:
    branches: [main, staging]
    paths:
      - 'workspace/requirements.txt'
      - '.github/workflows/runtime-pin-compat.yml'
  # Daily catch for upstream PyPI publishes that break the pin combo
  # without any change in our repo (e.g. someone re-yanks an a2a-sdk
  # release or molecule-ai-workspace-runtime publishes a bad bump).
  schedule:
    - cron: '0 13 * * *' # 06:00 PT
  workflow_dispatch:
  # Required-check support: when this becomes a branch-protection gate,
  # merge_group runs let the queue green-check this in addition to PRs.
  merge_group:
    types: [checks_requested]

concurrency:
  group: ${{ github.workflow }}-${{ github.ref }}
  cancel-in-progress: true

jobs:
  pypi-latest-install:
    name: PyPI-latest install + import smoke
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
      - uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
        with:
          python-version: '3.11'
          cache: pip
          cache-dependency-path: workspace/requirements.txt
      - name: Install runtime + workspace requirements
        # Install order is load-bearing: install the runtime FIRST so pip
        # honors whatever a2a-sdk constraint the runtime metadata declares
        # (this is the surface that broke in 2026-04-24 — runtime declared
        # `a2a-sdk<1.0` but actually needed >=1.0). The follow-up install
        # of workspace/requirements.txt then upgrades a2a-sdk to the
        # constraint our runtime image actually pins. The import smoke
        # below verifies the upgraded combination is consistent.
        run: |
          python -m venv /tmp/venv
          /tmp/venv/bin/pip install --upgrade pip
          /tmp/venv/bin/pip install molecule-ai-workspace-runtime
          /tmp/venv/bin/pip install -r workspace/requirements.txt
          /tmp/venv/bin/pip show molecule-ai-workspace-runtime a2a-sdk \
            | grep -E '^(Name|Version):'
      - name: Smoke import — fail if metadata declares deps that don't satisfy real imports
        # WORKSPACE_ID is validated at import time by platform_auth.py — EC2
        # user-data sets it from the cloud-init template; set a placeholder
        # here so the import smoke doesn't trip on the env-var guard.
        env:
          WORKSPACE_ID: 00000000-0000-0000-0000-000000000001
        run: |
          /tmp/venv/bin/python -c "from molecule_runtime.main import main_sync; print('runtime imports OK')"
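The smoke step above boils down to two questions: what does the installed distribution's metadata declare, and does the entry-point module actually import? The Python sketch below separates the two so a failure can be reported next to the declared constraints. It is illustrative only and not part of the workflow; `declared_requirements` and `smoke_import` are hypothetical helper names, and only standard-library calls are used.

```python
import importlib
from importlib import metadata


def declared_requirements(dist_name: str) -> list[str]:
    """Return the Requires-Dist strings an installed distribution declares.

    Returns an empty list when the distribution is not installed or
    declares no requirements.
    """
    try:
        return list(metadata.requires(dist_name) or [])
    except metadata.PackageNotFoundError:
        return []


def smoke_import(module_name: str) -> tuple[bool, str]:
    """Import a module and report (ok, detail) instead of raising.

    ImportError is the usual failure, but import-time guards (such as an
    environment-variable check) may raise other exception types, so any
    Exception is treated as a failed smoke test.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "imports OK"
```

In the venv built by the install step, one would call `declared_requirements("molecule-ai-workspace-runtime")` to print the declared a2a-sdk constraint and `smoke_import("molecule_runtime.main")` to decide the exit code, with WORKSPACE_ID set as in the workflow.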