Commit Graph

5 Commits

Author SHA1 Message Date
Hongming Wang
c77a88c247 chore(security): pin Actions to SHAs + enable Dependabot auto-bumps
Supply-chain hardening for the CI pipeline. 23 workflow files
modified, 59 mutable-tag refs replaced with commit SHAs.

The risk

Every `uses:` reference in .github/workflows/*.yml was pinned to a
mutable tag (e.g., `actions/checkout@v4`). A maintainer of an
action — or a compromised maintainer account — can repoint that
tag to malicious code, and our pipelines silently pull it on the
next run. The tj-actions/changed-files compromise of March 2025 is
the canonical example: maintainer credential leak, attacker
repointed several `@v<N>` tags to a payload that exfiltrated
repository secrets. Repos that pinned to SHAs were unaffected.

The fix

Replace each `@v<N>` with `@<commit-sha> # v<N>`. The trailing
comment preserves human readability ("ah, this is v4"); the SHA
makes the reference immutable.

Actions covered (10 distinct):
  actions/{checkout,setup-go,setup-python,setup-node,upload-artifact,github-script}
  docker/{login-action,setup-buildx-action,build-push-action}
  github/codeql-action/{init,autobuild,analyze}
  dorny/paths-filter
  imjasonh/setup-crane
  pnpm/action-setup (already pinned in molecule-app, listed here for completeness)

Excluded:
  Molecule-AI/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@main
    — internal org reusable workflow; we control its repo, threat model
    is different from third-party actions. Conventional to pin to @main
    rather than SHA for internal reusables.

The maintenance cost

SHA pinning means upstream fixes require manual SHA bumps. Without
automation, pinned SHAs go stale. So this PR also enables Dependabot
across four ecosystems:

  - github-actions (workflows)
  - gomod (workspace-server)
  - npm (canvas)
  - pip (workspace runtime requirements)

Weekly cadence — the supply-chain attack window is "minutes between
repoint and pull"; weekly auto-bumps don't help with zero-days
regardless. The point is to pull in non-zero-day fixes without
operator effort.

Aligns with user-stated principle: "long-term, robust, fully-
automated, eliminate human error."

Companion PR: Molecule-AI/molecule-controlplane#308 (same pattern,
smaller surface).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-28 15:37:06 -07:00
rabbitblood
5f3508fef0 ci: add merge_group trigger to ci + codeql
Pre-work for enabling GitHub merge queue on the staging branch (#TBD
follow-up issue). Without these triggers, the queue's pre-merge CI run
on the speculative `gh-readonly-queue/...` ref would never fire, every
queued PR would show false-green for the required checks, and queue
would merge things that don't actually pass on the rebased commit.

Adding the trigger now is **a no-op** — the `merge_group` event only
fires once the queue is enabled on a branch, which is a separate UI/API
toggle. So this PR is safe to land in isolation; merge-queue enablement
is the next step and reversible at the branch-protection level.

Why these two workflows:
- `ci.yml` provides 5 of the 8 required staging checks (Detect changes,
  Platform Go, Canvas Next.js, Python Lint & Test, Shellcheck E2E)
- `codeql.yml` provides the other 3 (Analyze go / js-ts / python)

Other workflows (e2e-staging-*, canary-*, publish-*) are not required
status checks and don't need the trigger to keep the queue working.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-23 21:24:53 -07:00
Hongming Wang
e298393df5 perf(ci): move all public-repo workflows to ubuntu-latest
molecule-core is a public repo — GHA-hosted minutes are free. The
self-hosted Mac mini was only in play to dodge GHA rate limits
(memory feedback_selfhosted_runner), but for these specific
workflows it came with real costs:

- Docker-push workflows emulated linux/amd64 from arm64 via QEMU —
  every canvas + platform image build ran ~2-3x slower than native.
- Six PRs worth of keychain-avoidance hacks in publish-* because
  `docker login` on macOS writes to osxkeychain unconditionally,
  and the Mac mini's launchd user-agent keychain is locked.
- Homebrew pin-down environment variables (HOMEBREW_NO_*) sprinkled
  everywhere to work around the shared /opt/homebrew symlink mess
  on the runner.
- Setup-python@v5 couldn't write to /Users/runner, so ci.yml
  python-lint resorted to a hand-rolled Homebrew python3.11 dance.
- Single runner → fan-out contention; CodeQL's 45-min analysis
  fought the canvas publish for the one slot.

Changes across the 7 workflows:

- runs-on: [self-hosted, macos, arm64] → ubuntu-latest (every job)
- publish-canvas-image + publish-workspace-server-image:
  drop the hand-rolled auths-map step + QEMU setup + buildx v4
  → docker/login-action@v3 + setup-buildx@v3. Linux + amd64
  target = native build.
- canary-verify + promote-latest: replace `brew install crane` +
  HOMEBREW_NO_* incantations with imjasonh/setup-crane@v0.4.
- codeql.yml: drop `brew install jq` — jq is preinstalled on
  ubuntu-latest.
- ci.yml shellcheck: drop the self-hosted existence check —
  shellcheck is preinstalled via apt.
- ci.yml python-lint: replace the Homebrew python3.11 path dance
  with actions/setup-python@v5 (which works fine on GHA-hosted),
  add requirements.txt caching while we're there.
- Remove stale comments referencing "the self-hosted runner",
  "Mac mini", keychain, osxkeychain etc.

The self-hosted Mac mini remains in service for private-repo
workflows only. Memory feedback_selfhosted_runner updated to
reflect the public-repo scope carve-out.

Net -96 lines across the 7 files.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-22 12:56:49 -07:00
molecule-ai[bot]
e6d48e6590 ci: add workflow-level concurrency to ci.yml and codeql.yml (#1242)
cancel-in-progress: false queues new runs so the single mac mini
runner doesn't fight itself when pushes stack during rebases or
cross-PR contention. Existing e2e-api.yml already has this pattern.

Fixes: 19 queued runs on single self-hosted runner (02:55 UTC snapshot)

Co-authored-by: Molecule AI Fullstack (floater) <fullstack-floater@agents.moleculesai.app>
2026-04-21 03:07:31 +00:00
Hongming Wang
07ec90a23c ci(codeql): cover main + staging via workflow
GitHub's UI-configured "Code quality" scan only fires on the default
branch (staging), which leaves every staging→main promotion PR
unscanned. The "On push and pull requests to" field in the UI has no
dropdown; multi-branch scanning on private repos without GHAS isn't
available there.

Workflow file gives us the control we can't get in the UI: triggers
on push + pull_request for both branches. Runs on the same
self-hosted mac mini via [self-hosted, macos, arm64].

upload: never — GHAS isn't enabled on this repo so the SARIF upload
API 403s. Keep results locally, filter to error+warning severity,
fail the PR check on findings, publish SARIF as a workflow artifact.
Flipping upload: never → always after GHAS is enabled (if ever) is
a one-line change.

Picks up the review-flagged improvements from the earlier closed PR:
  - jq install step (brew, no assumption it's present)
  - severity filter (error+warning only, drops noisy note-level)
  - set -euo pipefail
  - SARIF glob (file name doesn't match matrix language id)

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-19 14:34:04 -07:00