molecule-core is a public repo — GHA-hosted minutes are free. The self-hosted Mac mini was only in play to dodge GHA rate limits (memory feedback_selfhosted_runner), but for these specific workflows it came with real costs: - Docker-push workflows emulated linux/amd64 from arm64 via QEMU — every canvas + platform image build ran ~2-3x slower than native. - Six PRs worth of keychain-avoidance hacks in publish-* because `docker login` on macOS writes to osxkeychain unconditionally, and the Mac mini's launchd user-agent keychain is locked. - Homebrew pin-down environment variables (HOMEBREW_NO_*) sprinkled everywhere to work around the shared /opt/homebrew symlink mess on the runner. - Setup-python@v5 couldn't write to /Users/runner, so ci.yml python-lint resorted to a hand-rolled Homebrew python3.11 dance. - Single runner → fan-out contention; CodeQL's 45-min analysis fought the canvas publish for the one slot. Changes across the 7 workflows: - runs-on: [self-hosted, macos, arm64] → ubuntu-latest (every job) - publish-canvas-image + publish-workspace-server-image: drop the hand-rolled auths-map step + QEMU setup + buildx v4 → docker/login-action@v3 + setup-buildx@v3. Linux + amd64 target = native build. - canary-verify + promote-latest: replace `brew install crane` + HOMEBREW_NO_* incantations with imjasonh/setup-crane@v0.4. - codeql.yml: drop `brew install jq` — jq is preinstalled on ubuntu-latest. - ci.yml shellcheck: drop the self-hosted existence check — shellcheck is preinstalled via apt. - ci.yml python-lint: replace the Homebrew python3.11 path dance with actions/setup-python@v5 (which works fine on GHA-hosted), add requirements.txt caching while we're there. - Remove stale comments referencing "the self-hosted runner", "Mac mini", keychain, osxkeychain etc. The self-hosted Mac mini remains in service for private-repo workflows only. Memory feedback_selfhosted_runner updated to reflect the public-repo scope carve-out. Net -96 lines across the 7 files. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
140 lines
5.3 KiB
YAML
140 lines
5.3 KiB
YAML
name: E2E API Smoke Test
|
|
# Extracted from ci.yml so workflow-level concurrency can protect this job
|
|
# from run-level cancellation (issue #458).
|
|
#
|
|
# Problem: the job-level `concurrency.cancel-in-progress: false` in ci.yml
|
|
# prevented *sibling* E2E jobs from killing each other, but GitHub still
|
|
# cancelled the parent *workflow run* when a new push arrived. Since the job
|
|
# lived inside that run, it got cancelled too.
|
|
#
|
|
# Fix: a dedicated workflow gets its own concurrency group at the workflow
|
|
# level. New pushes to the same branch queue here instead of cancelling.
|
|
# Fast jobs (platform-build, canvas-build, etc.) stay in ci.yml and continue
|
|
# to benefit from run-level cancellation for quick feedback.
|
|
|
|
on:
|
|
push:
|
|
branches: [main]
|
|
paths:
|
|
- 'workspace-server/**'
|
|
- 'tests/e2e/**'
|
|
- '.github/workflows/e2e-api.yml'
|
|
pull_request:
|
|
branches: [main]
|
|
paths:
|
|
- 'workspace-server/**'
|
|
- 'tests/e2e/**'
|
|
- '.github/workflows/e2e-api.yml'
|
|
|
|
# Workflow-level concurrency: new runs queue rather than cancel.
|
|
# `cancel-in-progress: false` is load-bearing — without it GitHub would still
|
|
# cancel this run when the next push arrives, defeating the whole fix.
|
|
# The group key includes github.ref so PRs don't compete with main.
|
|
concurrency:
|
|
group: e2e-api-${{ github.ref }}
|
|
cancel-in-progress: false
|
|
|
|
jobs:
|
|
e2e-api:
|
|
name: E2E API Smoke Test
|
|
runs-on: ubuntu-latest
|
|
timeout-minutes: 15
|
|
# Postgres + Redis run as sibling containers via `docker run`. Could
|
|
# switch to a `services:` block now that we're on Linux, but the
|
|
# explicit start-and-wait gives us pg_isready / PING readiness checks
|
|
# that match the 30-tick timeouts the rest of the job expects. Ports
|
|
# 15432/16379 avoid collision with anything the host may already have
|
|
# on the standard ports.
|
|
env:
|
|
DATABASE_URL: postgres://dev:dev@localhost:15432/molecule?sslmode=disable
|
|
REDIS_URL: redis://localhost:16379
|
|
PORT: "8080"
|
|
PG_CONTAINER: molecule-ci-postgres
|
|
REDIS_CONTAINER: molecule-ci-redis
|
|
steps:
|
|
- uses: actions/checkout@v4
|
|
- uses: actions/setup-go@v5
|
|
with:
|
|
go-version: 'stable'
|
|
cache: true
|
|
cache-dependency-path: workspace-server/go.sum
|
|
- name: Start Postgres (docker)
|
|
run: |
|
|
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
|
|
docker run -d --name "$PG_CONTAINER" \
|
|
-e POSTGRES_USER=dev \
|
|
-e POSTGRES_PASSWORD=dev \
|
|
-e POSTGRES_DB=molecule \
|
|
-p 15432:5432 \
|
|
postgres:16
|
|
for i in $(seq 1 30); do
|
|
if docker exec "$PG_CONTAINER" pg_isready -U dev >/dev/null 2>&1; then
|
|
echo "Postgres ready after ${i}s"
|
|
exit 0
|
|
fi
|
|
sleep 1
|
|
done
|
|
echo "::error::Postgres did not become ready in 30s"
|
|
docker logs "$PG_CONTAINER" || true
|
|
exit 1
|
|
- name: Start Redis (docker)
|
|
run: |
|
|
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
|
|
docker run -d --name "$REDIS_CONTAINER" -p 16379:6379 redis:7
|
|
for i in $(seq 1 15); do
|
|
if docker exec "$REDIS_CONTAINER" redis-cli ping 2>/dev/null | grep -q PONG; then
|
|
echo "Redis ready after ${i}s"
|
|
exit 0
|
|
fi
|
|
sleep 1
|
|
done
|
|
echo "::error::Redis did not become ready in 15s"
|
|
exit 1
|
|
- name: Build platform
|
|
working-directory: workspace-server
|
|
run: go build -o platform-server ./cmd/server
|
|
- name: Start platform (background)
|
|
working-directory: workspace-server
|
|
run: |
|
|
./platform-server > platform.log 2>&1 &
|
|
echo $! > platform.pid
|
|
- name: Wait for /health
|
|
run: |
|
|
for i in $(seq 1 30); do
|
|
if curl -sf http://localhost:8080/health > /dev/null; then
|
|
echo "Platform up after ${i}s"
|
|
exit 0
|
|
fi
|
|
sleep 1
|
|
done
|
|
echo "::error::Platform did not become healthy in 30s"
|
|
cat workspace-server/platform.log || true
|
|
exit 1
|
|
- name: Assert migrations applied
|
|
# Migrations auto-run at platform boot. Fail fast if they silently
|
|
# didn't — catches future migration-author mistakes before the E2E run.
|
|
run: |
|
|
tables=$(docker exec "$PG_CONTAINER" psql -U dev -d molecule -tAc "SELECT count(*) FROM information_schema.tables WHERE table_schema='public' AND table_name='workspaces'")
|
|
if [ "$tables" != "1" ]; then
|
|
echo "::error::Migrations did not apply — 'workspaces' table missing"
|
|
cat workspace-server/platform.log || true
|
|
exit 1
|
|
fi
|
|
echo "Migrations OK (workspaces table present)"
|
|
- name: Run E2E API tests
|
|
run: bash tests/e2e/test_api.sh
|
|
- name: Dump platform log on failure
|
|
if: failure()
|
|
run: cat workspace-server/platform.log || true
|
|
- name: Stop platform
|
|
if: always()
|
|
run: |
|
|
if [ -f workspace-server/platform.pid ]; then
|
|
kill "$(cat workspace-server/platform.pid)" 2>/dev/null || true
|
|
fi
|
|
- name: Stop service containers
|
|
if: always()
|
|
run: |
|
|
docker rm -f "$PG_CONTAINER" 2>/dev/null || true
|
|
docker rm -f "$REDIS_CONTAINER" 2>/dev/null || true
|