Merge pull request #2827 from Molecule-AI/feat/e2e-required-on-staging-1777950000

ci: e2e coverage matrix + branch-protection-as-code (closes #9)
This commit is contained in:
Hongming Wang 2026-05-05 03:24:54 +00:00 committed by GitHub
commit a345adacad
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
4 changed files with 277 additions and 0 deletions

View File

@ -0,0 +1,42 @@
name: branch-protection drift check
# Catches out-of-band edits to branch protection (UI clicks, manual gh
# api PATCH from a one-off ops session) by comparing live state against
# tools/branch-protection/apply.sh's desired state every day. Fails the
# workflow when they drift; the failure is the signal.
#
# When it fails: re-run apply.sh to put the live state back to the
# script's intent, OR update apply.sh to encode the new intent and
# commit. Either way the script is the source of truth.
on:
schedule:
# 14:00 UTC daily. Off-hours for most teams; gives a fresh signal
# at the start of every working day.
- cron: '0 14 * * *'
workflow_dispatch:
pull_request:
branches: [staging, main]
paths:
- 'tools/branch-protection/**'
- '.github/workflows/branch-protection-drift.yml'
permissions:
contents: read
jobs:
drift:
name: Branch protection drift
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
- name: Run drift check
env:
# GH_TOKEN_FOR_ADMIN_API — repo-admin scope, needed for the
# /branches/:b/protection endpoint. Falls back to GITHUB_TOKEN
# for read-only mode (gh api on the protection endpoint
# returns 200 to maintainers, 404 to read-only). Configure at
# Settings → Secrets and variables → Actions.
GH_TOKEN: ${{ secrets.GH_TOKEN_FOR_ADMIN_API || secrets.GITHUB_TOKEN }}
run: bash tools/branch-protection/drift_check.sh

58
docs/e2e-coverage.md Normal file
View File

@ -0,0 +1,58 @@
# E2E coverage matrix
This document is the source of truth for which E2E suites guard which surfaces and which gates are wired up where. Read this before adding a new E2E or moving a check between branches.
## Suites
| Workflow file | Job (= required-check name) | What it covers | Cron |
|---|---|---|---|
| `e2e-api.yml` | `E2E API Smoke Test` | A2A handshake, registry/register, /workspaces/:id/a2a forward, structured-event emission. Lightweight enough to run on every PR. | — |
| `e2e-staging-canvas.yml` | `Canvas tabs E2E` | Canvas-tab Playwright UX checks against staging — config tab, secrets tab, agent-card tab, Activity hydration. | weekly Sun 08:00 UTC |
| `e2e-staging-saas.yml` | `E2E Staging SaaS` | Full lifecycle: org creation → workspace provision (CP path) → A2A delegation → status/heartbeat → workspace delete → EC2 termination. The integration test that catches the silent-drop bug class (#2486 / #2811 / #2813 / #2814). | daily 07:00 UTC |
| `e2e-staging-external.yml` | `E2E Staging External Runtime` | External-runtime registration + heartbeat staleness sweep + `/registry/peers` resolution. Validates the OSS-templated workspace path. | daily 07:30 UTC |
| `e2e-staging-sanity.yml` | `Intentional-failure teardown sanity` | Inverted assertion — the run MUST fail. Validates the leak-detection self-check itself; not for general gating. | weekly Mon 06:00 UTC |
| `continuous-synth-e2e.yml` | `Synthetic E2E against staging` | Standing background coverage between PR runs. Catches drift in production-like staging that PR-time E2Es miss. | every 15 min |
## Required-check status (branch protection)
| Suite | staging required | main required |
|---|---|---|
| `E2E API Smoke Test` | ✅ this PR | ✅ |
| `Canvas tabs E2E` | ✅ this PR | (see follow-up) |
| `E2E Staging SaaS` | ❌ — needs always-emit refactor | ❌ |
| `E2E Staging External Runtime` | ❌ — needs always-emit refactor | ❌ |
| `Intentional-failure teardown sanity` | ❌ inverted assertion, never required | ❌ |
| `Synthetic E2E against staging` | ❌ cron-only, not a per-PR gate | ❌ |
## Why the always-emit pattern matters
Branch protection requires a *check name* to land at SUCCESS for every PR. Workflows with `paths:` filters that exclude a PR never run, so the check name never appears, and the PR sits BLOCKED forever.
The pattern that supports being required is:
1. Workflow always triggers on push/PR to the protected branch.
2. A `detect-changes` job uses `dorny/paths-filter` to decide if real work runs.
3. The protected job runs unconditionally and either (a) does real work when paths matched, or (b) emits a no-op SUCCESS step when paths skipped.
`e2e-api.yml` and `e2e-staging-canvas.yml` already have this shape. `e2e-staging-saas.yml` and `e2e-staging-external.yml` use plain `paths:` filters and need the refactor before they can be required (filed as follow-up).
## Adding a new E2E suite
1. Pick a verb: smoke test, full lifecycle, fault-injection, drift detection. Pre-existing suites split along these lines.
2. Use the always-emit shape so the check name can be made required.
3. Add a row to the matrix above.
4. Decide cron cadence based on cost + how fast drift would otherwise be caught.
5. If you want it required, add to the relevant branch protection via `tools/branch-protection/apply.sh` (this PR adds the script).
## When to break glass — temporarily skip a required E2E
Don't. If an E2E is intermittently flaky, fix the test or move it out of required. The point of a required check is that it's load-bearing; bypassing one with admin override teaches the next operator the gate is optional.
If a Production incident requires bypassing, document the override in the incident postmortem with a same-week followup to either fix the test or rip the check out of required.
## Related issues / PRs
- #2486 — silent-drop bug class that the SaaS E2E now catches
- PR #2811`provisionWorkspaceAuto` consolidation (org-import SaaS gate)
- PR #2824`StopWorkspaceAuto` mirror (closes #2813 + #2814)
- Follow-up: refactor `e2e-staging-saas` + `e2e-staging-external` to always-emit (so they can be required)

133
tools/branch-protection/apply.sh Executable file
View File

@ -0,0 +1,133 @@
#!/usr/bin/env bash
# tools/branch-protection/apply.sh — idempotently apply branch
# protection to molecule-core's `staging` and `main` branches.
#
# Why a script: GitHub's branch protection lives in repo settings, so
# changes are usually clicked through the UI and lost between admins.
# This script makes the config reproducible — diff it against the live
# state, change the file, run it, done. Single source of truth that
# shows up in code review.
#
# Usage:
# tools/branch-protection/apply.sh # apply both branches
# tools/branch-protection/apply.sh --dry-run # show payload only
# tools/branch-protection/apply.sh --branch staging
#
# Requires: gh CLI authenticated as a repo admin. The script uses gh's
# token (no separate PAT needed).
set -euo pipefail
REPO="Molecule-AI/molecule-core"
DRY_RUN=0
ONLY_BRANCH=""
while [[ $# -gt 0 ]]; do
case "$1" in
--dry-run) DRY_RUN=1; shift ;;
--branch) ONLY_BRANCH="$2"; shift 2 ;;
-h|--help)
echo "Usage: $0 [--dry-run] [--branch <name>]"
exit 0
;;
*) echo "Unknown arg: $1" >&2; exit 1 ;;
esac
done
# Required-check matrices. Each branch's set is the canonical list of
# check NAMES (from each workflow's job-name). Adding/removing a check
# here is the place to do it. Match docs/e2e-coverage.md.
#
# Why staging gets E2E API + Canvas E2E (this PR's addition): both
# already use the always-emit pattern (path-filter no-ops emit SUCCESS),
# so making them required can't deadlock a PR that doesn't touch their
# paths. The other E2Es (SaaS, External) need a refactor to that
# pattern before they can be required — tracked as follow-up.
read -r -d '' STAGING_CHECKS <<'EOF' || true
Analyze (go)
Analyze (javascript-typescript)
Analyze (python)
Canvas (Next.js)
Canvas tabs E2E
Detect changes
E2E API Smoke Test
Platform (Go)
Python Lint & Test
Scan diff for credential-shaped strings
Shellcheck (E2E scripts)
EOF
read -r -d '' MAIN_CHECKS <<'EOF' || true
Analyze (go)
Analyze (javascript-typescript)
Analyze (python)
Canvas (Next.js)
Canvas tabs E2E
Detect changes
E2E API Smoke Test
PR-built wheel + import smoke
Platform (Go)
Python Lint & Test
Scan diff for credential-shaped strings
Shellcheck (E2E scripts)
EOF
build_payload() {
local checks="$1"
local require_reviews="$2" # true / false
local checks_json
checks_json=$(printf '%s\n' "$checks" | jq -Rs '
split("\n")
| map(select(length > 0))
| map({context: ., app_id: -1})
')
jq -n \
--argjson checks "$checks_json" \
--argjson reviews "$require_reviews" \
'{
required_status_checks: {
strict: false,
checks: $checks
},
enforce_admins: false,
required_pull_request_reviews: (
if $reviews then
{ required_approving_review_count: 1, dismiss_stale_reviews: true }
else null end
),
restrictions: null,
allow_deletions: false,
allow_force_pushes: false,
block_creations: false,
required_conversation_resolution: true,
required_linear_history: false,
lock_branch: false,
allow_fork_syncing: true
}'
}
apply_branch() {
local branch="$1"
local checks="$2"
local require_reviews="$3"
local payload
payload=$(build_payload "$checks" "$require_reviews")
if [[ "$DRY_RUN" -eq 1 ]]; then
echo "=== branch: $branch ==="
echo "$payload" | jq .
return
fi
echo "Applying branch protection on $branch..."
printf '%s' "$payload" | gh api -X PUT \
"repos/$REPO/branches/$branch/protection" \
--input -
echo "Applied: $branch"
}
if [[ -z "$ONLY_BRANCH" || "$ONLY_BRANCH" == "staging" ]]; then
apply_branch staging "$STAGING_CHECKS" true
fi
if [[ -z "$ONLY_BRANCH" || "$ONLY_BRANCH" == "main" ]]; then
apply_branch main "$MAIN_CHECKS" true
fi

View File

@ -0,0 +1,44 @@
#!/usr/bin/env bash
# tools/branch-protection/drift_check.sh — compare the live branch
# protection on staging + main against what apply.sh would set. Used
# by branch-protection-drift.yml (cron) to catch out-of-band UI edits.
#
# Exit codes:
# 0 — live state matches the script
# 1 — drift detected (output shows the diff)
# 2 — gh API call failed
set -euo pipefail
REPO="Molecule-AI/molecule-core"
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
EXIT_CODE=0
check_branch() {
local branch="$1"
local want
want=$(bash "$SCRIPT_DIR/apply.sh" --dry-run --branch "$branch" 2>&1 |
sed -n '/^{$/,/^}$/p' |
jq -S '.required_status_checks.checks | map(.context) | sort')
local have
if ! have=$(gh api "repos/$REPO/branches/$branch/protection/required_status_checks" 2>/dev/null |
jq -S '.checks | map(.context) | sort'); then
echo "drift_check: FAIL to fetch $branch protection (gh API error)"
return 2
fi
if [[ "$want" != "$have" ]]; then
echo "=== DRIFT on $branch ==="
echo "want:"; echo "$want"
echo "have:"; echo "$have"
diff <(echo "$want") <(echo "$have") || true
return 1
fi
echo "OK: $branch matches desired state"
}
for b in staging main; do
if ! check_branch "$b"; then
EXIT_CODE=1
fi
done
exit "$EXIT_CODE"