Every staging push run for the last 4 SHAs was cancelled by the
matching pull_request run because both fired into the same
concurrency group:

    group: ${{ github.workflow }}-${{ ...sha }}

Same SHA → same group → cancel-in-progress=true means the second
arrival cancels the first. Empirically the push run lost the race;
staging branch-protection then saw a CANCELLED required check and
the auto-promote chain stalled.

Fix: include github.event_name in the group key. push and
pull_request runs for the same SHA now hash to different groups,
both complete, both report SUCCESS to branch protection.
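
The collision and the fix can be illustrated with a toy model. This is a minimal sketch, not GitHub's scheduler: `Run`, `old_key`, `new_key`, `simulate`, and `demo` are hypothetical names, and the model assumes only that a newly started run cancels whichever run is in progress in its group.

```python
from dataclasses import dataclass


@dataclass
class Run:
    event: str
    sha: str
    conclusion: str = "in_progress"


def old_key(workflow: str, run: Run) -> str:
    # Pre-fix shape (assumed): workflow name + SHA only.
    return f"{workflow}-{run.sha}"


def new_key(workflow: str, run: Run) -> str:
    # Post-fix shape: workflow name + event name + SHA.
    return f"{workflow}-{run.event}-{run.sha}"


def simulate(workflow, runs, key_fn):
    """Start runs in order; a newcomer cancels the in-progress run in its group."""
    in_progress = {}
    for run in runs:
        key = key_fn(workflow, run)
        previous = in_progress.get(key)
        if previous is not None:
            previous.conclusion = "cancelled"
        in_progress[key] = run
    # Toy assumption: any run that was never cancelled finishes green.
    for run in runs:
        if run.conclusion == "in_progress":
            run.conclusion = "success"
    return [(run.event, run.conclusion) for run in runs]


def demo(key_fn):
    # Push run starts first, pull_request run second, both on the same SHA.
    runs = [Run("push", "1e8d7ae1"), Run("pull_request", "1e8d7ae1")]
    return simulate("Runtime PR-Built Compatibility", runs, key_fn)
```

With `old_key` the push run ends up cancelled and the pull_request run green; with `new_key` the two runs land in different groups and both finish.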

Pattern of the bug:

    10:46 sha=1e8d7ae1 ev=pull_request conclusion=success
    10:46 sha=1e8d7ae1 ev=push         conclusion=cancelled
    10:45 sha=ecf5f6fb ev=pull_request conclusion=success
    10:45 sha=ecf5f6fb ev=push         conclusion=cancelled
    10:28 sha=471dff25 ev=pull_request conclusion=success
    10:28 sha=471dff25 ev=push         conclusion=cancelled
    10:12 sha=9e678ccd ev=pull_request conclusion=success
    10:12 sha=9e678ccd ev=push         conclusion=cancelled
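
The pattern above can be checked mechanically. The sketch below assumes the run history is available as text in exactly the `time sha=… ev=… conclusion=…` shape shown; `parse_line` and `collided_shas` are hypothetical helpers, not part of any existing tooling.

```python
LOG = """\
10:46 sha=1e8d7ae1 ev=pull_request conclusion=success
10:46 sha=1e8d7ae1 ev=push conclusion=cancelled
10:45 sha=ecf5f6fb ev=pull_request conclusion=success
10:45 sha=ecf5f6fb ev=push conclusion=cancelled
10:28 sha=471dff25 ev=pull_request conclusion=success
10:28 sha=471dff25 ev=push conclusion=cancelled
10:12 sha=9e678ccd ev=pull_request conclusion=success
10:12 sha=9e678ccd ev=push conclusion=cancelled
"""


def parse_line(line):
    """Split 'HH:MM key=value ...' into a dict with a 'time' entry."""
    time, *fields = line.split()
    record = dict(field.split("=", 1) for field in fields)
    record["time"] = time
    return record


def collided_shas(log):
    """Return SHAs whose push run was cancelled while the PR run succeeded."""
    by_sha = {}
    for line in log.strip().splitlines():
        record = parse_line(line)
        by_sha.setdefault(record["sha"], {})[record["ev"]] = record["conclusion"]
    return [
        sha
        for sha, events in by_sha.items()
        if events.get("push") == "cancelled"
        and events.get("pull_request") == "success"
    ]
```

Run against the eight lines above, `collided_shas(LOG)` returns all four SHAs, matching the "every push cancelled, every PR green" signature.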

Same drift class as the 2026-04-28 auto-promote-staging incident
(memory: feedback_concurrency_group_per_sha.md): globally-scoped
groups silently cancel runs in matched-SHA scenarios.

This is the only workflow in .github/workflows/ that uses the
narrow per-sha shape without event_name. Others either don't use
concurrency at all, or use ${{ github.ref }}, which is
event-neutral.
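
An audit like the one described above could be approximated with a substring heuristic. This is a rough sketch under stated assumptions: `is_collision_prone` is a hypothetical helper, the group expressions are assumed to have been extracted from each workflow already, and every filename in `EXAMPLE_GROUPS` except `runtime-prbuild-compat.yml` is made up for illustration. The pre-fix expression shown is a stand-in, since the original key is elided in this message.

```python
def is_collision_prone(group_expr: str) -> bool:
    """Flag a group key that depends on a SHA but not on the event name."""
    keyed_on_sha = "sha" in group_expr
    keyed_on_event = "github.event_name" in group_expr
    return keyed_on_sha and not keyed_on_event


# Hypothetical inventory: workflow file -> concurrency group expression.
EXAMPLE_GROUPS = {
    "runtime-prbuild-compat.yml (pre-fix stand-in)":
        "${{ github.workflow }}-${{ github.sha }}",
    "runtime-prbuild-compat.yml (post-fix)":
        "${{ github.workflow }}-${{ github.event_name }}-"
        "${{ github.event.pull_request.head.sha || github.sha }}",
    "hypothetical-ref-scoped.yml":
        "${{ github.workflow }}-${{ github.ref }}",
}

flagged = [name for name, expr in EXAMPLE_GROUPS.items() if is_collision_prone(expr)]
```

Only the pre-fix stand-in is flagged. A substring match is crude (it would also flag any unrelated identifier containing "sha"), so it is a starting point for review rather than a gate.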
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
153 lines
7.4 KiB
YAML
name: Runtime PR-Built Compatibility

# Companion to `runtime-pin-compat.yml`. That workflow tests what's
# CURRENTLY PUBLISHED on PyPI; this workflow tests what WOULD BE
# PUBLISHED if THIS PR merges.
#
# Why two workflows: the chicken-and-egg #128 fix added a "PR-built
# wheel" job to the original runtime-pin-compat.yml, but both jobs
# shared a `paths:` filter that was the union of their needs
# (`workspace/**`). That meant the PyPI-latest job ran on every doc
# edit even though the upstream PyPI artifact can't change with our
# workspace/ source. Splitting the two means each gets a narrow
# `paths:` filter that matches the inputs it actually depends on.
#
# Catches the failure mode where a PR adds an import requiring a newer
# SDK than `workspace/requirements.txt` pins:
#   1. Pip resolves the existing PyPI wheel + the old SDK pin → smoke
#      passes (it imports the OLD main.py from the wheel, not the PR's
#      new main.py).
#   2. Merge → publish-runtime.yml ships a wheel WITH the new import.
#   3. Tenant images redeploy → all crash on first boot with
#      ImportError.
#
# By building from the PR's source and smoke-importing THAT wheel, we
# fail at PR-time instead of after publish.
#
# Required-check shape (2026-05-01): the workflow runs on EVERY push +
# PR + merge_group event with no top-level `paths:` filter, then uses a
# detect-changes job + per-step `if:` gates inside ONE always-running
# job named `PR-built wheel + import smoke`. PRs that don't touch
# wheel-relevant paths get a no-op SUCCESS check run, satisfying branch
# protection without re-running the heavy build. Same pattern as
# e2e-api.yml; see its comment for the full rationale + the 2026-04-29
# PR #2264 incident that motivated the always-run-with-if-gates shape.

on:
  push:
    branches: [main, staging]
  pull_request:
    branches: [main, staging]
  workflow_dispatch:
  merge_group:
    types: [checks_requested]

concurrency:
  # Include event_name so a PR sync (event=pull_request) and the
  # subsequent staging push (event=push) on the SAME merge SHA don't
  # collide in one group. Without event_name, both runs hashed to
  # the same key and cancel-in-progress=true let whichever run
  # arrived second cancel the one already in progress (usually the
  # push run), which staging branch-protection then sees as a
  # CANCELLED required check and refuses to mark merged. Caught
  # 2026-05-05 across PR #2869's runs (run ids 25371863455 /
  # 25371811486 / 25371078157 / 25370403142: every staging push run
  # cancelled, every matching PR run green).
  #
  # Per memory `feedback_concurrency_group_per_sha.md`: same drift
  # class that broke auto-promote-staging on 2026-04-28. Pin invariant:
  # event_name + sha is the minimum unique key for these workflows.
  group: ${{ github.workflow }}-${{ github.event_name }}-${{ github.event.pull_request.head.sha || github.sha }}
  cancel-in-progress: true

jobs:
  detect-changes:
    runs-on: ubuntu-latest
    outputs:
      wheel: ${{ steps.decide.outputs.wheel }}
    steps:
      - uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
      - uses: dorny/paths-filter@fbd0ab8f3e69293af611ebaee6363fc25e6d187d # v4.0.1
        id: filter
        with:
          filters: |
            wheel:
              - 'workspace/**'
              - 'scripts/build_runtime_package.py'
              - 'scripts/wheel_smoke.py'
              - '.github/workflows/runtime-prbuild-compat.yml'
      - id: decide
        # Always run real work for manual dispatch + merge_group. There
        # is no diff-against-base in those contexts, and the gate exists
        # to validate the to-be-merged state regardless of which paths it
        # touched (paths-filter would default to "no changes" which is
        # the wrong answer when the queue is composing many PRs).
        run: |
          if [ "${{ github.event_name }}" = "workflow_dispatch" ] || [ "${{ github.event_name }}" = "merge_group" ]; then
            echo "wheel=true" >> "$GITHUB_OUTPUT"
          else
            echo "wheel=${{ steps.filter.outputs.wheel }}" >> "$GITHUB_OUTPUT"
          fi

  # ONE job (no job-level `if:`) that always runs and reports under the
  # required-check name `PR-built wheel + import smoke`. Real work is
  # gated per-step on `needs.detect-changes.outputs.wheel`. Same shape
  # as e2e-api.yml's e2e-api job; see its comment block for the full
  # rationale (SKIPPED check runs block branch protection even with
  # SUCCESS siblings; collapsing to one always-run job emits exactly
  # one SUCCESS check run).
  local-build-install:
    needs: detect-changes
    name: PR-built wheel + import smoke
    runs-on: ubuntu-latest
    steps:
      - name: No-op pass (paths filter excluded this commit)
        if: needs.detect-changes.outputs.wheel != 'true'
        run: |
          echo "No workspace/ / scripts/{build_runtime_package,wheel_smoke}.py / workflow changes; wheel gate satisfied without rebuilding."
          echo "::notice::PR-built wheel + import smoke no-op pass (paths filter excluded this commit)."
      - if: needs.detect-changes.outputs.wheel == 'true'
        uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2
      - if: needs.detect-changes.outputs.wheel == 'true'
        uses: actions/setup-python@a309ff8b426b58ec0e2a45f0f869d46889d02405 # v6.2.0
        with:
          python-version: '3.11'
          cache: pip
          cache-dependency-path: workspace/requirements.txt
      - name: Install build tooling
        if: needs.detect-changes.outputs.wheel == 'true'
        run: pip install build
      - name: Build wheel from PR source (mirrors publish-runtime.yml)
        if: needs.detect-changes.outputs.wheel == 'true'
        # Use a fixed test version so the wheel filename is predictable.
        # Doesn't reach PyPI; this build is local-only for the smoke.
        # Use the SAME build script with the SAME args as
        # publish-runtime.yml's build step. The temp dir path differs
        # (`/tmp/runtime-build` here vs `${{ runner.temp }}/runtime-build`
        # in publish-runtime.yml; they coincide on ubuntu-latest but
        # the call sites are not byte-identical). The smoke import is
        # also intentionally narrower than publish's: this gate exists
        # to catch SDK-version-import drift specifically; full invariant
        # coverage lives in publish-runtime.yml's own pre-PyPI smoke.
        run: |
          python scripts/build_runtime_package.py \
            --version "0.0.0.dev0+pin-compat" \
            --out /tmp/runtime-build
          cd /tmp/runtime-build && python -m build
      - name: Install built wheel + workspace requirements
        if: needs.detect-changes.outputs.wheel == 'true'
        run: |
          python -m venv /tmp/venv-built
          /tmp/venv-built/bin/pip install --upgrade pip
          /tmp/venv-built/bin/pip install /tmp/runtime-build/dist/*.whl
          /tmp/venv-built/bin/pip install -r workspace/requirements.txt
          /tmp/venv-built/bin/pip show molecule-ai-workspace-runtime a2a-sdk \
            | grep -E '^(Name|Version):'
      - name: Smoke import the PR-built wheel
        if: needs.detect-changes.outputs.wheel == 'true'
        # Same script publish-runtime.yml runs against the to-be-PyPI wheel.
        # Closes the PR-time vs publish-time gap: a PR adding a new SDK
        # call-shape no longer passes here (narrow `import main_sync`) only
        # to fail post-merge in publish-runtime's broader smoke.
        run: |
          /tmp/venv-built/bin/python "$GITHUB_WORKSPACE/scripts/wheel_smoke.py"
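
The gating in the workflow above (the `decide` step plus the per-step `if:` conditions) reduces to a small decision function. The sketch below restates that logic in Python for illustration only; `decide_wheel` and `steps_that_run` are hypothetical names, and `filter_output` stands for the string that `steps.filter.outputs.wheel` would carry.

```python
# Events for which the real work always runs, regardless of changed paths.
ALWAYS_RUN_EVENTS = {"workflow_dispatch", "merge_group"}


def decide_wheel(event_name: str, filter_output: str) -> str:
    """Mirror the `decide` step: force 'true' for dispatch and merge queue."""
    if event_name in ALWAYS_RUN_EVENTS:
        return "true"
    return filter_output


def steps_that_run(wheel: str) -> str:
    """Mirror the per-step gates: exactly one branch runs, the job always reports."""
    return "build + smoke" if wheel == "true" else "no-op pass"
```

Either branch ends with the single `PR-built wheel + import smoke` job completing, which is what lets branch protection see one SUCCESS check run instead of a SKIPPED one.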