[infra-lead-agent] fix(ci): revert publish-* runs-on pin — docker label not yet registered (#576/#599 followup)
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 10s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 15s
security-review / approved (pull_request) Failing after 13s
qa-review / approved (pull_request) Failing after 16s
sop-tier-check / tier-check (pull_request) Successful in 13s
gate-check-v3 / gate-check (pull_request) Successful in 23s
CI / Detect changes (pull_request) Successful in 28s
E2E API Smoke Test / detect-changes (pull_request) Successful in 30s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 30s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 31s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 32s
CI / Platform (Go) (pull_request) Successful in 4s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Python Lint & Test (pull_request) Successful in 5s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 5s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 7s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 6s
CI / all-required (pull_request) Successful in 2s
audit-force-merge / audit (pull_request) Has been skipped
#599 changed `runs-on: ubuntu-latest` → `runs-on: [ubuntu-latest, docker]` in publish-workspace-server-image.yml and publish-canvas-image.yml to gate the jobs onto docker-capable runners. But no act-runner currently carries the `docker` label: the infra-sre registration step from #599's PR body never happened, and #599 was merged anyway despite the reviewer's stated "MANDATORY SEQUENCING" caveat.

Result: `[ubuntu-latest, docker]` matched ZERO eligible runners, and both publish-* workflows sat "Waiting to run" for >1.5h across main HEADs 41bb9e48→49a4c3a7. That is strictly worse than the pre-#599 coin-flip (~50% success).

This reverts `runs-on` to `ubuntu-latest` to restore scheduling. Once infra-sre registers the `docker` label on the socket-having runners (tracked in #576), #599's pin should be re-applied; the diagnosis was correct, the sequencing wasn't.

Workflow-only change → §SOP-13 §3 carve-out (tier:low). Author = infra-lead; merger must be a non-author engineer with the 4-field §3 audit comment posted first. Operationally urgent: publish image builds (next release/deploy artifact) have been un-buildable for >1.5h.
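
For reviewers, a minimal sketch of why the pin matched zero runners. It assumes the scheduler only assigns a job to a runner that carries every label listed in `runs-on`, which is consistent with the "Waiting to run" behaviour described above. The function name and the label sets are illustrative assumptions, not act-runner code or the real runner registry; only the runner names come from the workflow comments.

```python
def eligible_runners(required_labels, runners):
    """Return the sorted names of runners that carry every required label.

    Hypothetical model of label matching: a runner is eligible only if the
    set of required labels is a subset of the runner's own labels.
    """
    needed = set(required_labels)
    return sorted(name for name, labels in runners.items() if needed <= set(labels))


# Assumed current state: neither runner has the `docker` label registered.
runners_now = {
    "molecule-runner-1": {"ubuntu-latest"},  # no docker socket
    "molecule-runner-4": {"ubuntu-latest"},  # has the socket, label not registered yet
}

# Assumed state after infra-sre registers `docker` on the socket-having runner.
runners_after_registration = {
    "molecule-runner-1": {"ubuntu-latest"},
    "molecule-runner-4": {"ubuntu-latest", "docker"},
}

pinned_now = eligible_runners(["ubuntu-latest", "docker"], runners_now)
reverted_now = eligible_runners(["ubuntu-latest"], runners_now)
pinned_later = eligible_runners(["ubuntu-latest", "docker"], runners_after_registration)

print("pin, label missing:   ", pinned_now)
print("revert, label missing:", reverted_now)
print("pin, label registered:", pinned_later)
```

Under these assumptions the pin only becomes schedulable after the label exists, which is why the registration has to land before #599's pin is re-applied.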
Parent: 49a4c3a736
Commit: 3ea24916d0
@@ -54,11 +54,13 @@ env:
 jobs:
   build-and-push:
     name: Build & push canvas image
-    # NOTE: infra-sre must register a `docker` label on every act-runner that
-    # mounts /var/run/docker.sock (group=docker, socket perms 660+). Jobs without
-    # the `docker` label land on runners that lack the socket and fail here.
-    # See issue #576.
-    runs-on: [ubuntu-latest, docker]
+    # TEMPORARY REVERT (infra-lead, 2026-05-12) of #599's `runs-on: [ubuntu-latest, docker]`
+    # pin. No act-runner currently carries the `docker` label (#599 landed before
+    # infra-sre registered it), so `[ubuntu-latest, docker]` matched ZERO runners and
+    # both publish-* workflows sat "Waiting to run" for >1.5h. Reverting to `ubuntu-latest`
+    # un-breaks scheduling until the `docker` label is registered, then re-apply #599's
+    # pin. See #576 + #599.
+    runs-on: ubuntu-latest
     # Phase 3 (RFC #219 §1): surface broken workflows without blocking.
     continue-on-error: true
     steps:

@@ -52,12 +52,14 @@ env:
 
 jobs:
   build-and-push:
-    # NOTE: infra-sre must register a `docker` label on every act-runner that
-    # mounts /var/run/docker.sock (group=docker, socket perms 660+). Jobs without
-    # the `docker` label land on runners that lack the socket and fail here.
-    # molecule-runner-1 (no socket) vs molecule-runner-4 (socket) — coin-flip
-    # without this label gate. See issue #576.
-    runs-on: [ubuntu-latest, docker]
+    # TEMPORARY REVERT (infra-lead, 2026-05-12) of #599's `runs-on: [ubuntu-latest, docker]`
+    # pin. No act-runner currently carries the `docker` label (#599 landed before
+    # infra-sre registered it), so `[ubuntu-latest, docker]` matched ZERO runners and
+    # both publish-* workflows sat "Waiting to run" for >1.5h — strictly worse than the
+    # pre-#599 coin-flip. Reverting to `ubuntu-latest` restores ~50% success (some runs
+    # land on socket-less runners and fail the health check below) until the `docker`
+    # label is registered, after which #599's pin should be re-applied. See #576 + #599.
+    runs-on: ubuntu-latest
     steps:
       - name: Checkout
         uses: actions/checkout@de0fac2e4500dabe0009e67214ff5f5447ce83dd # v6.0.2