Merge pull request 'ci: fix publish Docker healthcheck pipefail' (#952) from fix/publish-healthcheck-pipefail into main
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Waiting to run
CI / Detect changes (pull_request) Waiting to run
CI / Platform (Go) (pull_request) Blocked by required conditions
CI / Canvas (Next.js) (pull_request) Blocked by required conditions
CI / Shellcheck (E2E scripts) (pull_request) Blocked by required conditions
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
CI / Python Lint & Test (pull_request) Blocked by required conditions
CI / all-required (pull_request) Blocked by required conditions
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 40s
Handlers Postgres Integration / detect-changes (push) Successful in 35s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 39s
CI / Detect changes (push) Successful in 42s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 15s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 29s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 20s
CI / Platform (Go) (push) Successful in 13s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m22s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m40s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 15s
CI / Canvas (Next.js) (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 7s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 2m19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 14s
CI / Canvas Deploy Reminder (push) Successful in 5s
CI / all-required (push) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 4m38s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m35s
ci-required-drift / drift (push) Successful in 2m37s
publish-workspace-server-image / build-and-push (push) Successful in 8m42s
publish-workspace-server-image / Production auto-deploy (push) Failing after 2m4s
status-reaper / reap (push) Has started running
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 12s
gitea-merge-queue / queue (push) Successful in 30s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m34s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m53s
redeploy-tenants-on-main / redeploy (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
Some checks failed
Block internal-flavored paths / Block forbidden paths (pull_request) Waiting to run
CI / Detect changes (pull_request) Waiting to run
CI / Platform (Go) (pull_request) Blocked by required conditions
CI / Canvas (Next.js) (pull_request) Blocked by required conditions
CI / Shellcheck (E2E scripts) (pull_request) Blocked by required conditions
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
CI / Python Lint & Test (pull_request) Blocked by required conditions
CI / all-required (pull_request) Blocked by required conditions
E2E API Smoke Test / detect-changes (pull_request) Waiting to run
E2E API Smoke Test / E2E API Smoke Test (pull_request) Blocked by required conditions
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Waiting to run
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Blocked by required conditions
Handlers Postgres Integration / detect-changes (pull_request) Waiting to run
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Blocked by required conditions
lint-required-no-paths / lint-required-no-paths (pull_request) Waiting to run
Runtime PR-Built Compatibility / detect-changes (pull_request) Waiting to run
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Blocked by required conditions
Secret scan / Scan diff for credential-shaped strings (pull_request) Waiting to run
gate-check-v3 / gate-check (pull_request) Waiting to run
qa-review / approved (pull_request) Waiting to run
security-review / approved (pull_request) Waiting to run
sop-checklist / all-items-acked (pull_request) Waiting to run
sop-tier-check / tier-check (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (push) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 12s
E2E API Smoke Test / detect-changes (push) Successful in 40s
Handlers Postgres Integration / detect-changes (push) Successful in 35s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 39s
CI / Detect changes (push) Successful in 42s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 15s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 29s
Sweep stale Cloudflare DNS records / Sweep CF orphans (push) Successful in 20s
CI / Platform (Go) (push) Successful in 13s
Ops Scripts Tests / Ops scripts (unittest) (push) Successful in 1m22s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (push) Successful in 1m40s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 15s
CI / Canvas (Next.js) (push) Successful in 9s
CI / Shellcheck (E2E scripts) (push) Successful in 6s
CI / Python Lint & Test (push) Successful in 7s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (push) Successful in 2m19s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 14s
CI / Canvas Deploy Reminder (push) Successful in 5s
CI / all-required (push) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 4m38s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m35s
ci-required-drift / drift (push) Successful in 2m37s
publish-workspace-server-image / build-and-push (push) Successful in 8m42s
publish-workspace-server-image / Production auto-deploy (push) Failing after 2m4s
status-reaper / reap (push) Has started running
Sweep stale e2e-* orgs (staging) / Sweep e2e orgs (push) Successful in 12s
gitea-merge-queue / queue (push) Successful in 30s
Staging SaaS smoke (every 30 min) / Staging SaaS smoke (push) Successful in 4m34s
Continuous synthetic E2E (staging) / Synthetic E2E against staging (push) Successful in 4m53s
redeploy-tenants-on-main / redeploy (push) Compensated by status-reaper (workflow has no push: trigger; Gitea 1.22.6 hardcoded-suffix bug — see .gitea/scripts/status-reaper.py)
This commit is contained in:
commit
38d12c6d41
@ -36,6 +36,9 @@ Rules (4 fatal + 1 fatal cross-file + 1 heuristic-warn):
|
||||
raw `.error` fields into CI logs/summaries.
|
||||
9. Production deploy/redeploy workflows must expose an operational control:
|
||||
kill switch for auto deploys or rollback tag for manual deploys.
|
||||
10. Docker health checks must not run `docker info | head` under pipefail.
|
||||
`head` closes the pipe early, `docker info` can exit nonzero from
|
||||
SIGPIPE, and the step can falsely report Docker daemon failure.
|
||||
|
||||
Per `feedback_smoke_test_vendor_truth_not_shape_match`: fixtures used to
|
||||
validate this lint must mirror real Gitea 1.22.6 YAML semantics, not
|
||||
@ -225,6 +228,24 @@ def _iter_uses(doc: Any) -> Iterable[str]:
|
||||
yield step["uses"]
|
||||
|
||||
|
||||
def _iter_run_blocks(doc: Any) -> Iterable[str]:
|
||||
"""Yield every shell `run:` block from job steps in a workflow document."""
|
||||
if not isinstance(doc, dict):
|
||||
return
|
||||
jobs = doc.get("jobs")
|
||||
if not isinstance(jobs, dict):
|
||||
return
|
||||
for job in jobs.values():
|
||||
if not isinstance(job, dict):
|
||||
continue
|
||||
steps = job.get("steps")
|
||||
if not isinstance(steps, list):
|
||||
continue
|
||||
for step in steps:
|
||||
if isinstance(step, dict) and isinstance(step.get("run"), str):
|
||||
yield step["run"]
|
||||
|
||||
|
||||
def check_cross_repo_uses(filename: str, doc: Any) -> list[str]:
|
||||
"""Return per-violation error lines for cross-repo `uses:` references."""
|
||||
errors: list[str] = []
|
||||
@ -264,6 +285,10 @@ GITHUB_API_REF_RE = re.compile(
|
||||
|
||||
PROD_CP_URL_RE = re.compile(r"https://api\.moleculesai\.app\b")
|
||||
REDEPLOY_FLEET_RE = re.compile(r"\b/cp/admin/tenants/redeploy-fleet\b")
|
||||
RUN_SETS_PIPEFAIL_RE = re.compile(r"(?m)^\s*set\s+-[^\n]*o\s+pipefail\b")
|
||||
DOCKER_INFO_HEAD_PIPE_RE = re.compile(
|
||||
r"(?m)^\s*docker\s+info\b[^\n|]*\|\s*head\b"
|
||||
)
|
||||
RAW_CP_RESPONSE_RE = re.compile(
|
||||
r"""(?x)
|
||||
(?:\bjq\s+\.\s+["']?\$HTTP_RESPONSE["']?)
|
||||
@ -383,6 +408,30 @@ def check_production_operational_control(filename: str, raw: str) -> list[str]:
|
||||
return errors
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Rule 10 — docker info piped to head under pipefail
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def check_docker_info_head_pipefail(filename: str, doc: Any) -> list[str]:
|
||||
errors: list[str] = []
|
||||
for run_block in _iter_run_blocks(doc):
|
||||
if not (
|
||||
RUN_SETS_PIPEFAIL_RE.search(run_block)
|
||||
and DOCKER_INFO_HEAD_PIPE_RE.search(run_block)
|
||||
):
|
||||
continue
|
||||
errors.append(
|
||||
f"::error file={filename}::Rule 10 (FATAL): workflow runs "
|
||||
f"`docker info | head` after enabling `pipefail`. `head` can "
|
||||
f"close the pipe early, making `docker info` exit nonzero and "
|
||||
f"falsely fail the Docker daemon health check. Capture "
|
||||
f"`docker_info=\"$(docker info 2>&1)\"` first, then print a "
|
||||
f"bounded preview with `printf ... | sed -n '1,5p'`."
|
||||
)
|
||||
break
|
||||
return errors
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Driver
|
||||
# ---------------------------------------------------------------------------
|
||||
@ -436,6 +485,7 @@ def main(argv: list[str] | None = None) -> int:
|
||||
fatal_errors.extend(check_production_concurrency(rel, doc, raw))
|
||||
fatal_errors.extend(check_production_raw_response_logging(rel, raw))
|
||||
fatal_errors.extend(check_production_operational_control(rel, raw))
|
||||
fatal_errors.extend(check_docker_info_head_pipefail(rel, doc))
|
||||
warnings.extend(check_github_server_url_missing(rel, doc, raw))
|
||||
|
||||
# Cross-file checks
|
||||
|
||||
@ -68,12 +68,14 @@ jobs:
|
||||
set -euo pipefail
|
||||
echo "::group::Docker daemon health check"
|
||||
echo "Runner: ${HOSTNAME:-unknown}"
|
||||
docker info 2>&1 | head -5 || {
|
||||
docker_info="$(docker info 2>&1)" || {
|
||||
echo "::error::Docker daemon is not accessible at /var/run/docker.sock"
|
||||
echo "::error::Runner: ${HOSTNAME:-unknown}"
|
||||
printf '%s\n' "${docker_info}"
|
||||
echo "::error::Check: (1) daemon is running, (2) runner user is in docker group, (3) sock permissions are 660+"
|
||||
exit 1
|
||||
}
|
||||
printf '%s\n' "${docker_info}" | sed -n '1,5p'
|
||||
echo "Docker daemon OK"
|
||||
echo "::endgroup::"
|
||||
|
||||
|
||||
@ -545,6 +545,70 @@ def test_rule9_prod_manual_deploy_allows_rollback_control(tmp_path):
|
||||
assert r.returncode == 0, f"stdout={r.stdout}\nstderr={r.stderr}"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Rule 10 — docker info piped to head under pipefail
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
DOCKER_INFO_HEAD_BAD = """
|
||||
name: docker-info-head-bad
|
||||
on: [push]
|
||||
jobs:
|
||||
build:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- run: |
|
||||
set -euo pipefail
|
||||
docker info 2>&1 | head -5 || exit 1
|
||||
"""
|
||||
|
||||
DOCKER_INFO_CAPTURE_OK = """
|
||||
name: docker-info-capture-ok
|
||||
on: [push]
|
||||
jobs:
|
||||
build:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- run: |
|
||||
set -euo pipefail
|
||||
docker_info="$(docker info 2>&1)" || exit 1
|
||||
printf '%s\\n' "${docker_info}" | sed -n '1,5p'
|
||||
"""
|
||||
|
||||
DOCKER_INFO_SEPARATE_STEP_OK = """
|
||||
name: docker-info-separate-step-ok
|
||||
on: [push]
|
||||
jobs:
|
||||
build:
|
||||
runs-on: ubuntu-latest
|
||||
steps:
|
||||
- run: |
|
||||
set -euo pipefail
|
||||
echo setup
|
||||
- run: |
|
||||
docker info 2>&1 | head -5 || true
|
||||
"""
|
||||
|
||||
|
||||
def test_rule10_docker_info_head_under_pipefail_detects_violation(tmp_path):
|
||||
_write(tmp_path, "bad.yml", DOCKER_INFO_HEAD_BAD)
|
||||
r = _run_lint(tmp_path)
|
||||
assert r.returncode == 1
|
||||
assert "docker info" in r.stdout.lower()
|
||||
assert "pipefail" in r.stdout.lower()
|
||||
|
||||
|
||||
def test_rule10_docker_info_capture_passes(tmp_path):
|
||||
_write(tmp_path, "ok.yml", DOCKER_INFO_CAPTURE_OK)
|
||||
r = _run_lint(tmp_path)
|
||||
assert r.returncode == 0, f"stdout={r.stdout}\nstderr={r.stderr}"
|
||||
|
||||
|
||||
def test_rule10_docker_info_head_in_separate_step_without_pipefail_passes(tmp_path):
|
||||
_write(tmp_path, "ok.yml", DOCKER_INFO_SEPARATE_STEP_OK)
|
||||
r = _run_lint(tmp_path)
|
||||
assert r.returncode == 0, f"stdout={r.stdout}\nstderr={r.stderr}"
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# CI change detector fanout — workflow-only PRs keep required contexts without
|
||||
# running Go/Canvas/Python/shellcheck heavy steps.
|
||||
|
||||
Loading…
Reference in New Issue
Block a user