fix(ci): repair docker-host guardrail follow-up #1561

Merged
hongming merged 1 commit from fix/ci-docker-host-guardrail-red into main 2026-05-19 03:39:51 +00:00
Owner

Fixes the current main red introduced by the docker-host guardrail follow-up.

Evidence from the 19:42 PDT triage:

  • lint-required-workflows-docker-host-pinned failed on molecule-core/main@c6e89219e110 with panic: unclosed string because the Python heredoc contained a literal Gitea expression marker.
  • After removing that parser trigger, the guardrail exposed a second bug: it scanned the whole workflow file and marked harmless jobs if any sibling job used Docker.
  • CI / Shellcheck (E2E scripts) failed on three SC2034 warnings in peer-visibility map/verdict plumbing.

Fix:

  • Spell the expression marker indirectly in the heredoc.
  • Scan Docker usage per job body instead of per workflow file.
  • Pin the remaining Docker-bound jobs the guardrail now correctly reports.
  • Suppress intentional SC2034 assignments in the peer-visibility scripts.
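
The first fix item can be illustrated with a minimal sketch. Only the `'$' + '{{'` idiom comes from this PR; the helper name `uses_expression` and the sample strings are hypothetical. The point is that the marker is assembled at runtime, so the literal sequence never appears in the heredoc source that the workflow parser reads.

```python
# Hypothetical sketch: only the '$' + '{{' idiom is taken from the PR.
# Assemble the expression opener from two pieces so the literal sequence
# never appears verbatim in the embedded script's source text.
expression_marker = "$" + "{{"


def uses_expression(line: str) -> bool:
    """Return True if the line contains an Actions-style expression opener."""
    return expression_marker in line


if __name__ == "__main__":
    sample = "runs-on: " + expression_marker + " matrix.os }}"
    print(uses_expression(sample))                  # True
    print(uses_expression("runs-on: docker-host"))  # False
```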

Local verification:

  • Extracted and executed the guardrail heredoc: OK: all docker-bound jobs are pinned to docker-host or publish.
  • shellcheck --severity=warning tests/e2e/lib/peer_visibility_assert.sh tests/e2e/test_peer_visibility_mcp_local.sh
  • python3 .gitea/scripts/lint-workflow-yaml.py --workflow-dir .gitea/workflows
  • git diff --check
hongming added 1 commit 2026-05-19 02:50:51 +00:00
fix(ci): repair docker-host guardrail follow-up
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 12s
E2E API Smoke Test / detect-changes (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 20s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Has been skipped
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 11s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Failing after 1m10s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 31s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m35s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 35s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 4s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m46s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request) Successful in 4s
qa-review / approved (pull_request) Failing after 4s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 29s
security-review / approved (pull_request) Failing after 17s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m9s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-tier-check / tier-check (pull_request) Successful in 7s
E2E Chat / E2E Chat (pull_request) Successful in 9s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 6s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 4m58s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Failing after 1m12s
CI / Canvas (Next.js) (pull_request) Successful in 6m13s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / Python Lint & Test (pull_request) Successful in 6m56s
CI / all-required (pull_request) Successful in 7m11s
audit-force-merge / audit (pull_request) Successful in 5s
00351b4551
hongming added the area/ci, kind/infrastructure, and tier:high labels 2026-05-19 02:51:33 +00:00
core-devops approved these changes 2026-05-19 03:37:54 +00:00
Dismissed
core-devops left a comment
Member

Five-axis pass (core-devops):

  1. Correctness: docker-host lint script previously scanned the entire workflow body for docker-exec usage and matched any single runs-on it found — silently broken for multi-job workflows. Fix scans raw_lines[j.line - 1:j.end] per job: correct slicing, end defaults to len(raw_lines) for the last job, then re-set to i - 1 when a new header appears. Sound.
  2. runs-on: ubuntu-latest → docker-host (or publish for the canvas image build) migration on 7 workflows aligns with mc#1529 + internal#512 + feedback_cp_workspaces_must_run_as_docker_not_native_systemd. Docker-bound jobs that drift onto the Windows act_runner break non-deterministically; pinning to the dedicated host label is the durable fix.
  3. expression_marker = '$' + '{{' indirection sidesteps the Gitea-1.22.6-hostile workflow-yaml linter literal that would parse the embedded Python heredoc. Cheap, local, correct.
  4. shellcheck SC2034 disables on indirect-eval map vars (WS_IDS_MAP / VERDICT_MAP / PV_VERDICT) are accurate — these ARE read through portable bash-3.2 eval shims; suppression is targeted, not file-wide.
  5. CI: CI / all-required is green on the head commit (id=80, 2026-05-19T02:58:21Z), proving the new lint logic passes on the very workflows it migrates.

LGTM. Unblocks the shared upstream failure for mc#1559 + mc#1563.
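
The per-job slicing described in point 1 can be sketched roughly as follows. This is not the script from the PR: the `Job` dataclass, both regexes, the two-space job-header indentation, and the assumption that `jobs:` is the last top-level key are all illustrative. Only the `raw_lines[j.line - 1:j.end]` slice, the `len(raw_lines)` default for `end`, and the reset to `i - 1` at the next header follow the review text.

```python
import re
from dataclasses import dataclass

# Assumption: job headers sit at exactly two spaces under a trailing `jobs:` key.
JOB_HEADER = re.compile(r"^  ([A-Za-z_][\w-]*):\s*$")
# Assumption: a crude stand-in for "this job touches Docker".
DOCKER_USE = re.compile(r"\bdocker\s+(?:build|compose|exec|pull|push|run)\b")
ALLOWED_LABELS = {"docker-host", "publish"}


@dataclass
class Job:
    name: str
    line: int  # 1-based line number of the job header
    end: int   # 1-based line number of the job's last line (inclusive)


def find_jobs(raw_lines):
    """Locate each job and the line range it spans."""
    jobs, in_jobs = [], False
    for i, text in enumerate(raw_lines, start=1):
        if text.startswith("jobs:"):
            in_jobs = True
            continue
        match = JOB_HEADER.match(text) if in_jobs else None
        if match:
            if jobs:
                jobs[-1].end = i - 1  # previous job stops before this header
            jobs.append(Job(match.group(1), i, len(raw_lines)))
    return jobs


def unpinned_docker_jobs(raw_lines):
    """Names of jobs that use Docker but are not pinned to an allowed label."""
    offenders = []
    for j in find_jobs(raw_lines):
        body = raw_lines[j.line - 1:j.end]  # this job only, not its siblings
        if not any(DOCKER_USE.search(text) for text in body):
            continue
        labels = [t.split(":", 1)[1].strip() for t in body
                  if t.strip().startswith("runs-on:")]
        if not labels or labels[0] not in ALLOWED_LABELS:
            offenders.append(j.name)
    return offenders


WORKFLOW = """\
name: demo
on: [push]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - run: echo "no containers here"
  build:
    runs-on: ubuntu-latest
    steps:
      - run: docker build -t demo .
  ship:
    runs-on: publish
    steps:
      - run: docker push demo
"""

if __name__ == "__main__":
    print(unpinned_docker_jobs(WORKFLOW.splitlines()))  # ['build']
```

In this sample the `lint` job is left alone even though its sibling `build` uses Docker, which is the behaviour a whole-file scan cannot provide.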

core-devops approved these changes 2026-05-19 03:38:09 +00:00
Dismissed
core-devops approved these changes 2026-05-19 03:38:49 +00:00
Dismissed
core-devops approved these changes 2026-05-19 03:38:54 +00:00
core-security approved these changes 2026-05-19 03:39:17 +00:00
core-security left a comment
Member

Five-axis pass (core-security):

  1. Secret exposure: zero. No env-var reads, no token plumbing, no logging additions that touch credentials. The 7 runs-on swaps and 2 shellcheck disable comments don't change any secrets surface.
  2. Injection-shaped shell construction: none. The two added # shellcheck disable=SC2034 annotations are scoped to assignment lines for variables read through portable bash-3.2 eval-based map shims (WS_IDS_MAP, VERDICT_MAP, PV_VERDICT). The eval paths are unchanged in this PR — disable is targeted and accurate, not papering over a real injection.
  3. The Python lint script edit (.gitea/workflows/lint-required-workflows-docker-host-pinned.yml) takes its inputs from workflow filenames under .gitea/workflows / .github/workflows via os.listdir and reads file contents with open(...). No user-supplied data flows into a shell or eval; no string interpolation into a command. The expression_marker = '$' + '{{' indirection is a literal-string defense (avoids the workflow yaml linter parsing the heredoc), not an injection vector.
  4. runs-on: ubuntu-latest → docker-host / publish migrations move workloads from a hosted-runner label to a self-hosted label. Self-hosted runners DO inherit any secrets the workflow declares, but no new secrets.* references are introduced in this PR, and the dedicated docker-host / publish runners are the canonical surface for these jobs per fleet policy (mc#1529 / internal#512). Migration does not change the secrets blast radius.
  5. CI: CI / all-required is green on head commit 00351b4 (id=80, 2026-05-19T02:58:21Z). Secret-scan context (Secret scan / Scan diff for credential-shaped strings) passed alongside.

LGTM from a security review standpoint. No new attack surface.
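
The input path described in point 3 might look something like the sketch below. The function name `read_workflows`, the `root` parameter, and the `.yml`/`.yaml` filter are assumptions for illustration; the directory names, `os.listdir`, and `open(...)` are the only parts taken from the review. Nothing here passes data to a shell or to `eval`.

```python
import os

WORKFLOW_DIRS = (".gitea/workflows", ".github/workflows")


def read_workflows(root="."):
    """Map each workflow's relative path to its lines, using only listdir and open."""
    found = {}
    for rel_dir in WORKFLOW_DIRS:
        directory = os.path.join(root, rel_dir)
        if not os.path.isdir(directory):
            continue  # a repository may have only one of the two directories
        for name in sorted(os.listdir(directory)):
            if not name.endswith((".yml", ".yaml")):
                continue
            path = os.path.join(directory, name)
            with open(path, encoding="utf-8") as handle:
                found[os.path.join(rel_dir, name)] = handle.read().splitlines()
    return found


if __name__ == "__main__":
    for path, lines in read_workflows().items():
        print(path, len(lines))
```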

hongming merged commit 517327aa1e into main 2026-05-19 03:39:51 +00:00
Reference: molecule-ai/molecule-core#1561