ci(lint): guard actions/setup-go cache on self-hosted fleet #2539
Static workflow-shape lint that makes the actions/setup-go cache bind-mount corruption (2026-06-09/10 rollout) impossible to reintroduce.
Forbidden shape: any actions/setup-go step with caching enabled — cache:true explicit, OR cache-dependency-path/cache-key set with no cache: (defaults true), OR a bare setup-go with no cache: (still default-true). The only safe value on our self-hosted fleet is explicit cache:false.
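A minimal sketch of how the forbidden shape could be classified for one already-parsed workflow step. This is not the contents of lint_setup_go_cache.py; the function name `setup_go_cache_violation` and the returned reason strings are hypothetical, and the input names checked are only the ones named in this description.

```python
from typing import Any, Mapping, Optional


def setup_go_cache_violation(step: Mapping[str, Any]) -> Optional[str]:
    """Return a reason if the step is a setup-go step with caching enabled.

    `step` is one workflow step as a plain dict (the shape a YAML parser
    would produce). Returns None for safe or unrelated steps.
    """
    uses = str(step.get("uses", ""))
    # Match actions/setup-go at any ref, e.g. actions/setup-go@v5.
    if uses.split("@", 1)[0].strip() != "actions/setup-go":
        return None

    inputs = step.get("with") or {}
    if "cache" in inputs:
        value = inputs["cache"]
        # YAML may yield a bool False or the string "false".
        if value is False or str(value).strip().lower() == "false":
            return None
        return "cache is explicitly enabled"

    # No cache: input at all, so the action's default (true) applies.
    if "cache-dependency-path" in inputs or "cache-key" in inputs:
        return "cache-dependency-path/cache-key set with no cache: (defaults to true)"
    return "no cache: input (defaults to true)"
```

Under these assumptions, only a step with an explicit `cache: false` (or a step that is not setup-go at all) comes back as None; every other setup-go shape yields a reason string.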
Why: runners bind-mount a host-shared GOCACHE/GOMODCACHE (/var/cache/ci-go-{build,mod}, operator-config ops/runners/config.dedicated.yaml). actions/cache untars OVER that bind mount -> File exists -> partial cache -> test -race link / go-arch-lint failures.
Phase / coordination: lands continue-on-error: true (advisory). The sweep PR #2524 (fix/setup-go-cache-vs-bind-mount) removes the remaining cache:true hits in 8 e2e workflows; THIS PR additionally removes the 4 default-true hits #2524 does not touch (ci.yml, ci-arm64-advisory.yml, handlers-postgres-integration.yml, weekly-platform-go.yml). FOLLOW-UP: after #2524 merges + main clean 3 days, flip continue-on-error -> false.
Fixture-catch proof: 6 pytest cases in tests/test_lint_setup_go_cache.py (cache:true, default-true+dep, bare default-true all caught; cache:false + no-setup-go clean) — all pass.
Guard class 1 of the cross-repo CI-bug-class lint set.
Skipping — genuine required-context failures (not just the SOP-ceremony gap).
lint-continue-on-error-tracking fails on the PR's own diff: .gitea/workflows/lint-setup-go-cache.yml line 64 has continue-on-error: true with no # mc#NNNN / # internal#NNNN tracker comment within 2 lines. Add a tracker ref to satisfy the 14-day renewal rule.
Ops Scripts Tests also fails (4 pytest tests in test_gitea_merge_queue/test_status_reaper hit empty /repos/// URLs → DNS error; env/owner-repo not set in the test). Plus the core full SOP gate is unacked (qa-review/security-review/all-items-acked pending). Reviewer-persona approvals do not flip the SOP statuses. Fix the lint tracker comment + ops-test env, then ack the checklist. Not forcing.
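The tracker rule described in this review can be sketched as a line-based check. This is an illustration only, not the lint-continue-on-error-tracking implementation: the function name `untracked_continue_on_error` is hypothetical, "within 2 lines" is assumed to mean two lines in either direction, and the 14-day renewal check (which needs the tracker's age) is left out.

```python
import re
from typing import List

# A continue-on-error: true directive on its own line (a trailing comment is allowed).
COE_RE = re.compile(r"^\s*continue-on-error:\s*true\b")
# Tracker comments of the form "# mc#NNNN" or "# internal#NNNN".
TRACKER_RE = re.compile(r"#\s*(?:mc|internal)#\d+")


def untracked_continue_on_error(text: str, window: int = 2) -> List[int]:
    """Return 1-based line numbers of directives with no tracker comment nearby."""
    lines = text.splitlines()
    missing = []
    for i, line in enumerate(lines):
        if not COE_RE.match(line):
            continue
        # Assumption: look `window` lines above and below, plus the line itself.
        lo, hi = max(0, i - window), min(len(lines), i + window + 1)
        if not any(TRACKER_RE.search(nearby) for nearby in lines[lo:hi]):
            missing.append(i + 1)
    return missing
```

Run over a workflow file's text, an empty list means every `continue-on-error: true` has a tracker reference close by; a non-empty list gives the line numbers to fix.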
re-approve rebased head (main merged for #2551 Ops-Scripts fix); change unchanged
New commits pushed, approval review dismissed automatically according to repository settings
core-qa re-approve on head 1e0de9c7 (prior approval dismissed by dismiss_stale on the main-merge push). Verified:
git merge origin/main resolved cleanly; the only conflict was handlers-postgres-integration.yml and BOTH sides kept cache: false, so the resolution preserves cache:false (the whole point of the guard).
Ran lint_setup_go_cache.py locally: "OK: every actions/setup-go step sets cache: false", exit 0.
# internal#881 (open, 0d old, <=14d). All 32 continue-on-error:true directives have valid trackers.
1e0de9c7: lint-setup-go-cache, lint-continue-on-error-tracking, Ops Scripts Tests, CI/all-required, lint-no-coe-on-required all green.
QA: PASS.
core-security re-approve on head 1e0de9c7 (prior approval dismissed by dismiss_stale on the main-merge push). Security review:
permissions: contents: read.
cache: false on the self-hosted bind-mount fleet, which is the integrity control (prevents corrupt restored GOCACHE), not a security regression.
Security: PASS.