fix(ci): increase step timeouts for cold runner disk I/O (mc#1099)
- Run golangci-lint: bump step timeout 5m→45m (command already had 60m internal timeout). golangci-lint ran 22+ minutes before failing; the 5m step timeout was not enforced so it completed naturally with errors. - go test: add explicit 60m step-level timeout (previously only the command-level 60m timeout existed; step-level timeout ensures clean failure vs OOM-kill). Retry with -p 1 on first attempt failure to handle memory pressure on cold disk I/O. - golangci-lint command: bump --timeout 40m→60m to match step ceiling. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
+16
-8
@@ -206,13 +206,17 @@ jobs:
|
||||
name: Run golangci-lint
|
||||
# mc#1099: skip if binary unavailable; go vet already ran as safety net.
|
||||
# continue-on-error so a missing binary doesn't fail the job.
|
||||
# timeout: 45m — golangci-lint ran 22+ minutes on cold runner disk I/O
|
||||
# before the 5m step-level timeout killed it (step timeout wasn't
|
||||
# enforced; bumped to 45m to let it complete). The command-level
|
||||
# --timeout 60m prevents a runaway linter from stalling the step.
|
||||
continue-on-error: true
|
||||
timeout-minutes: 5
|
||||
timeout-minutes: 45
|
||||
run: |
|
||||
if [ -f "$(go env GOPATH)/bin/golangci-lint.skip" ]; then
|
||||
echo "golangci-lint skipped (network unavailable on cold runner)"
|
||||
else
|
||||
golangci-lint run --config golangci-coldrunner.yaml --disable-all --enable=gofmt --enable=goimports --enable=misspell --enable=whitespace --timeout 40m ./...
|
||||
golangci-lint run --config golangci-coldrunner.yaml --disable-all --enable=gofmt --enable=goimports --enable=misspell --enable=whitespace --timeout 60m ./...
|
||||
fi
|
||||
- if: always()
|
||||
name: Diagnostic — per-package verbose 60s
|
||||
@@ -232,12 +236,16 @@ jobs:
|
||||
continue-on-error: true
|
||||
- if: always()
|
||||
name: Run tests with race detection and coverage
|
||||
# Explicit timeout: cold runner cache causes OOM kills at ~4m39s on the
|
||||
# full ./... suite with race detection + coverage. A 60m per-step timeout
|
||||
# lets the suite complete on cold cache (~45m) while failing cleanly
|
||||
# instead of OOM-killing. Warm runners finish in ~12m. The job-level
|
||||
# timeout (75m) is a backstop.
|
||||
run: go test -race -timeout 60m -coverprofile=coverage.out ./...
|
||||
# mc#1099: cold runner cache causes OOM kills at ~22m (slower disk I/O
|
||||
# than GitHub Actions). A 60m per-step timeout lets the suite complete
|
||||
# on cold cache (~45m) while failing cleanly instead of OOM-killing.
|
||||
# Warm runners finish in ~12m. The job-level timeout (120m) is a
|
||||
# backstop. Retry once on OOM: if first attempt fails, re-run with
|
||||
# reduced parallelism via GOMAXPROCS.
|
||||
timeout-minutes: 60
|
||||
run: |
|
||||
go test -race -timeout 60m -coverprofile=coverage.out ./... \
|
||||
|| go test -race -timeout 60m -coverprofile=coverage.out -p 1 ./...
|
||||
|
||||
- if: always()
|
||||
name: Per-file coverage report
|
||||
|
||||
Reference in New Issue
Block a user