## Problem External audit flagged critical security-path files at 0% coverage: - workspace-server/handlers/tokens.go 0% (target 90%+) - workspace-server/handlers/workspace_provision 0% (target 75%+) - workspace-server/middleware/wsauth ~48% (target 90%+) Tests *exist* for these files (tokens_test.go is 200 lines, workspace_ provision_test.go is 1138 lines) — they just don't exercise the critical branches where auth/provisioning decisions happen. CI's existing coverage step measured total coverage (floor 25%) but never checked per-file, so any single file could drop to 0% and CI stayed green. ## Fix — Layer 1 of #1823 (strictly additive) 1. **Per-file coverage report** — advisory step prints every source file with its coverage, sorted worst-first. Reviewers see the gap at a glance. Does not fail the build. 2. **Critical-path per-file gate** — if any non-test source file in a security-sensitive directory (tokens, workspace_provision, a2a_proxy, registry, secrets, wsauth, crypto) has coverage ≤10%, CI fails with a specific error message pointing at the file + #1823. 3. **Unchanged: total floor stays at 25%** — ratcheting is a separate PR so this one has zero risk of breaking existing coverage. Ratchet plan lives in COVERAGE_FLOOR.md (monthly schedule through Oct 2026 to reach 70% total / 70% critical). ## Why this specifically "Tell devs to write tests" doesn't fix this — the prompts already require tests ("Write tests for every handler, every query, every edge case"), and the engineers mostly do. The gap is mechanical: CI generates coverage.out and throws it away without checking per-file distribution. This gate makes "no untested security path merges" a property of the CI, not a property of QA agents who (as of today's incident) can go phantom- busy for hours. ## Smoke test Local awk-logic verification with synthetic coverage.out: - tokens.go at 2.5% (critical path, ≤10%) → correctly FAILS - noncritical.go at 0.0% (not in critical list) → correctly PASSES - wsauth_middleware.go at 65% (critical, above 10%) → correctly PASSES - crypto/kek.go at 85% (critical, above 10%) → correctly PASSES Regex bug caught and fixed: go tool cover -func emits file.go:LINE.COL:FUNC PERCENT The stripper needed :[0-9]+\..* not :[0-9]+:.* ## Follow-up (not in this PR) - Layer 2 (issue #1823): per-changed-file delta gate via diff-cover, enforcing the prompt rule ">80% on changed files" - Add these two new steps to branch protection required checks - Canvas (Next.js) equivalent with vitest --coverage + threshold Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
79 lines
3.0 KiB
Markdown
79 lines
3.0 KiB
Markdown
# Coverage Floor
|
|
|
|
CI enforces three coverage gates on `workspace-server` (Go). All defined in
|
|
`.github/workflows/ci.yml` → `platform-build` job.
|
|
|
|
## Current floors (2026-04-23)
|
|
|
|
| Gate | Threshold | What fails |
|
|
|---|---|---|
|
|
| **Total floor** | `25%` | `go tool cover -func` reports total below floor |
|
|
| **Critical-path per-file floor** | `10%` | Any non-test source file in a security-critical path with coverage ≤10% |
|
|
| **Per-file report** | advisory | Printed in CI log, sorted worst-first, does not fail |
|
|
|
|
Total floor starts at 25% (unchanged from pre-#1823 to keep this PR strictly
|
|
additive). The new protection is the critical-path per-file floor, which
|
|
directly closes the gap that prompted the issue. Ratchet plan below begins
|
|
the month after to let the team first observe the gate in action.
|
|
|
|
## Security-critical paths (Gate 2)
|
|
|
|
Changes to these paths have historically introduced security issues (CWE-22,
|
|
CWE-78, KI-005, SSRF) or billing/auth risk. Coverage must not drop to zero.
|
|
|
|
- `internal/handlers/tokens*`
|
|
- `internal/handlers/workspace_provision*`
|
|
- `internal/handlers/a2a_proxy*`
|
|
- `internal/handlers/registry*`
|
|
- `internal/handlers/secrets*`
|
|
- `internal/middleware/wsauth*`
|
|
- `internal/crypto*`
|
|
|
|
## Ratchet plan
|
|
|
|
Floor ratchets upward on a fixed cadence. Any ratchet is a PR — reviewable,
|
|
reversible, and creates history. The table below is the intended schedule.
|
|
|
|
| Date | Total floor | Critical-path floor | Notes |
|
|
|---|---|---|---|
|
|
| 2026-04-23 | 25% | 10% | Initial gate (this file). |
|
|
| 2026-05-23 | 30% | 20% | First ratchet |
|
|
| 2026-06-23 | 40% | 30% | |
|
|
| 2026-07-23 | 50% | 40% | |
|
|
| 2026-08-23 | 55% | 50% | |
|
|
| 2026-09-23 | 60% | 60% | |
|
|
| 2026-10-23 | 70% | 70% | Target steady-state |
|
|
|
|
The target end-state matches the per-role QA prompts which specify
|
|
"coverage >80% on changed files". CI enforces the floor; reviewers still
|
|
enforce the per-PR bar.
|
|
|
|
## Exceptions
|
|
|
|
If a critical-path file genuinely cannot have coverage above the floor (e.g.
|
|
thin wrapper around a third-party SDK with no branches to test), add an entry
|
|
here with:
|
|
|
|
1. **File**: `internal/handlers/example.go`
|
|
2. **Reason**: Why coverage can't hit the floor
|
|
3. **Tracking issue**: GitHub issue for the real fix
|
|
4. **Expiry**: 14 days from entry date; after expiry either coverage is fixed
|
|
or the issue is closed as "accepted technical debt"
|
|
|
|
### Active exceptions
|
|
|
|
*(none — add here if you need to land code that legitimately can't clear the floor)*
|
|
|
|
## Why this gate exists
|
|
|
|
Issue #1823: an external audit found critical files at 0% coverage despite
|
|
test files existing with hundreds of lines. The existing CI step measured
|
|
coverage but didn't enforce a meaningful threshold. Any file could go from
|
|
80% → 0% and CI stayed green, because the single gate (total ≥25%) ignored
|
|
per-file distribution.
|
|
|
|
This gate makes "no untested critical paths merged" a mechanical property of
|
|
the CI, not a behavioural property of QA agents or individual reviewers —
|
|
which is the only way to make it survive fleet outages, agent rotations, or
|
|
QA process changes.
|