The GitHub org Molecule-AI was suspended on 2026-05-06; canonical SCM is now Gitea at https://git.moleculesai.app/molecule-ai/. Stale github.com/Molecule-AI/... URLs return 404 and break tooling that clones / pip-installs / curls them. This bundles all non-Go-module URL fixes for this repo into a single PR. Go module path references (in *.go, go.mod, go.sum) are out of scope here -- tracked separately under Task #140. Token-auth clone URLs also flip ${GITHUB_TOKEN} -> ${GITEA_TOKEN} since the GitHub token does not auth against Gitea. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
6.4 KiB
Testing Strategy
Status: Policy. Update when tier definitions or thresholds change. Audience: Everyone writing or reviewing code in this repo. Cross-refs: backends.md, pr-hygiene.md, postmortem-2026-04-23-boot-event-401.md
The short version
- Don't chase 100% coverage. The last 15-20% costs as much as the first 80% and mostly adds brittle tests of trivial getters, error branches that can't fire, and stdlib wrappers.
- Different code classes have different floors. Auth at 80% is scarier than a DTO at 50%. Match the test investment to the risk.
- Tests should pay rent. A test that runs lines but asserts nothing meaningful isn't catching bugs — it's just dragging refactors down.
Tiered coverage floors
Every Go package, every TypeScript module, every Python module fits one of these tiers. The tier determines the minimum acceptable coverage — and the review standard.
| Tier | Examples | Line floor | Branch floor | Review standard |
|---|---|---|---|---|
| 1. Auth / secrets / crypto | tokens, session_auth, wsauth_middleware, crypto/envelope, cp_tenant_auth |
90% | 85% | Every branch tested. Adversarial scenarios (cross-tenant, expired token, null origin, malformed header). Timing considered. |
| 2. Handlers with side effects | workspace_provision, workspace_crud, container_files, terminal, registry |
75% | 70% | Happy + main error paths. DB mocks. Ownership / tenant-isolation checks. |
| 3. State machines + workers | scheduler, provisioner, healthsweep, orphan-sweeper, boot_ready |
75% | 70% | Every state transition tested, plus the transitions that shouldn't fire. |
| 4. Config / business logic | budget, orgtoken (validation), templates, derive-provider, redaction |
70% | 65% | Standard unit-test territory. Table-driven preferred. |
| 5. Plain DTOs / generated | models/*, proto-generated Go, TypeScript interfaces |
none | none | Writing tests here is theatre. Don't. |
| 6. CLI glue / cmd/* | cmd/server, cmd/molecli |
smoke only | — | Integration tests / E2E cover these. One startup-smoke test per binary. |
| 7. Third-party wrappers | awsapi, cloudflareapi, stripeapi, neonapi |
integration | — | Unit tests mock vendor shape, not behavior. Real behavior covered by staging integration. |
Why a blanket percentage is wrong
- A
models/package at 90% means you wrote tests forfunc (w Workspace) ID() string { return w.id }. No bugs caught, but coverage number is green. - A
tokenspackage at 75% means some rejection branch isn't covered. Maybe the exact branch that lets a revoked token still authenticate. - Blanket targets make the first case look equivalent to the second. They aren't.
Current state (as of 2026-04-23)
Run go test ./... -cover in each repo for up-to-date numbers. Snapshot:
workspace-server (Go)
| Package | Actual | Tier | Target | Gap |
|---|---|---|---|---|
internal/handlers/tokens.go |
0% | 1 | 90% | 90 |
internal/handlers/workspace_provision.go |
0% | 2 | 75% | 75 |
internal/middleware/wsauth_middleware.go |
~48% | 1 | 90% | 42 |
internal/provisioner |
45% | 3 | 75% | 30 |
internal/scheduler |
49% | 3 | 75% | 26 |
internal/channels |
40% | 4 | 70% | 30 |
internal/orgtoken |
88% | 4 | 70% | — |
internal/crypto |
91% | 1 | 90% | — |
internal/supervised |
93% | 3 | 75% | — |
internal/plugins |
94% | 4 | 70% | — |
internal/envx |
100% | 5 | none | — |
molecule-controlplane (Go)
| Package | Actual | Tier | Target | Gap |
|---|---|---|---|---|
internal/awsapi |
18% | 7 | integration | — |
internal/provisioner |
48% | 3 | 75% | 27 |
internal/handlers |
60% | 2 | 75% | 15 |
internal/billing |
60% | 4 | 70% | 10 |
internal/crypto |
68-80% | 1 | 90% | 10-22 |
internal/auth |
96% | 1 | 90% | — |
internal/middleware |
97% | 1 | 90% | — |
internal/reserved |
100% | 5 | none | — |
internal/httpx |
100% | 4 | 70% | — |
canvas (TypeScript)
No coverage instrumentation today. 900 tests / 58 files pass, but coverage isn't measured. See issue #1815 for the fix: set a 70% line floor in vitest.config.ts and gate CI on it.
workspace (Python)
No pytest/coverage config. See issue #1818: set up pytest-cov with --cov-fail-under=75 (ratchet from current baseline over 2-3 weeks).
Writing a good test
A good test:
- Asserts a specific outcome, not that a function runs without error.
- Covers the exact branch that bugs would live in — cross-tenant access, revoked-but-cached token, race on state transition.
- Uses table-driven patterns when the code is a dispatch with N cases. One test row per case.
- Mocks at system boundaries (DB, HTTP, time), not at internal package boundaries.
- Survives refactors — tests behavior, not internal state.
A bad test:
- Tests a getter that just returns a field.
- Mocks the function under test itself.
- Relies on
time.Sleepor clock timing to assert order. - Asserts
nil == nilto boost coverage.
Enforcement
CI gates
- Go:
go test ./... -cover+ a pre-commit script that compares coverage to.coverage-baselineand fails on drops > 2 points in a tier-1 package. - TypeScript:
vitest --coveragewith thresholds invitest.config.ts. Fails CI if below. - Python:
pytest --cov-fail-under=75in the Python CI job.
Review expectations
- Any PR touching a tier-1 package that lowers its coverage needs an explicit reviewer sign-off and justification.
- New code should arrive at or above its tier's floor.
- Untested files in tier-1 or tier-2 should be flagged in review, not waved through.
Related
- Issue #1821 — policy tracking issue
- Issue #1815 — Canvas coverage instrumentation
- Issue #1818 — Python pytest-cov
- Issue #1814 — workspace_provision_test.go unblock
- Issue #1816 — tokens.go coverage
- Issue #1819 — wsauth_middleware coverage