Testing Strategy

Status: Policy. Update when tier definitions or thresholds change.
Audience: Everyone writing or reviewing code in this repo.
Cross-refs: backends.md, pr-hygiene.md, postmortem-2026-04-23-boot-event-401.md

The short version

Don't chase 100% coverage. The last 15-20% costs as much as the first 80% and mostly adds brittle tests of trivial getters, error branches that can't fire, and stdlib wrappers.
Different code classes have different floors. Auth at 80% is scarier than a DTO at 50%. Match the test investment to the risk.
Tests should pay rent. A test that runs lines but asserts nothing meaningful isn't catching bugs — it's just dragging refactors down.

Tiered coverage floors

Every Go package, every TypeScript module, every Python module fits one of these tiers. The tier determines the minimum acceptable coverage — and the review standard.

Tier	Examples	Line floor	Branch floor	Review standard
1. Auth / secrets / crypto	`tokens`, `session_auth`, `wsauth_middleware`, `crypto/envelope`, `cp_tenant_auth`	90%	85%	Every branch tested. Adversarial scenarios (cross-tenant, expired token, null origin, malformed header). Timing considered.
2. Handlers with side effects	`workspace_provision`, `workspace_crud`, `container_files`, `terminal`, `registry`	75%	70%	Happy + main error paths. DB mocks. Ownership / tenant-isolation checks.
3. State machines + workers	`scheduler`, `provisioner`, `healthsweep`, `orphan-sweeper`, `boot_ready`	75%	70%	Every state transition tested, plus the transitions that shouldn't fire.
4. Config / business logic	`budget`, `orgtoken` (validation), `templates`, `derive-provider`, `redaction`	70%	65%	Standard unit-test territory. Table-driven preferred.
5. Plain DTOs / generated	`models/*`, proto-generated Go, TypeScript interfaces	none	none	Writing tests here is theatre. Don't.
6. CLI glue / cmd/*	`cmd/server`, `cmd/molecli`	smoke only	—	Integration tests / E2E cover these. One startup-smoke test per binary.
7. Third-party wrappers	`awsapi`, `cloudflareapi`, `stripeapi`, `neonapi`	integration	—	Unit tests mock vendor shape, not behavior. Real behavior covered by staging integration.
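As a rough illustration of how the numeric floors in the table could be checked mechanically, here is a minimal Go sketch. The `Floor` type, the `tierFloors` map, and `MeetsFloor` are hypothetical names, not code from this repo, and the branch figure is simply taken as an input because this policy does not say how branch coverage is measured.

```go
package main

import "fmt"

// Floor holds the minimum line and branch coverage (in percent) for one tier.
// Hypothetical type, for illustration only.
type Floor struct {
	Line, Branch float64
}

// tierFloors mirrors the numeric floors in the table above. Tiers 5-7 have no
// percentage floor (none / smoke only / integration), so they are omitted.
var tierFloors = map[int]Floor{
	1: {Line: 90, Branch: 85},
	2: {Line: 75, Branch: 70},
	3: {Line: 75, Branch: 70},
	4: {Line: 70, Branch: 65},
}

// MeetsFloor reports whether measured coverage satisfies the tier's floors.
// Tiers without a numeric floor always pass.
func MeetsFloor(tier int, line, branch float64) bool {
	f, ok := tierFloors[tier]
	if !ok {
		return true
	}
	return line >= f.Line && branch >= f.Branch
}

func main() {
	fmt.Println(MeetsFloor(1, 91, 86)) // tier 1, above both floors
	fmt.Println(MeetsFloor(1, 80, 90)) // tier 1, below the line floor
	fmt.Println(MeetsFloor(5, 0, 0))   // tier 5 has no numeric floor
}
```

Keeping the floors as data rather than scattering thresholds through scripts means a tier change is a one-line edit that stays in step with the table.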

Why a blanket percentage is wrong

A models/ package at 90% means you wrote tests for `func (w Workspace) ID() string { return w.id }`. No bugs caught, but the coverage number is green.
A tokens package at 75% means some rejection branch isn't covered. Maybe the exact branch that lets a revoked token still authenticate.
Blanket targets make the first case look equivalent to the second. They aren't.

Current state (as of 2026-04-23)

Run `go test ./... -cover` in each repo for up-to-date numbers. Snapshot:

workspace-server (Go)

Package or file	Actual	Tier	Target	Gap (points)
`internal/handlers/tokens.go`	0%	1	90%	90
`internal/handlers/workspace_provision.go`	0%	2	75%	75
`internal/middleware/wsauth_middleware.go`	~48%	1	90%	42
`internal/provisioner`	45%	3	75%	30
`internal/scheduler`	49%	3	75%	26
`internal/channels`	40%	4	70%	30
`internal/orgtoken`	88%	4	70%	—
`internal/crypto`	91%	1	90%	—
`internal/supervised`	93%	3	75%	—
`internal/plugins`	94%	4	70%	—
`internal/envx`	100%	5	none	—

molecule-controlplane (Go)

Package	Actual	Tier	Target	Gap (points)
`internal/awsapi`	18%	7	integration	—
`internal/provisioner`	48%	3	75%	27
`internal/handlers`	60%	2	75%	15
`internal/billing`	60%	4	70%	10
`internal/crypto`	68-80%	1	90%	10-22
`internal/auth`	96%	1	90%	—
`internal/middleware`	97%	1	90%	—
`internal/reserved`	100%	5	none	—
`internal/httpx`	100%	4	70%	—

canvas (TypeScript)

No coverage instrumentation today. 900 tests across 58 files pass, but coverage isn't measured. See issue #1815 for the fix: set a 70% line floor in `vitest.config.ts` and gate CI on it.

workspace (Python)

No pytest/coverage config. See issue #1818: set up pytest-cov with `--cov-fail-under=75`, ratcheting up from the current baseline over 2-3 weeks.

Writing a good test

A good test:

Asserts a specific outcome, not that a function runs without error.
Covers the exact branch that bugs would live in — cross-tenant access, revoked-but-cached token, race on state transition.
Uses table-driven patterns when the code is a dispatch with N cases. One test row per case.
Mocks at system boundaries (DB, HTTP, time), not at internal package boundaries.
Survives refactors — tests behavior, not internal state.

A bad test:

Tests a getter that just returns a field.
Mocks the function under test itself.
Relies on `time.Sleep` or clock timing to assert order.
Asserts `nil == nil` to boost coverage.

Enforcement

CI gates

Go: `go test ./... -cover` plus a pre-commit script that compares coverage to `.coverage-baseline` and fails on drops of more than 2 points in a tier-1 package.
TypeScript: `vitest --coverage` with thresholds in `vitest.config.ts`. CI fails if coverage falls below them.
Python: `pytest --cov-fail-under=75` in the Python CI job.

Review expectations

Any PR touching a tier-1 package that lowers its coverage needs an explicit reviewer sign-off and justification.
New code should arrive at or above its tier's floor.
Untested files in tier-1 or tier-2 should be flagged in review, not waved through.

Related

Issue #1821 — policy tracking issue
Issue #1815 — Canvas coverage instrumentation
Issue #1818 — Python pytest-cov
Issue #1814 — workspace_provision_test.go unblock
Issue #1816 — tokens.go coverage
Issue #1819 — wsauth_middleware coverage
