molecule-ai/molecule-core

2026-05-07 22:37:42 +00:00

Testing Strategy

Status: Policy. Update when tier definitions or thresholds change.
Audience: Everyone writing or reviewing code in this repo.
Cross-refs: backends.md, pr-hygiene.md, postmortem-2026-04-23-boot-event-401.md

The short version

Don't chase 100% coverage. The last 15-20% costs as much as the first 80% and mostly adds brittle tests of trivial getters, error branches that can't fire, and stdlib wrappers.
Different code classes have different floors. Auth at 80% is scarier than a DTO at 50%. Match the test investment to the risk.
Tests should pay rent. A test that runs lines but asserts nothing meaningful isn't catching bugs — it's just dragging refactors down.

Tiered coverage floors

Every Go package, every TypeScript module, every Python module fits one of these tiers. The tier determines the minimum acceptable coverage — and the review standard.

Tier	Examples	Line floor	Branch floor	Review standard
1. Auth / secrets / crypto	`tokens`, `session_auth`, `wsauth_middleware`, `crypto/envelope`, `cp_tenant_auth`	90%	85%	Every branch tested. Adversarial scenarios (cross-tenant, expired token, null origin, malformed header). Timing considered.
2. Handlers with side effects	`workspace_provision`, `workspace_crud`, `container_files`, `terminal`, `registry`	75%	70%	Happy + main error paths. DB mocks. Ownership / tenant-isolation checks.
3. State machines + workers	`scheduler`, `provisioner`, `healthsweep`, `orphan-sweeper`, `boot_ready`	75%	70%	Every state transition tested, plus the transitions that shouldn't fire.
4. Config / business logic	`budget`, `orgtoken` (validation), `templates`, `derive-provider`, `redaction`	70%	65%	Standard unit-test territory. Table-driven preferred.
5. Plain DTOs / generated	`models/*`, proto-generated Go, TypeScript interfaces	none	none	Writing tests here is theatre. Don't.
6. CLI glue / cmd/*	`cmd/server`, `cmd/molecli`	smoke only	—	Integration tests / E2E cover these. One startup-smoke test per binary.
7. Third-party wrappers	`awsapi`, `cloudflareapi`, `stripeapi`, `neonapi`	integration	—	Unit tests mock vendor shape, not behavior. Real behavior covered by staging integration.
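
As a rough illustration of how the floors could be treated as data rather than prose, the sketch below encodes the line and branch floors from the table and computes how far a package sits below its tier's line floor. The names `Floor`, `floors`, and `LineGap` are hypothetical and do not exist in the repo; tiers without a numeric floor are encoded as -1 purely for this example.

```go
package main

import "fmt"

// Floor holds the minimum line and branch coverage for a tier, in percent.
// A negative value means the tier has no numeric floor.
// (Hypothetical type, for illustration only.)
type Floor struct {
	Line, Branch float64
}

// floors mirrors the tier table. Tiers 5-7 have no numeric floor
// (none / smoke only / integration), so they are encoded as -1.
var floors = map[int]Floor{
	1: {90, 85},
	2: {75, 70},
	3: {75, 70},
	4: {70, 65},
	5: {-1, -1},
	6: {-1, -1},
	7: {-1, -1},
}

// LineGap returns how many points a package is below its tier's line floor,
// or 0 when the floor is met, the tier has no floor, or the tier is unknown.
func LineGap(tier int, actual float64) float64 {
	f, ok := floors[tier]
	if !ok || f.Line < 0 || actual >= f.Line {
		return 0
	}
	return f.Line - actual
}

func main() {
	fmt.Println(LineGap(1, 48)) // tier-1 package at 48%: prints 42
	fmt.Println(LineGap(4, 88)) // tier-4 package above its floor: prints 0
	fmt.Println(LineGap(7, 18)) // tier-7 has no numeric floor: prints 0
}
```

Keeping the floors in one table like this means a threshold change is a one-line diff, which matches the "update when tier definitions or thresholds change" status of this policy.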

Why a blanket percentage is wrong

A models/ package at 90% means you wrote tests for func (w Workspace) ID() string { return w.id }. No bugs caught, but the coverage number is green.
A tokens package at 75% means some rejection branch isn't covered. Maybe the exact branch that lets a revoked token still authenticate.
Blanket targets make the first case look equivalent to the second. They aren't.

Current state (as of 2026-04-23)

Run go test ./... -cover in each repo for up-to-date numbers. Snapshot:

workspace-server (Go)

Package	Actual	Tier	Target	Gap
`internal/handlers/tokens.go`	0%	1	90%	90
`internal/handlers/workspace_provision.go`	0%	2	75%	75
`internal/middleware/wsauth_middleware.go`	~48%	1	90%	42
`internal/provisioner`	45%	3	75%	30
`internal/scheduler`	49%	3	75%	26
`internal/channels`	40%	4	70%	30
`internal/orgtoken`	88%	4	70%	—
`internal/crypto`	91%	1	90%	—
`internal/supervised`	93%	3	75%	—
`internal/plugins`	94%	4	70%	—
`internal/envx`	100%	5	none	—

molecule-controlplane (Go)

Package	Actual	Tier	Target	Gap
`internal/awsapi`	18%	7	integration	—
`internal/provisioner`	48%	3	75%	27
`internal/handlers`	60%	2	75%	15
`internal/billing`	60%	4	70%	10
`internal/crypto`	68-80%	1	90%	10-22
`internal/auth`	96%	1	90%	—
`internal/middleware`	97%	1	90%	—
`internal/reserved`	100%	5	none	—
`internal/httpx`	100%	4	70%	—

canvas (TypeScript)

No coverage instrumentation today. 900 tests / 58 files pass, but coverage isn't measured. See issue #1815 for the fix: set a 70% line floor in vitest.config.ts and gate CI on it.

workspace (Python)

No pytest/coverage config. See issue #1818: set up pytest-cov with --cov-fail-under=75 (ratchet from current baseline over 2-3 weeks).

Writing a good test

A good test:

Asserts a specific outcome, not that a function runs without error.
Covers the exact branch that bugs would live in — cross-tenant access, revoked-but-cached token, race on state transition.
Uses table-driven patterns when the code is a dispatch with N cases. One test row per case.
Mocks at system boundaries (DB, HTTP, time), not at internal package boundaries.
Survives refactors — tests behavior, not internal state.

A bad test:

Tests a getter that just returns a field.
Mocks the function under test itself.
Relies on time.Sleep or clock timing to assert order.
Asserts nil == nil to boost coverage.

Enforcement

CI gates

Go: go test ./... -cover + a pre-commit script that compares coverage to .coverage-baseline and fails on drops > 2 points in a tier-1 package.
TypeScript: vitest --coverage with thresholds in vitest.config.ts. Fails CI if below.
Python: pytest --cov-fail-under=75 in the Python CI job.
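
The Go gate above could be approximated along these lines. This is a sketch under stated assumptions, not the repo's actual script: the on-disk format of `.coverage-baseline` is not described here, so the baseline is passed in as a plain map, and `ParseCover` and `Regressions` are hypothetical names. The parser relies only on the standard `coverage: N% of statements` suffix that `go test -cover` prints per package.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// coverLine matches per-package lines printed by `go test ./... -cover`, e.g.
// "ok  	example.test/internal/crypto	0.012s	coverage: 88.0% of statements".
var coverLine = regexp.MustCompile(`^ok\s+(\S+)\s+.*coverage: ([0-9.]+)% of statements`)

// ParseCover extracts package -> coverage percent from go test output.
func ParseCover(output string) map[string]float64 {
	cov := make(map[string]float64)
	for _, line := range strings.Split(output, "\n") {
		m := coverLine.FindStringSubmatch(line)
		if m == nil {
			continue
		}
		if pct, err := strconv.ParseFloat(m[2], 64); err == nil {
			cov[m[1]] = pct
		}
	}
	return cov
}

// Regressions reports tier-1 packages whose coverage fell by more than
// maxDrop points relative to the baseline. A tier-1 package missing from
// the current run counts as 0% and is therefore flagged.
func Regressions(baseline, current map[string]float64, tier1 []string, maxDrop float64) []string {
	var out []string
	for _, pkg := range tier1 {
		base, ok := baseline[pkg]
		if !ok {
			continue // no baseline recorded yet, nothing to compare
		}
		if drop := base - current[pkg]; drop > maxDrop {
			out = append(out, fmt.Sprintf("%s: %.1f%% -> %.1f%%", pkg, base, current[pkg]))
		}
	}
	return out
}

func main() {
	// Package paths below are placeholders, not the repo's module path.
	out := "ok  \texample.test/internal/crypto\t0.012s\tcoverage: 88.0% of statements\n" +
		"ok  \texample.test/internal/envx\t0.003s\tcoverage: 100.0% of statements\n"
	baseline := map[string]float64{"example.test/internal/crypto": 91.0}
	tier1 := []string{"example.test/internal/crypto"}

	for _, r := range Regressions(baseline, ParseCover(out), tier1, 2) {
		fmt.Println("FAIL", r) // prints "FAIL example.test/internal/crypto: 91.0% -> 88.0%"
	}
}
```

Comparing against a stored baseline rather than the tier floor alone lets the gate catch erosion in packages that are still above their floor.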

Review expectations

Any PR touching a tier-1 package that lowers its coverage needs an explicit reviewer sign-off and justification.
New code should arrive at or above its tier's floor.
Untested files in tier-1 or tier-2 should be flagged in review, not waved through.

Related

Issue #1821 — policy tracking issue
Issue #1815 — Canvas coverage instrumentation
Issue #1818 — Python pytest-cov
Issue #1814 — workspace_provision_test.go unblock
Issue #1816 — tokens.go coverage
Issue #1819 — wsauth_middleware coverage
