6a9c4fb89b
RCA #1769 Finding 1: add local invariant rationale to lint/type suppressions that lack a local explanation. - sop-checklist.py:640: import yaml — type: ignore[import-not-found] justified: yaml is optional dep; fallback _load_config_minimal covers the same shape, so the ignore is safe when dep absent. - sop-checklist.py:660: _parse_minimal_yaml — noqa: C901 replaced with docstring note: function is necessarily long (finite- state YAML subset parser); no utility refactor meaningfully reduces length; all branches tested in test_parse_minimal_yaml.py. - sop-checklist.py:1030,1037: client._req / _team_id_cache — noqa: SLF001 justified inline: _req is an internal helper called from loop context in the caller; _team_id_cache is a write-through cache. - check_migration_collisions.py:94: urlopen — noqa: S310 justified inline: this function IS the outbound HTTP client for Gitea API calls; the call is intentional and controlled; timeout=20s prevents indefinite hangs. wheel_smoke.py F401 suppressions are intentionally excluded: the module docstring documents the regression class (0.1.16 main_sync incident) and each `# noqa: F401` is paired with an `assert callable()` that validates the name is present at runtime. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
scripts/
Operational and one-off scripts for molecule-core. Most are self-documenting — see the header comments in each file.
RFC #2251 coordinator task-bound harnesses
There are three related scripts; pick the right one:
| Script | Purpose | Targets |
|---|---|---|
measure-coordinator-task-bounds.sh |
Canonical v1 harness for the RFC #2251 / Issue 4 reproduction. Provisions a PM coordinator + Researcher child via claude-code-default + claude-code templates, sends a synthesis-heavy A2A kickoff, observes elapsed time + activity trace. |
OSS-shape platform — localhost or any /workspaces-shaped endpoint. Has tenant/admin-token guards for non-localhost runs. |
measure-coordinator-task-bounds-runner.sh |
Generalised runner for the same measurement contract but with arbitrary template + secret + model combinations (Hermes/MiniMax, etc.). Useful for cross-runtime variants without modifying the canonical harness. | Same as above (local or SaaS via MODE=saas). |
measure-coordinator-task-bounds.sh (in molecule-controlplane) |
Production-shape variant that bootstraps a real staging tenant via POST /cp/admin/orgs, then runs the same measurement against <slug>.staging.moleculesai.app. |
Staging controlplane only — refuses to run against production. |
See reference_harness_pair_pattern (auto-memory) for when to use which
and the cross-repo design rationale.
Common safety pattern across all three
- Cleanup trap on EXIT/INT/TERM auto-deletes provisioned resources.
DRY_RUN=1prints plan + auth fingerprint, exits before any state mutation. Run this before pointing at staging or any shared infrastructure.- Non-target guard refuses arbitrary endpoints (the controlplane
variant is locked to
staging-api.moleculesai.app; the OSS variant requires explicit auth + tenant scoping for non-localhost PLATFORM). - Cleanup failures emit
cleanup_*_failedevents with remediation hints; no silenced curl. ADMIN_TOKEN expiring mid-run surfaces as a structured event rather than a silent leak.
Activity trace caveat
If activity_trace.raw == "<endpoint_unavailable>", the per-workspace
/activity endpoint isn't wired on the target build — the bound
measurement is INCONCLUSIVE on the platform-ceiling question. Either
wire the endpoint or replace with the equivalent Datadog query. Note
that /activity accepts a since_secs query parameter; see the
endpoint handler for the supported range.
Other scripts
cleanup-rogue-workspaces.sh— emergency teardown for leaked workspaces. Prompts for confirmation. Pair with the harnesses if a cleanup trap fails (seecleanup_*_failedevents).staging-smoke.sh— quick smoke test for the staging canary fleet (formerlycanary-smoke.sh).dev-start.sh— local-dev platform bring-up.
The rest are self-documenting in their header comments.