Two docs covering load-bearing patterns from today's work that weren't previously discoverable: 1. workspace/platform_tools/README.md — explains the ToolSpec single-source-of-truth pattern (#2240), the CLI-block alignment gap that hand-maintained generation can't close (#2258), the snapshot golden files + LF-pinning (#2260), and the add/rename/ remove playbook. The next reader who lands in workspace/platform_tools/ now has the design rationale + the safe-edit procedure colocated with the code. 2. scripts/README.md — disambiguates the three measure-coordinator- task-bounds.sh files that now exist across two repos: - scripts/measure-coordinator-task-bounds.sh (canonical OSS, this repo) - scripts/measure-coordinator-task-bounds-runner.sh (Hermes/MiniMax variant, this repo) - scripts/measure-coordinator-task-bounds.sh (production-shape, in molecule-controlplane) Cross-references reference_harness_pair_pattern (auto-memory) for the cross-repo design rationale. Documents the common safety pattern (cleanup trap, DRY_RUN, non-target guard, cleanup_*_failed events) and the heartbeat-trace caveat. Refs: #2240, #2254, #2257, #2258, #2259, #2260; molecule-controlplane#321. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
48 lines
2.8 KiB
Markdown
48 lines
2.8 KiB
Markdown
# scripts/
|
|
|
|
Operational and one-off scripts for molecule-core. Most are
|
|
self-documenting — see the header comments in each file.
|
|
|
|
## RFC #2251 coordinator task-bound harnesses
|
|
|
|
There are three related scripts; pick the right one:
|
|
|
|
| Script | Purpose | Targets |
|
|
|---|---|---|
|
|
| `measure-coordinator-task-bounds.sh` | **Canonical** v1 harness for the RFC #2251 / Issue 4 reproduction. Provisions a PM coordinator + Researcher child via `claude-code-default` + `langgraph` templates, sends a synthesis-heavy A2A kickoff, observes elapsed time + heartbeat trace. | OSS-shape platform — localhost or any `/workspaces`-shaped endpoint. Has tenant/admin-token guards for non-localhost runs. |
|
|
| `measure-coordinator-task-bounds-runner.sh` | Generalised runner for the same measurement contract but with **arbitrary template + secret + model combinations** (Hermes/MiniMax, etc.). Useful for cross-runtime variants without modifying the canonical harness. | Same as above (local or SaaS via `MODE=saas`). |
|
|
| `measure-coordinator-task-bounds.sh` (in [molecule-controlplane](https://github.com/Molecule-AI/molecule-controlplane)) | **Production-shape** variant that bootstraps a real staging tenant via `POST /cp/admin/orgs`, then runs the same measurement against `<slug>.staging.moleculesai.app`. | Staging controlplane only — refuses to run against production. |
|
|
|
|
See `reference_harness_pair_pattern` (auto-memory) for when to use which
|
|
and the cross-repo design rationale.
|
|
|
|
### Common safety pattern across all three
|
|
|
|
- **Cleanup trap** on EXIT/INT/TERM auto-deletes provisioned resources.
|
|
- **`DRY_RUN=1`** prints plan + auth fingerprint, exits before any
|
|
state mutation. Run this before pointing at staging or any shared
|
|
infrastructure.
|
|
- **Non-target guard** refuses arbitrary endpoints (the controlplane
|
|
variant is locked to `staging-api.moleculesai.app`; the OSS variant
|
|
requires explicit auth + tenant scoping for non-localhost PLATFORM).
|
|
- **Cleanup failures emit `cleanup_*_failed` events** with remediation
|
|
hints; no silenced curl. ADMIN_TOKEN expiring mid-run surfaces as a
|
|
structured event rather than a silent leak.
|
|
|
|
### Heartbeat trace caveat
|
|
|
|
If `heartbeat_trace.raw == "<endpoint_unavailable>"`, the per-workspace
|
|
`/heartbeat-history` endpoint isn't wired on the target build — the
|
|
bound measurement is INCONCLUSIVE on the platform-ceiling question.
|
|
Either wire the endpoint or replace with the equivalent Datadog query.
|
|
|
|
## Other scripts
|
|
|
|
- `cleanup-rogue-workspaces.sh` — emergency teardown for leaked
|
|
workspaces. Prompts for confirmation. Pair with the harnesses if a
|
|
cleanup trap fails (see `cleanup_*_failed` events).
|
|
- `canary-smoke.sh` — quick smoke test for canary releases.
|
|
- `dev-start.sh` — local-dev platform bring-up.
|
|
|
|
The rest are self-documenting in their header comments.
|