Two PRs targeting staging can each add a migration with the same numeric prefix (e.g. 044_*.up.sql). Each passes CI independently. They collide at merge time. Worst case: second migration silently doesn't apply and prod schema drifts from what the code expects. Caught manually 2026-04-30 during PR #2276 rebase: 044_runtime_image_pins collided with 044_platform_inbound_secret from RFC #2312. This workflow makes that detection automatic at PR-open time. How it works: scripts/ops/check_migration_collisions.py runs on every PR that touches workspace-server/migrations/**. For each new/modified migration filename, extracts the numeric prefix and checks: 1. Does the base branch already have a DIFFERENT migration file with the same prefix? (PR branched off an old base, base advanced and another PR landed the same number — needs rebase.) 2. Is another OPEN PR (not this one) also adding a migration with the same prefix? (Race-window collision — both pass CI separately, would collide at merge time.) Either case → exit 1 with a clear ::error:: message naming the conflicting PR(s) so the author knows what to renumber. Implementation notes: - Uses git ls-tree (not working-tree walk) so it works against any base ref without checkout. - Uses gh pr diff --name-only per open PR, bounded by `gh pr list --limit 100`. ~30s worst case for a busy repo, <5s normally. - --diff-filter=AM picks up Added or Modified — renaming a migration in place is also flagged (intentional; renaming migrations isn't safe). - Same filename in both PR and base = no collision (PR is editing in-place, fine). Tests: scripts/ops/test_check_migration_collisions.py — 9 cases on the regex classifier (the load-bearing piece). End-to-end git/gh path is exercised by running the workflow against real PRs. Hard-gates Tier 1 item 1 (#2341). Cheapest, cleanest gate. Catches one specific class of merge-time foot-gun automatically. Refs hard-gates discussion 2026-04-30. Tier 1 of 4 (others tracked in #2342, #2343, #2344). |
||
|---|---|---|
| .. | ||
| ops | ||
| build_runtime_package.py | ||
| build-images.sh | ||
| bundle-compile.sh | ||
| canary-smoke.sh | ||
| cleanup-rogue-workspaces.sh | ||
| clone-manifest.sh | ||
| dev-start.sh | ||
| import-agent.sh | ||
| lockdown-tenant-sg.sh | ||
| measure-coordinator-task-bounds-runner.sh | ||
| measure-coordinator-task-bounds.sh | ||
| nuke-and-rebuild.sh | ||
| post-rebuild-setup.sh | ||
| README.md | ||
| refresh-workspace-images.sh | ||
| rollback-latest.sh | ||
| test-a2a-cross-runtime.sh | ||
| test-all-adapters.sh | ||
| test-all.sh | ||
| test-cross-agent-chat.sh | ||
| test-nuke-and-rebuild.sh | ||
| test-team-e2e.sh | ||
scripts/
Operational and one-off scripts for molecule-core. Most are self-documenting — see the header comments in each file.
RFC #2251 coordinator task-bound harnesses
There are three related scripts; pick the right one:
| Script | Purpose | Targets |
|---|---|---|
measure-coordinator-task-bounds.sh |
Canonical v1 harness for the RFC #2251 / Issue 4 reproduction. Provisions a PM coordinator + Researcher child via claude-code-default + langgraph templates, sends a synthesis-heavy A2A kickoff, observes elapsed time + activity trace. |
OSS-shape platform — localhost or any /workspaces-shaped endpoint. Has tenant/admin-token guards for non-localhost runs. |
measure-coordinator-task-bounds-runner.sh |
Generalised runner for the same measurement contract but with arbitrary template + secret + model combinations (Hermes/MiniMax, etc.). Useful for cross-runtime variants without modifying the canonical harness. | Same as above (local or SaaS via MODE=saas). |
measure-coordinator-task-bounds.sh (in molecule-controlplane) |
Production-shape variant that bootstraps a real staging tenant via POST /cp/admin/orgs, then runs the same measurement against <slug>.staging.moleculesai.app. |
Staging controlplane only — refuses to run against production. |
See reference_harness_pair_pattern (auto-memory) for when to use which
and the cross-repo design rationale.
Common safety pattern across all three
- Cleanup trap on EXIT/INT/TERM auto-deletes provisioned resources.
DRY_RUN=1prints plan + auth fingerprint, exits before any state mutation. Run this before pointing at staging or any shared infrastructure.- Non-target guard refuses arbitrary endpoints (the controlplane
variant is locked to
staging-api.moleculesai.app; the OSS variant requires explicit auth + tenant scoping for non-localhost PLATFORM). - Cleanup failures emit
cleanup_*_failedevents with remediation hints; no silenced curl. ADMIN_TOKEN expiring mid-run surfaces as a structured event rather than a silent leak.
Activity trace caveat
If activity_trace.raw == "<endpoint_unavailable>", the per-workspace
/activity endpoint isn't wired on the target build — the bound
measurement is INCONCLUSIVE on the platform-ceiling question. Either
wire the endpoint or replace with the equivalent Datadog query. Note
that /activity accepts a since_secs query parameter; see the
endpoint handler for the supported range.
Other scripts
cleanup-rogue-workspaces.sh— emergency teardown for leaked workspaces. Prompts for confirmation. Pair with the harnesses if a cleanup trap fails (seecleanup_*_failedevents).canary-smoke.sh— quick smoke test for canary releases.dev-start.sh— local-dev platform bring-up.
The rest are self-documenting in their header comments.