molecule-ai/molecule-core

Molecule AI Dev Engineer B (MiniMax) 08c2bd4d9a

test(ops): fix hermetic CF-preflight test (Researcher 11116 REQUEST_CHANGES)

Four real bugs in the regression test, all surfaced by CI:

1) Mock server didn't reliably come up: the port-probe didn't use
   SO_REUSEADDR (so a freed probe port could TIME_WAIT the server's
   bind), and the readiness wait was a chained curl+grep shell
   pipeline (racy pipe-handle interactions under CI load). Replaced
   with a Python-based readiness probe (TCP connect + HTTP GET +
   JSON parse + status==active check, single source of truth) and a
   kill -0 on the server PID so a crash surfaces with stderr instead
   of timing out silently. Bumped the ceiling 10s -> 15s (75 * 0.2s)
   for busy runners.

2) Inactive-token case omits CF_ZONE_ID: only CF_API_TOKEN was set
   for case (b), so the script's 'need CF_ZONE_ID' guard short-
   circuited BEFORE the preflight and we never actually exercised
   the auth-failure path. Set the full ENV_TOKENS (same as the
   success case) for (b) so a missing CF_ZONE_ID can't mask the
   regression we want to catch.

3) EXPECTED_COUNT=3 was stale: the preflight addition brought the
   CF base refs in sweep-cf-orphans.sh from 3 to 4 (token-verify +
   zone-lookup in the preflight block, plus the original 2 in the
   sweep body). The patch-and-redirect test then replaced 4
   occurrences, not 3, and the count assertion failed. Updated to 4
   with a comment.

4) Server returned zone id 'zones' for active/down: the Python
   mock extracted zone_id from rest.split('/')[2] which is the
   literal 'zones' token, not the actual zone id (which lives at
   index 3 after the /client/v4/ prefix). Active/down cases then
   tripped the preflight's zone-mismatch check. Use seg[3] (with a
   seg[-1] fallback) and add a comment explaining the layout.
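
The indexing fix in item 4 can be illustrated with a small sketch. It assumes the mock strips the leading slash before splitting (which is what makes index 2 the literal 'zones' token); the function name `extract_zone_id` and the query-string handling are hypothetical, not taken from the test harness.

```python
from urllib.parse import urlsplit


def extract_zone_id(request_path: str) -> str:
    """Pull the zone id out of a /client/v4/zones/<id>/... request path."""
    # Drop any query string, then the leading slash, so the path splits into
    # ["client", "v4", "zones", "<id>", ...].
    rest = urlsplit(request_path).path.lstrip("/")
    seg = rest.split("/")
    # seg[2] is the literal "zones" token (the original bug); the zone id
    # sits at seg[3]. Shorter paths fall back to the last segment.
    return seg[3] if len(seg) > 3 else seg[-1]
```

With this layout, `extract_zone_id("/client/v4/zones/abc123/dns_records")` yields `abc123` rather than `zones`.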

No change to the preflight behavior in scripts/ops/sweep-cf-orphans.sh
— only the test harness. The four critical behaviors are now
exercised deterministically:
  (a) active token + reachable zone -> preflight passes
  (b) inactive token               -> preflight fails fast, no gather
  (c) zone id mismatch              -> preflight fails on mismatch
  (d) 500 + non-JSON                 -> preflight fails on non-JSON

Locally verified: 'bash scripts/ops/test_sweep_cf_orphans_preflight.sh'
prints all four PASS lines and exits 0.
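
The readiness wait described in item 1 above (TCP connect + HTTP GET + JSON parse + status==active, plus a kill -0 on the server PID) might look roughly like the sketch below. The function names, the `result.status` response shape, and the proxy-free opener are assumptions made for illustration; only the 75 * 0.2s ceiling comes from the commit message.

```python
import json
import os
import socket
import time
import urllib.request

# Assumption: bypass any proxy settings so localhost probes stay local.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def server_alive(pid: int) -> bool:
    """POSIX equivalent of `kill -0 PID`: signal 0 only checks existence."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def probe_once(host: str, port: int, path: str, timeout: float = 1.0) -> bool:
    """One attempt: TCP connect, HTTP GET, JSON parse, status == 'active'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            pass
        url = f"http://{host}:{port}{path}"
        with _OPENER.open(url, timeout=timeout) as resp:
            body = json.loads(resp.read().decode("utf-8"))
    except (OSError, ValueError):
        # Connection errors, HTTP errors, and malformed JSON all mean "not ready".
        return False
    # Assumed response shape: {"result": {"status": "active"}}.
    result = body.get("result") if isinstance(body, dict) else None
    return isinstance(result, dict) and result.get("status") == "active"


def wait_ready(host, port, path, pid=None, attempts=75, interval=0.2):
    """Poll until the server is ready; raise early if the server process died."""
    for _ in range(attempts):
        if pid is not None and not server_alive(pid):
            raise RuntimeError(f"mock server (pid {pid}) exited before becoming ready")
        if probe_once(host, port, path):
            return True
        time.sleep(interval)
    return False
```

Keeping every check inside `probe_once` gives a single definition of "ready", and the PID check turns a crashed server into an immediate error instead of a full-length timeout.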

2026-06-12 16:33:45 +00:00

demo-freeze-snapshots

ops: demo-day freeze + rollback runbook

2026-05-01 12:04:30 -07:00

ops

test(ops): fix hermetic CF-preflight test (Researcher 11116 REQUEST_CHANGES)

2026-06-12 16:33:45 +00:00

build-images.sh

chore: retire unmaintained workspace runtimes

2026-05-23 23:45:09 -07:00

bundle-compile.sh

initial commit — Molecule AI platform

2026-04-13 11:55:37 -07:00

check-manifest-repos-exist.sh

fix(manifest): restore seo-agent + google-adk templates; auth the existence check

2026-06-05 17:00:10 -07:00

check-stale-promote-pr.sh

fix(ci): replace gh pr CLI with Gitea v1 REST in workflows + scripts (#75 class A)

2026-05-07 15:29:26 -07:00

cleanup-rogue-workspaces.sh

fix(provisioner): stop rogue config-missing restart loop (#17)

2026-04-14 07:32:58 -07:00

clone-manifest.sh

feat: refresh workspace templates from repo cache

2026-05-25 12:05:05 -07:00

demo-day-runbook.md

chore: restrict maintained workspace runtimes

2026-05-24 19:48:00 -07:00

demo-freeze.sh

chore: restrict maintained workspace runtimes

2026-05-24 19:48:00 -07:00

demo-thaw.sh

ops: demo-day freeze + rollback runbook

2026-05-01 12:04:30 -07:00

dev-start.sh

harden(security): remove dev-mode fail-open auth — fail-closed everywhere + dev-token + regression gate

2026-06-05 01:02:48 -07:00

edge-429-probe.sh

chore(observability): edge-429 probe + ratelimit observability runbook

2026-05-07 15:48:34 -07:00

import-agent.sh

initial commit — Molecule AI platform

2026-04-13 11:55:37 -07:00

lockdown-tenant-sg.sh

feat(security): Phase 35.1 — SG lockdown script for tenant EC2 instances

2026-04-18 12:01:41 -07:00

measure-coordinator-task-bounds-runner.sh

chore: restrict maintained workspace runtimes

2026-05-24 19:48:00 -07:00

measure-coordinator-task-bounds.sh

chore: restrict maintained workspace runtimes

2026-05-24 19:48:00 -07:00

nuke-and-rebuild.sh

tech-debt: rename molecule-monorepo-net -> molecule-core-net

2026-05-09 20:51:48 +00:00

post-rebuild-setup.sh

security: remove hardcoded API keys from post-rebuild-setup.sh

2026-04-20 13:02:52 -07:00

promote-tenant-image.sh

fix(scripts): validate AWS region + ECR account ID in promote-tenant-image (#676)

2026-06-07 23:46:22 +00:00

README.md

chore: restrict maintained workspace runtimes

2026-05-24 19:48:00 -07:00

refresh-workspace-images.sh

chore: retire unmaintained workspace runtimes

2026-05-23 23:45:09 -07:00

rollback-latest.sh

fix(scripts): migrate ghcr.io→ECR + raw.githubusercontent.com→Gitea (#46)

2026-05-07 00:56:23 -07:00

staging-smoke.sh

refactor(ci): drop "canary-" prefix → staging-smoke/staging-verify (Hongming directive 2026-05-11) (#443)

2026-05-11 11:25:29 +00:00

test-a2a-cross-runtime.sh

initial commit — Molecule AI platform

2026-04-13 11:55:37 -07:00

test-all-adapters.sh

chore: retire unmaintained workspace runtimes

2026-05-23 23:45:09 -07:00

test-all-runtimes-a2a-e2e.sh

test(e2e): give google-adk a hermes-class online window

2026-05-29 14:51:58 -07:00

test-all.sh

initial commit — Molecule AI platform

2026-04-13 11:55:37 -07:00

test-check-stale-promote-pr.sh

feat(ops): hourly alarm for auto-promote PR stuck on REVIEW_REQUIRED (#2975)

2026-05-05 17:55:27 -07:00

test-cross-agent-chat.sh

chore: restrict maintained workspace runtimes

2026-05-24 19:48:00 -07:00

test-hermes-plugin-e2e.sh

test(e2e): unified A2A round-trip parity harness across all 4 runtimes

2026-05-02 04:36:23 -07:00

test-nuke-and-rebuild.sh

fix(scripts): nuke-and-rebuild self-bootstraps templates; add E2E test

2026-04-26 14:37:04 -07:00

test-promote-tenant-image.sh

fix(scripts): validate AWS region + ECR account ID in promote-tenant-image (#676)

2026-06-07 23:46:22 +00:00

test-team-e2e.sh

chore: restrict maintained workspace runtimes

2026-05-24 19:48:00 -07:00

wheel_smoke.py

chore(ci): add line-local rationales for lint/type suppressions (mc#1769)

2026-05-27 20:33:06 +00:00

README.md

scripts/

Operational and one-off scripts for molecule-core. Most are self-documenting — see the header comments in each file.

RFC #2251 coordinator task-bound harnesses

There are three related scripts; pick the right one:

| Script | Purpose | Targets |
| --- | --- | --- |
| `measure-coordinator-task-bounds.sh` | Canonical v1 harness for the RFC #2251 / Issue 4 reproduction. Provisions a PM coordinator + Researcher child via `claude-code-default` + `claude-code` templates, sends a synthesis-heavy A2A kickoff, and observes elapsed time + activity trace. | OSS-shape platform: localhost or any `/workspaces`-shaped endpoint. Has tenant/admin-token guards for non-localhost runs. |
| `measure-coordinator-task-bounds-runner.sh` | Generalised runner for the same measurement contract, but with arbitrary template + secret + model combinations (Hermes/MiniMax, etc.). Useful for cross-runtime variants without modifying the canonical harness. | Same as above (local or SaaS via `MODE=saas`). |
| `measure-coordinator-task-bounds.sh` (in molecule-controlplane) | Production-shape variant that bootstraps a real staging tenant via `POST /cp/admin/orgs`, then runs the same measurement against `<slug>.staging.moleculesai.app`. | Staging controlplane only; refuses to run against production. |

See reference_harness_pair_pattern (auto-memory) for when to use which and the cross-repo design rationale.

Common safety pattern across all three

- Cleanup trap on `EXIT`/`INT`/`TERM` auto-deletes provisioned resources.
- `DRY_RUN=1` prints the plan + auth fingerprint and exits before any state mutation. Run this before pointing at staging or any shared infrastructure.
- Non-target guard refuses arbitrary endpoints (the controlplane variant is locked to `staging-api.moleculesai.app`; the OSS variant requires explicit auth + tenant scoping for non-localhost `PLATFORM`).
- Cleanup failures emit `cleanup_*_failed` events with remediation hints; no curl call is silenced. An `ADMIN_TOKEN` expiring mid-run surfaces as a structured event rather than a silent leak.
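
As a rough illustration of how these four guards fit together, here is a minimal Python model of the control flow. The harnesses themselves are shell scripts, so this is not their implementation. `PLATFORM`, `ADMIN_TOKEN`, and `DRY_RUN` are the variables named above; `TENANT`, the default URL, the `cleanup_workspace_failed` event name, and all function names are hypothetical. Unlike a shell trap, `try`/`finally` here covers exceptions and Ctrl-C but not SIGTERM.

```python
import hashlib
import json
import sys
from urllib.parse import urlsplit

LOCAL_HOSTS = {"localhost", "127.0.0.1", "::1"}


def token_fingerprint(token: str) -> str:
    """Short, non-reversible fingerprint so a plan can show which token is in use."""
    return hashlib.sha256(token.encode("utf-8")).hexdigest()[:8] if token else "<none>"


def guard_target(platform: str, admin_token: str, tenant: str) -> None:
    """Non-target guard: non-localhost runs need explicit auth and tenant scoping."""
    host = urlsplit(platform).hostname or ""
    if host in LOCAL_HOSTS:
        return
    if not admin_token or not tenant:
        raise ValueError(f"refusing non-localhost target {host!r} without token + tenant")


def run_harness(env, provision, measure, cleanup):
    """provision/measure/cleanup are caller-supplied callables (hypothetical)."""
    platform = env.get("PLATFORM", "http://localhost:8080")  # hypothetical default
    token = env.get("ADMIN_TOKEN", "")
    guard_target(platform, token, env.get("TENANT", ""))

    if env.get("DRY_RUN") == "1":
        # Print the plan and stop before any state mutation.
        return {"plan": f"provision + measure against {platform}",
                "auth": token_fingerprint(token)}

    events = []
    resources = provision()
    try:
        result = measure(resources)
    finally:
        try:
            cleanup(resources)
        except Exception as exc:
            # Surface the failure as a structured event instead of hiding it.
            event = {"event": "cleanup_workspace_failed", "error": str(exc),
                     "hint": "delete the leaked resources manually"}
            events.append(event)
            print(json.dumps(event), file=sys.stderr)
    return {"result": result, "events": events}
```

Calling `run_harness({"DRY_RUN": "1"}, ...)` returns the plan and fingerprint without invoking `provision`, which mirrors the "run this first" advice above.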

Activity trace caveat

If activity_trace.raw == "<endpoint_unavailable>", the per-workspace /activity endpoint isn't wired on the target build — the bound measurement is INCONCLUSIVE on the platform-ceiling question. Either wire the endpoint or replace with the equivalent Datadog query. Note that /activity accepts a since_secs query parameter; see the endpoint handler for the supported range.
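
A consumer of the harness output could encode this caveat as a small check. The sketch assumes the report is parsed JSON with an `activity_trace` object containing a `raw` field, as the notation above suggests; the function name and the `MEASURED` label are hypothetical.

```python
ENDPOINT_UNAVAILABLE = "<endpoint_unavailable>"


def classify_bound_measurement(report: dict) -> str:
    """Return "INCONCLUSIVE" when the /activity endpoint was not wired."""
    trace = report.get("activity_trace") or {}
    if trace.get("raw") == ENDPOINT_UNAVAILABLE:
        # Without a trace, the platform-ceiling question cannot be answered.
        return "INCONCLUSIVE"
    return "MEASURED"  # hypothetical label for a usable trace
```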

Other scripts

- `cleanup-rogue-workspaces.sh`: emergency teardown for leaked workspaces. Prompts for confirmation. Pair with the harnesses if a cleanup trap fails (see `cleanup_*_failed` events).
- `staging-smoke.sh`: quick smoke test for the staging canary fleet (formerly `canary-smoke.sh`).
- `dev-start.sh`: local-dev platform bring-up.

The rest are self-documenting in their header comments.