molecule-ai/molecule-core

core-devops 98323734ea

feat(ci): status-reaper rev2 sweeps last 10 main commits (closes stranded-status gap)

rev1 (PR #618, merged 4db64bcb) only inspected the CURRENT main HEAD per
tick. Schedule workflows post `failure` to whatever SHA was HEAD when the
run COMPLETED, which by the next */5 tick is usually a stale commit
because main has already moved forward via merges. Result: rev1 was
running successfully but with `compensated:0` on every tick across ~6
cycles (orchestrator + hongming-pc2 Phase 1+2 evidence 23:46Z / 23:59Z /
00:02Z); reds stranded on stale commits.

rev2 sweeps the last 10 main commits per tick:

- New `list_recent_commit_shas(branch, limit)` wraps
  GET /repos/{o}/{r}/commits?sha={branch}&limit={limit}. Vendor-truth
  probe 2026-05-11 confirms Gitea 1.22.6 returns a JSON list of commit
  objects with `sha` keys
  (per `feedback_smoke_test_vendor_truth_not_shape_match`).
- New `reap_branch()` orchestrates the sweep:
  - For each SHA: GET combined status with PER-SHA ERROR ISOLATION
    (refinement #7) — ApiError on one stale SHA logs `::warning::` and
    continues to the next. Different from the single-HEAD pre-rev2 path
    where fail-loud was correct; the sweep is best-effort across
    historical commits.
  - When `combined.state == "success"`: skip the per-context loop
    entirely (refinement #2, cost optimization, common case).
  - Otherwise delegate to the existing per-SHA `reap()` worker (logic
    UNCHANGED — `_has_push_trigger` / `parse_push_context` /
    `scan_workflows` not touched per refinement #6).
- Aggregated counters preserve all rev1 fields PLUS:
  - `scanned_shas`: how many SHAs we actually iterated (always 10
    in normal operation; fewer if the commits API returns fewer)
  - `compensated_per_sha`: {<full_sha>: [<context>, ...]} for the
    SHAs that actually got at least one compensation
- `reap()` now also returns `compensated_contexts` so `reap_branch()`
  can build `compensated_per_sha` without re-deriving it from the POST
  stream. Backwards-compatible — all existing test assertions check
  specific counter keys, none enforce a closed dict shape.
- `main()` switches from `get_head_sha` + `get_combined_status` + `reap`
  to a single `reap_branch()` call. Adds `--limit` CLI flag for
  ops-driven sweep-width tuning (default 10).
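
The sweep described above can be sketched roughly as follows. This is a minimal illustration, not the code from this commit: the function signature, the injected callables, and the choice to count errored SHAs in `scanned_shas` are assumptions made for the example. Only the names `reap_branch`, `reap`, `ApiError`, `compensated_contexts`, `scanned_shas`, and `compensated_per_sha` come from the message itself.

```python
from typing import Callable, Dict, List


class ApiError(Exception):
    """Stand-in for the reaper's API error type (assumed shape)."""


def reap_branch(
    shas: List[str],
    get_combined_status: Callable[[str], dict],
    reap: Callable[[str, dict], dict],
) -> dict:
    """Sweep `shas`, isolating errors per SHA and skipping green commits."""
    compensated_per_sha: Dict[str, List[str]] = {}
    summary = {
        "scanned_shas": 0,
        "compensated": 0,
        "compensated_per_sha": compensated_per_sha,
    }
    for sha in shas:
        # Assumption: a SHA counts as scanned even if its status fetch fails.
        summary["scanned_shas"] += 1
        try:
            combined = get_combined_status(sha)
        except ApiError as exc:
            # Best-effort sweep: one bad SHA must not abort the rest.
            print(f"::warning::combined status failed for {sha}: {exc}")
            continue
        if combined.get("state") == "success":
            # Common case: nothing to compensate, skip the per-context loop.
            continue
        # Delegate to the per-SHA worker and fold its result into the totals.
        contexts = list(reap(sha, combined).get("compensated_contexts", []))
        summary["compensated"] += len(contexts)
        if contexts:
            compensated_per_sha[sha] = contexts
    return summary
```

Passing the status fetcher and the per-SHA worker in as callables keeps the sketch testable without a live Gitea instance; the real module may well call its own helpers directly.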

Design choices (refinements 1-4):
- N=10: covers the burst-merge window between */5 ticks; older reds
  falling off the window is acceptable (the schedule run that posted
  them has long since been overwritten by a real push trigger).
- Skip combined=success early: most commits in the window will be green;
  short-circuit before the per-context loop saves work.
- No de-dup needed (refinement #4): each workflow run posts to exactly
  one SHA, so the same run's context cannot be eligible for
  compensation on two different SHAs in the sweep.

Test suite: 37 + 3 = 40/40 cases pass.
- New: test_reap_sweeps_n_shas_smoke (mock 3 SHAs, verify each GET'd)
- New: test_reap_skips_combined_success_shas (verify the
  combined=success short-circuit; only the 1 failure SHA is iterated)
- New: test_reap_continues_on_per_sha_apierror (per-SHA error isolation
  contract — ApiError on SHA[0] logged + skipped + SHA[1] processes)
- All 37 existing rev1 tests pass unchanged (per-SHA worker logic + the
  helpers it consumes are untouched).

Live dry-run smoke against git.moleculesai.app:
  scanned 41 workflows; push-triggered=18, class-O candidates=23
  summary: {"branch":"main","compensated":0,"compensated_per_sha":{},
           "dry_run":true,"limit":10,"preserved_non_failure":196,
           ...,"scanned_shas":10}

Cross-link:
- internal#327 (sibling publish-runtime-bot)
- task #90 (orchestrator brief), task #46 (hongming-pc2 brief)
- PR #618 (parent rev1, merge 4db64bcb)
- `reference_post_suspension_pipeline`
- `feedback_no_shared_persona_token_use` (commit author = core-devops, not hongming-pc2)
- `feedback_strict_root_only_after_class_a` (root cause, not symptom)
- `feedback_brief_hypothesis_vs_evidence` (evidence: compensated:0 across 6 cycles)

Removal path: drop this workflow when Gitea >= 1.24 ships with a real
fix for the hardcoded-suffix bug. Audit issue (filed alongside rev1)
tracks the deletion as a follow-up sweep.

2026-05-11 18:41:39 -07:00

Name	Last commit	Date
e2e	fix(ci): canonicalize MOLECULE_STAGING_ADMIN_TOKEN -> CP_STAGING_ADMIN_API_TOKEN (post-#443 rebase) + drop staging-smoke continue-on-error	2026-05-11 04:33:56 -07:00
harness	ci(docker): pin base image digests in all Dockerfiles	2026-05-09 23:56:39 +00:00
ops	ops: add Railway SHA-pin drift audit script + regression test (#2001)	2026-04-27 05:01:23 -07:00
README.md	chore: final open-source cleanup — binary, stale paths, private refs	2026-04-18 00:38:55 -07:00
test_ci_required_drift.py	feat(internal#219 §4+§6): port ci-required-drift + audit-force-merge sidecar from CP	2026-05-11 00:35:25 -07:00
test_main_red_watchdog.py	feat(ci): main-red watchdog (Option C of main-never-red directive) — closes #420	2026-05-11 00:36:20 -07:00
test_status_reaper.py	feat(ci): status-reaper rev2 sweeps last 10 main commits (closes stranded-status gap)	2026-05-11 18:41:39 -07:00

README.md

Tests

This repo uses the standard monorepo testing convention: unit tests live with their package, cross-component E2E tests live here.

Where to find tests

Scope	Location
Go unit + integration (platform, CLI, handlers)	`workspace-server/**/*_test.go` — run with `cd workspace-server && go test -race ./...`
TypeScript unit (canvas components, hooks, store)	`canvas/src/**/__tests__/` — run with `cd canvas && npm test -- --run`
TypeScript unit (MCP server handlers)	`mcp-server/src/__tests__/` — run with `cd mcp-server && npx jest`
Python unit (workspace runtime, adapters)	`workspace/tests/` — run with `cd workspace && python3 -m pytest`
Python unit (SDK: plugin + remote agent)	`sdk/python/tests/` — run with `cd sdk/python && python3 -m pytest`
Cross-component E2E (spans platform + runtime + HTTP)	`tests/e2e/` ← you are here

Why split this way

Go requires co-located _test.go files to access unexported symbols.
Per-package test commands keep the inner loop fast — changing canvas doesn't re-run Go tests.
tests/e2e/ covers scenarios that no single package owns: a full workspace lifecycle, A2A across two provisioned agents, delegation chains, bundle round-trips.

Running E2E

Every E2E script here assumes the platform is running at localhost:8080 and (where noted) provisioned agents are online. See the header comment of each .sh for specifics.
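
Before launching a script, it can help to confirm that something is actually listening at the platform address. The helper below is a hypothetical pre-flight sketch, not part of this repo: `platform_reachable` is an invented name, and it only checks that a TCP connection succeeds, not that the platform is healthy.

```python
import socket
from urllib.parse import urlsplit


def platform_reachable(url: str, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    host = parts.hostname or "localhost"
    # Fall back to the scheme's default port when the URL does not name one.
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

Calling `platform_reachable("http://localhost:8080")` before an E2E run gives a quick yes/no answer instead of a script failing partway through.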

Cleaning up rogue test workspaces

If an E2E run aborts before its teardown runs (Ctrl-C, crash, CI timeout), the platform can be left with workspaces whose config volume is stale or empty — Docker's unless-stopped restart policy then spins those containers in a FileNotFoundError loop. The platform's pre-flight check (#17) marks such workspaces failed on the next restart, but a manual cleanup is useful:

bash scripts/cleanup-rogue-workspaces.sh               # deletes ws with id/name starting aaaaaaaa-, bbbbbbbb-, cccccccc-, test-ws-
MOLECULE_URL=http://host:8080 bash scripts/cleanup-rogue-workspaces.sh

The script DELETEs each matching workspace via the API and force-removes the ws-<id[:12]> container as a belt-and-suspenders fallback.
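
The matching and naming rules described here can be illustrated in a few lines of Python. This is a sketch of the rules as stated in this README, not a port of the shell script: `is_rogue` and `container_name` are hypothetical names, and the script itself may differ in detail.

```python
# Prefixes taken from the comment on the cleanup command above.
ROGUE_PREFIXES = ("aaaaaaaa-", "bbbbbbbb-", "cccccccc-", "test-ws-")


def is_rogue(workspace_id: str, name: str) -> bool:
    """True if the workspace id or name starts with a known test prefix."""
    return workspace_id.startswith(ROGUE_PREFIXES) or name.startswith(ROGUE_PREFIXES)


def container_name(workspace_id: str) -> str:
    """Container name built from the first 12 characters of the workspace id."""
    return f"ws-{workspace_id[:12]}"
```

For a workspace id such as `aaaaaaaa-1111-2222-3333-444444444444`, `container_name` yields `ws-aaaaaaaa-111`, which is the container the force-remove step would target.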