bcc72419ce
8 Commits
| Author | SHA1 | Message | Date | |
|---|---|---|---|---|
|
|
229b1a902a |
fix(ci): pre-clone manifest deps in harness-replays workflow (#173 followup)
harness-replays.yml builds tenant-alpha + tenant-beta via tests/harness/
compose.yml using workspace-server/Dockerfile.tenant. Post-#173, that
Dockerfile expects .tenant-bundle-deps/{workspace-configs-templates,
org-templates,plugins} pre-cloned at the build context root. Sister
PR #38 added the pre-clone step to publish-workspace-server-image.yml
but missed harness-replays.yml.
Symptoms:
- main run #892 (2026-05-07T20:28:53Z): COPY
.tenant-bundle-deps/plugins -> failed to calculate checksum ...
not found.
- staging run #964 (2026-05-07T20:41:52Z): hits the OLD in-image
clone path (staging hasn't picked up the Dockerfile.tenant
refactor yet via auto-sync) and fails on
'fatal: could not read Username for https://git.moleculesai.app'
when cloning the first private workspace-template-* repo.
Fix: add the same Pre-clone step to harness-replays.yml,
mirroring publish-workspace-server-image.yml. Uses AUTO_SYNC_TOKEN
(devops-engineer persona PAT) per
feedback_per_agent_gitea_identity_default.
Once auto-sync main->staging unblocks (sister agent fixing the
7-file conflict in flight), staging will inherit both this workflow
fix AND the Dockerfile.tenant refactor atomically.
Refs: #168, #173
|
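The failure mode described above (a `COPY` that cannot find `.tenant-bundle-deps/plugins`) could be caught before the Docker build starts. The following is a minimal sketch of such a preflight check, assuming the three directory names from the commit message; the helper name `missing_bundle_dirs` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Directory names come from the commit message. The helper itself is a
# hypothetical preflight check, not code from the repository.
BUNDLE_ROOT = ".tenant-bundle-deps"
EXPECTED_DIRS = ("workspace-configs-templates", "org-templates", "plugins")


def missing_bundle_dirs(context_root: str) -> list[str]:
    """Return the expected pre-cloned directories that are absent or empty."""
    root = Path(context_root) / BUNDLE_ROOT
    missing = []
    for name in EXPECTED_DIRS:
        path = root / name
        # Assumption: an empty directory is treated as "not pre-cloned",
        # since it would ship no content into the image.
        if not path.is_dir() or not any(path.iterdir()):
            missing.append(name)
    return missing


if __name__ == "__main__":
    gaps = missing_bundle_dirs(".")
    if gaps:
        raise SystemExit(f"pre-clone step did not populate: {', '.join(gaps)}")
    print("all bundle deps present")
```

Run from the build context root before `docker compose build`, this would fail fast with a readable message instead of a checksum error deep inside the image build.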
||
|
|
10e510f50c |
chore: drop github-app-auth + swap GHCR→ECR (closes #157, #161)
Two coupled cleanups for the post-2026-05-06 stack:
============================================
The plugin injected GITHUB_TOKEN/GH_TOKEN via the App's
installation-access flow (~hourly rotation). Per-agent Gitea
identities replaced this approach after the 2026-05-06 suspension —
workspaces now provision with a per-persona Gitea PAT from .env
instead of an App-rotated token. The plugin code itself lived on
github.com/Molecule-AI/molecule-ai-plugin-github-app-auth which is
also unreachable post-suspension; checking it out at CI build time
was already failing.
Removed:
- workspace-server/cmd/server/main.go: githubappauth import + the
`if os.Getenv("GITHUB_APP_ID") != ""` block that called
BuildRegistry. gh-identity remains as the active mutator.
- workspace-server/Dockerfile + Dockerfile.tenant: COPY of the
sibling repo + the
`replace github.com/Molecule-AI/molecule-ai-plugin-github-app-auth => /plugin`
directive injection.
- workspace-server/go.mod + go.sum: github-app-auth dep entry
(cleaned up by `go mod tidy`).
- 3 workflows: actions/checkout steps for the sibling plugin repo:
- .github/workflows/codeql.yml (Go matrix path)
- .github/workflows/harness-replays.yml
- .github/workflows/publish-workspace-server-image.yml
Verified `go build ./cmd/server` + `go vet ./...` pass post-removal.
=======================================================
Same workflow used to push to ghcr.io/molecule-ai/platform +
platform-tenant. ghcr.io/molecule-ai is gone post-suspension. The
operator's ECR org
(153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/)
already hosts platform-tenant + workspace-template-*
+ runner-base images and is the post-suspension SSOT for container
images. This PR aligns publish-workspace-server-image with that
stack.
- env.IMAGE_NAME + env.TENANT_IMAGE_NAME repointed to ECR URL.
- docker/login-action swapped for
aws-actions/configure-aws-credentials@v4 +
aws-actions/amazon-ecr-login@v2 chain (the standard ECR auth
pattern; uses AWS_ACCESS_KEY_ID/SECRET secrets bound to the
molecule-cp IAM user).
The :staging-<sha> + :staging-latest tag policy is unchanged —
staging-CP's TENANT_IMAGE pin still points at :staging-latest, just
with the new registry prefix.
Refs molecule-core#157, #161; parallel to org-wide CI-green sweep.
|
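The unchanged tag policy with the new registry prefix can be illustrated as follows. This is a sketch only: the function name `staging_tags` and the short SHA in the usage line are placeholders, while the registry prefix and the `:staging-<sha>` / `:staging-latest` scheme come from the commit message.

```python
def staging_tags(registry_prefix: str, image: str, sha: str) -> list[str]:
    """Build the two staging tags for an image under a registry prefix."""
    # Tolerate a trailing slash on the prefix, as written in the commit message.
    base = f"{registry_prefix.rstrip('/')}/{image}"
    return [f"{base}:staging-{sha}", f"{base}:staging-latest"]


if __name__ == "__main__":
    # "abc1234" is a placeholder SHA, not a real commit.
    for tag in staging_tags(
        "153263036946.dkr.ecr.us-east-2.amazonaws.com/molecule-ai/",
        "platform-tenant",
        "abc1234",
    ):
        print(tag)
```

Only the prefix differs from the former GHCR tags, which is why a pin on `:staging-latest` keeps working once its registry part is updated.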
||
|
|
e01077be38 |
fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs
Gitea is case-sensitive on owner slugs; canonical is lowercase `molecule-ai/...`. Mixed-case `Molecule-AI/...` refs fail-at-0s when the runner tries to resolve the cross-repo workflow / checkout.
Same fix as molecule-controlplane#12. Mechanical case-correction; no behavior change beyond making CI resolve again.
Refs: internal#46
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
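A mechanical case-correction like this one can be scripted. The sketch below is an illustration rather than the tool used for the commit: it assumes the affected refs appear on `uses:` and `repository:` lines of workflow YAML, and the function name `lowercase_owner_refs` is hypothetical.

```python
import re

# Assumption: cross-repo refs live on `uses:` or `repository:` lines.
# Other occurrences (for example Go module paths) are left untouched.
OWNER_REF = re.compile(
    r"^(\s*(?:-\s*)?(?:uses|repository):\s*)molecule-ai/",
    re.IGNORECASE,
)


def lowercase_owner_refs(workflow_text: str) -> str:
    """Rewrite mixed-case `Molecule-AI/` owner slugs to lowercase."""
    fixed_lines = []
    for line in workflow_text.splitlines(keepends=True):
        # Keep the key prefix exactly as written; only the owner slug changes.
        fixed_lines.append(
            OWNER_REF.sub(lambda m: m.group(1) + "molecule-ai/", line)
        )
    return "".join(fixed_lines)


if __name__ == "__main__":
    sample = (
        "    steps:\n"
        "      - uses: Molecule-AI/ci/.github/workflows/lint.yml@main\n"
        "      - uses: actions/checkout@v4\n"
    )
    print(lowercase_owner_refs(sample), end="")
```

The workflow path in the sample is invented for the demonstration. Restricting the rewrite to specific keys keeps the change reviewable as a pure case-correction.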
||
|
|
3598eb41d1
|
chore(deps)(deps): bump actions/checkout from 4 to 6
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 6.
- [Release notes](https://github.com/actions/checkout/releases)
- [Commits](https://github.com/actions/checkout/compare/v4...v6)
---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...
Signed-off-by: dependabot[bot] <support@github.com> |
||
|
|
c275716005 |
harness(phase-2): multi-tenant compose + cross-tenant isolation replays
Brings the local harness from "single tenant covering the request path" to "two tenants covering both the request path AND the per-tenant isolation boundary", the same shape production runs (one EC2 + one Postgres + one MOLECULE_ORG_ID per tenant).
Why this matters: the four prior replays exercise the SaaS request path against one tenant. They cannot prove that TenantGuard rejects a misrouted request (production CF tunnel + AWS LB are the failure surface), nor that two tenants doing legitimate work in parallel keep their `activity_logs` / `workspaces` / connection-pool state partitioned. Both are real bug classes: TenantGuard allowlist drift shipped #2398, lib/pq prepared-statement cache collision is documented as an org-wide hazard.
What changed:
1. compose.yml: split into two tenants. tenant-alpha + postgres-alpha + tenant-beta + postgres-beta + the shared cp-stub, redis, cf-proxy. Each tenant gets a distinct ADMIN_TOKEN + MOLECULE_ORG_ID and its own Postgres database. cf-proxy depends on both tenants becoming healthy.
2. cf-proxy/nginx.conf: Host-header → tenant routing. `map $host $tenant_upstream` resolves the right backend per request. Required `resolver 127.0.0.11 valid=30s ipv6=off;` because nginx needs an explicit DNS resolver to use a variable in `proxy_pass` (literal hostnames resolve once at startup; variables resolve per request; without the resolver nginx fails closed with 502). `server_name` lists both tenants + the legacy alias so unknown Host headers don't silently route to a default and mask routing bugs.
3. _curl.sh: per-tenant + cross-tenant-negative helpers. `curl_alpha_admin` / `curl_beta_admin` set the right Host + Authorization + X-Molecule-Org-Id triple. `curl_alpha_creds_at_beta` / `curl_beta_creds_at_alpha` exist precisely to make WRONG requests (replays use them to assert TenantGuard rejects). `psql_exec_alpha` / `psql_exec_beta` shell out per-tenant Postgres exec. Legacy aliases (`curl_admin`, `psql_exec`) keep the four pre-Phase-2 replays working without edits.
4. seed.sh: registers parent+child workspaces in BOTH tenants. Captures server-generated IDs via `jq -r '.id'` (POST /workspaces ignores body.id, so the older client-side mint silently desynced from the workspaces table and broke FK-dependent replays). Stashes `ALPHA_PARENT_ID` / `ALPHA_CHILD_ID` / `BETA_PARENT_ID` / `BETA_CHILD_ID` to .seed.env, plus legacy `ALPHA_ID` / `BETA_ID` aliases for backwards compat with chat-history / channel-envelope.
5. New replays.
- tenant-isolation.sh (13 assertions): TenantGuard 404s any request whose X-Molecule-Org-Id doesn't match the container's MOLECULE_ORG_ID. Asserts the 404 body has zero tenant/org/forbidden/denied keywords (existence of a tenant must not be probeable from the outside). Covers cross-tenant routing misconfiguration + allowlist drift + missing-org-header.
- per-tenant-independence.sh (12 assertions): both tenants seed activity_logs in parallel with distinct row counts (3 vs 5) and confirm each tenant's history endpoint returns exactly its own counts. Then a concurrent INSERT race (10 rows per tenant in parallel via `&` + wait) catches shared-pool corruption + prepared-statement cache poisoning + redis cross-keyspace bleed.
6. Bug fix: down.sh + dump-logs SECRETS_ENCRYPTION_KEY validation. `docker compose down -v` validates the entire compose file even though it doesn't read the env. up.sh generates a per-run key into its own shell; down.sh runs in a fresh shell that wouldn't see it, so without a placeholder `compose down` exited non-zero before removing volumes. Workspaces silently leaked into the next ./up.sh + seed.sh boot. Caught when tenant-isolation.sh F1/F2 saw 3× duplicate alpha-parent rows accumulated across three prior runs. Same fix applied to the workflow's dump-logs step.
7. requirements.txt: pin molecule-ai-workspace-runtime>=0.1.78. channel-envelope-trust-boundary.sh imports from `molecule_runtime.*` (the wheel-rewritten path) so it catches the failure mode where the wheel build silently strips a fix that unit tests on local source still pass. CI was failing this replay because the wheel wasn't installed; caught in the staging push run from #2492.
8. .github/workflows/harness-replays.yml: Phase 2 plumbing.
* Removed /etc/hosts step (Host-header path eliminated the need; scripts already source _curl.sh).
* Updated dump-logs to reference the new service names (tenant-alpha + tenant-beta + postgres-alpha + postgres-beta).
* Added SECRETS_ENCRYPTION_KEY placeholder env on the dump step.
Verified: ./run-all-replays.sh from a clean state: 6/6 passed (buildinfo-stale-image, channel-envelope-trust-boundary, chat-history, peer-discovery-404, per-tenant-independence, tenant-isolation).
Roadmap section updated: Phase 2 marked shipped. Phase 3 promoted to "replace cp-stub with real molecule-controlplane Docker build + env coherence lint."
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
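The core assertion of tenant-isolation.sh, as described in item 5, is that a cross-tenant request gets a 404 whose body reveals nothing about tenants. The sketch below restates that check in Python under stated assumptions: the keyword list comes from the commit message, while the function name `isolation_violations` and the plain substring matching are illustrative choices, not the replay's actual implementation.

```python
# Keywords taken from the commit message; matching is a simple
# case-insensitive substring test, which is an assumption of this sketch.
LEAK_KEYWORDS = ("tenant", "org", "forbidden", "denied")


def isolation_violations(status: int, body: str) -> list[str]:
    """Return reasons a cross-tenant response would fail the isolation check."""
    problems = []
    if status != 404:
        problems.append(f"expected 404, got {status}")
    lowered = body.lower()
    for keyword in LEAK_KEYWORDS:
        if keyword in lowered:
            # Any of these words hints that a tenant exists behind the proxy.
            problems.append(f"body mentions {keyword!r}")
    return problems


if __name__ == "__main__":
    print(isolation_violations(404, '{"error": "not found"}'))
    print(isolation_violations(403, "Forbidden: wrong org"))
```

An empty list means the response is indistinguishable from a request to a path that does not exist, which is the property the replay is meant to protect.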
||
|
|
c8b17ea1ad |
fix(harness): install httpx for replay Python evals
peer-discovery-404 imports workspace/a2a_client.py which depends on httpx; the runner's stock Python doesn't have it, so the replay's PARSE assertion (b) fails with ModuleNotFoundError on every run. The WIRE assertion (a), pure curl, passes, so the failure was masking just enough to make the replay LOOK partially-broken when the tenant side is fine.
Adding tests/harness/requirements.txt with only httpx instead of sourcing workspace/requirements.txt: that file pulls a2a-sdk, langchain-core, opentelemetry, sqlalchemy, temporalio, etc., which is ~30s of install for one replay's PARSE step. The harness's deps surface should grow when a new replay introduces a new import, not by default.
Workflow gains one step (`pip install -r tests/harness/requirements.txt`) between the /etc/hosts setup and run-all-replays. No other changes. |
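A missing dependency like this surfaces late, as a ModuleNotFoundError inside one assertion. As a sketch of how the harness could report it up front, the hypothetical helper below lists which top-level modules are not importable; it uses only the standard library and is not part of the repository.

```python
import importlib.util


def missing_modules(names: list[str]) -> list[str]:
    """Return the top-level module names that cannot be found for import."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # httpx is the one dependency named in the commit message.
    gaps = missing_modules(["httpx"])
    if gaps:
        raise SystemExit(f"install tests/harness/requirements.txt first: {gaps}")
    print("replay Python deps present")
```

The list passed in should mirror tests/harness/requirements.txt, so it grows only when a replay adds an import.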
||
|
|
24cb2a286f |
ci(harness-replays): KEEP_UP=1 so dump-logs step has containers to read
First run on PR #2410 failed with 'container harness-tenant-1 is unhealthy' but the dump-compose-logs step printed empty tenant logs because run-all-replays.sh's trap-on-EXIT had already torn down the harness. Setting KEEP_UP=1 leaves containers in place; the always-run Force teardown step at the end owns cleanup explicitly. Now we'll actually see why the tenant didn't become healthy. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
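The interaction between the exit trap and `KEEP_UP=1` can be modelled with a `try`/`finally`. This is a Python analogue for illustration only: the real logic lives in the shell script run-all-replays.sh, and the function name `run_replays` and its callable parameters are assumptions of this sketch.

```python
import os


def run_replays(run, teardown, env=None):
    """Run the replays, then tear down unless KEEP_UP=1 is set."""
    env = os.environ if env is None else env
    try:
        return run()
    finally:
        # Mirrors a trap-on-EXIT: this runs on success and on failure.
        # With KEEP_UP=1 the containers stay up so a later step can read logs.
        if env.get("KEEP_UP") != "1":
            teardown()


if __name__ == "__main__":
    run_replays(
        run=lambda: print("running replays"),
        teardown=lambda: print("tearing down harness"),
        env={"KEEP_UP": "1"},
    )
```

With `KEEP_UP=1` nothing is torn down here, so cleanup has to be owned by a separate always-run step, as the commit message describes.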
||
|
|
3105e87cf7 |
ci: gate PRs on tests/harness/run-all-replays.sh
Closes the gap between "the harness exists" and "the harness blocks bugs." Phase 2 of the harness roadmap (per tests/harness/README.md): make harness-based E2E a required CI check on every PR touching the tenant binary or the harness itself.
Trigger: push + pull_request to staging+main, paths-filtered to workspace-server/**, canvas/**, tests/harness/**, and this workflow. merge_group support included so this becomes branch-protectable.
Single-job-with-conditional-steps pattern (matches e2e-api.yml). One check run regardless of paths-filter outcome; satisfies branch protection cleanly per the PR #2264 SKIPPED-in-set finding.
Why this exists: 2026-04-30 we shipped a TenantGuard allowlist gap (/buildinfo added to router.go in #2398, never added to the allowlist) that the existing buildinfo-stale-image.sh replay would have caught. The harness was wired correctly; nobody ran it. Replays as a discipline beat replays as a memory item.
The CI pipeline:
  detect-changes (paths filter)
    └ harness-replays (always)
        ├ no-op pass when paths-filter says no relevant change
        └ otherwise: checkout + sibling plugin checkout + /etc/hosts entry + run-all-replays.sh + compose-logs-on-failure + force-teardown
Compose logs from tenant/cp-stub/cf-proxy/postgres are dumped on failure so a CI red is debuggable without re-reproducing locally. The trap in run-all-replays.sh handles teardown; the always-run down.sh step is a belt-and-suspenders against trap-bypass kills.
Follow-ups (not in this PR):
- Add this check to staging branch protection once it's been green for a few PRs (the new-workflow-instability hedge that other gates followed).
- Eventually wire the buildx GHA cache to speed up tenant image builds; currently every PR rebuilds the full Dockerfile.tenant (Go + Next.js + template clones) from scratch. Acceptable for now; optimize when the timeout-minutes:30 ceiling becomes painful.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> |
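The single-job-with-conditional-steps pattern boils down to one decision: does any changed path match the filter? The sketch below models that decision with the path patterns from the commit message. It is an approximation: `fnmatchcase` lets `*` cross `/`, which differs from the workflow engine's own glob rules, and the names `harness_needed` and `plan_job` are hypothetical.

```python
from fnmatch import fnmatchcase

# Patterns from the commit message; the workflow file path is the one
# named elsewhere in this commit list.
RELEVANT_PATTERNS = (
    "workspace-server/**",
    "canvas/**",
    "tests/harness/**",
    ".github/workflows/harness-replays.yml",
)


def harness_needed(changed_paths: list[str]) -> bool:
    """True when at least one changed path falls under the paths filter."""
    return any(
        fnmatchcase(path, pattern)
        for path in changed_paths
        for pattern in RELEVANT_PATTERNS
    )


def plan_job(changed_paths: list[str]) -> str:
    """The job always reports a result; only its steps are conditional."""
    return "run-replays" if harness_needed(changed_paths) else "no-op pass"


if __name__ == "__main__":
    print(plan_job(["docs/README.md"]))
    print(plan_job(["workspace-server/cmd/server/main.go"]))
```

Because `plan_job` always returns a result, the check run exists for every PR, which is what lets branch protection require it without being blocked by skipped runs.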