molecule-core/.github/workflows
devops-engineer b9d2786f45 fix(ci): e2e-api — parallel-safe postgres/redis containers + provisioner setup
Class B Hongming-owned CI/CD red sweep, e2e-api leg. Same substrate
hazard as PR #98 (handlers-postgres-integration): Gitea act_runner
configures `container.network: host` operator-wide, so:

  * Two concurrent e2e-api runs both attempted to bind `-p 15432:5432`
    and `-p 16379:6379` on the operator host, under the same fixed
    container names. Observed in run a7/2727 on 2026-05-07 (the quoted
    error is the name collision): `docker: Error response from daemon:
    Conflict. The container name "/molecule-ci-redis" is already in use
    by container af10f438...`, exit 125, so the job fails before any
    test runs.
  * Hardcoded container names `molecule-ci-postgres` / `-redis` plus
    the leading `docker rm -f` step meant a second job's startup also
    KILLED the first job's still-running services.

Fix shape (mirrors PR #98 bridge-net pattern, adapted because the
platform-server is a Go binary on the host, not a containerised step):

  1. Per-run unique container names: `pg-e2e-api-${RUN_ID}-${RUN_ATTEMPT}`,
     `redis-e2e-api-${RUN_ID}-${RUN_ATTEMPT}`. Unique even across reruns
     of the same run_id.
  2. Ephemeral host port per run via `-p 0:5432` / `-p 0:6379` and
     `docker port` lookup, exported as `DATABASE_URL` / `REDIS_URL` to
     `$GITHUB_ENV`. No fixed host-port → no collision.
  3. `127.0.0.1` (NOT `localhost`) in URLs — IPv6 first-resolve flake
     fixed in #92 stays fixed.
  4. `if: always()` cleanup so containers don't leak when test steps
     fail.
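
The four points above can be sketched in shell roughly as follows. This is a minimal illustration, not the contents of `e2e-api.yml`: the helper names (`container_name`, `host_port`, `start_postgres`, `cleanup`), the `postgres:16` tag, and the placeholder credentials are assumptions; only the naming scheme, `-p 0:5432`, the `docker port` lookup, `127.0.0.1`, and `$GITHUB_ENV` come from the description.

```shell
# Build a per-run container name: <prefix>-e2e-api-<run_id>-<run_attempt>.
container_name() {
  local prefix="$1" run_id="$2" run_attempt="$3"
  printf '%s-e2e-api-%s-%s\n' "$prefix" "$run_id" "$run_attempt"
}

# Extract the host port from `docker port` output such as "0.0.0.0:49153".
# Only the first line is read; an IPv6 line like "[::]:49153" carries the
# same port number.
host_port() {
  local first_line
  first_line="$(printf '%s\n' "$1" | head -n 1)"
  printf '%s\n' "${first_line##*:}"
}

# Start postgres on an ephemeral host port and export DATABASE_URL.
# Image tag, user, password, and database are placeholders (assumption).
start_postgres() {
  local name mapping port
  name="$(container_name pg "$RUN_ID" "$RUN_ATTEMPT")"
  docker run -d --name "$name" -e POSTGRES_PASSWORD=dev \
    -p 0:5432 postgres:16 >/dev/null
  mapping="$(docker port "$name" 5432/tcp)"
  port="$(host_port "$mapping")"
  # 127.0.0.1 rather than localhost, so the client never tries ::1 first.
  echo "DATABASE_URL=postgres://postgres:dev@127.0.0.1:${port}/postgres?sslmode=disable" >> "$GITHUB_ENV"
}

# Intended for a step guarded by `if: always()`; never fails the job.
cleanup() {
  docker rm -f \
    "$(container_name pg "$RUN_ID" "$RUN_ATTEMPT")" \
    "$(container_name redis "$RUN_ID" "$RUN_ATTEMPT")" >/dev/null 2>&1 || true
}
```

A redis counterpart would follow the same shape with `-p 0:6379` and `REDIS_URL`.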

Issue #94 items #2 + #3 also addressed:

  * Pre-pull `alpine:latest` (provisioner uses it for ephemeral
    token-write containers in `internal/handlers/container_files.go`).
  * Idempotent `docker network create molecule-monorepo-net` (the
    provisioner attaches workspace containers via that bridge —
    `internal/provisioner/provisioner.go::DefaultNetwork`).
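
A minimal sketch of these two setup steps, assuming hypothetical helper names `ensure_network` and `prepull_image` (the actual workflow step may be written differently). The inspect-then-create form is one common way to make network creation safe to repeat.

```shell
# Create the bridge network only when it is missing, so a rerun on the
# same host does not fail with "network already exists".
ensure_network() {
  local net="$1"
  docker network inspect "$net" >/dev/null 2>&1 \
    || docker network create "$net" >/dev/null
}

# Pull the image up front so the first ephemeral container the
# provisioner starts does not pay for the pull.
prepull_image() {
  docker pull --quiet "$1" >/dev/null
}
```

Usage would be `ensure_network molecule-monorepo-net` followed by `prepull_image alpine:latest`.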

Issue #94 item #1 (timeouts) NOT bumped — recent log evidence shows
postgres ready in 3s, redis in 1s, platform in 1s when they DO come
up. Timeouts are not the bottleneck on the current substrate.
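
For context, a readiness timeout of this kind is typically a bounded poll loop. The sketch below is an assumption about the general shape, not the workflow's actual wait step; `wait_until` is a hypothetical helper.

```shell
# Poll a command about once per second until it succeeds or the
# timeout (in attempts/seconds) is reached.
# Usage: wait_until <timeout_seconds> <command> [args...]
wait_until() {
  local timeout="$1" elapsed=0
  shift
  until "$@" >/dev/null 2>&1; do
    elapsed=$((elapsed + 1))
    if [ "$elapsed" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
  done
  return 0
}
```

With services that come up in 1 to 3 seconds, a call such as `wait_until 30 docker exec "$name" pg_isready` (with `$name` the per-run container) returns long before the limit, which is consistent with leaving the timeouts unchanged.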

NOT addressed here (out of scope, separate change required):

  * The `Run E2E API tests` step has been failing on `Status back online`
    because the platform's langgraph workspace template image
    (`ghcr.io/molecule-ai/workspace-template-langgraph:latest`)
    has returned 403 Forbidden since the 2026-05-06 GitHub org
    suspension. That is a template-registry resolution issue (ADR-002 /
    local-build mode) and belongs in a workspace-server change, not in
    this workflow file. This PR fixes the parallel-collision class and
    the workflow setup hygiene only; the langgraph-403 failure will keep
    surfacing on runs after this lands until template resolution is
    fixed separately.

Verified manually on operator host 2026-05-08: docker now hands out
ephemeral ports on `-p 0:5432`, two parallel runs land on different
ports, both reach pg_isready GREEN.

Closes #94 (items #2 and #3; item #1 documented as not-bottleneck;
langgraph-template-403 referenced for follow-up).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-07 18:59:56 -07:00
auto-promote-on-e2e.yml fix(ci): replace gh run list with Gitea commit-status query (#75 class F) 2026-05-07 15:38:57 -07:00
auto-promote-staging.yml fix(ci): rewrite auto-promote staging→main for Gitea REST API 2026-05-07 15:24:28 -07:00
auto-promote-stale-alarm.yml feat(ops): hourly alarm for auto-promote PR stuck on REVIEW_REQUIRED (#2975) 2026-05-05 17:55:27 -07:00
auto-sync-canary.yml ci(canary): annotate EXPECTED_PERSONA dual-update constraint 2026-05-07 15:35:22 -07:00
auto-sync-main-to-staging.yml fix(ci): rewrite auto-sync main→staging for Gitea direct push 2026-05-07 15:04:12 -07:00
auto-tag-runtime.yml fix(ci): replace gh pr CLI with Gitea v1 REST in workflows + scripts (#75 class A) 2026-05-07 15:29:26 -07:00
block-internal-paths.yml fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs 2026-05-07 01:00:10 -07:00
branch-protection-drift.yml ci(branch-protection): check-name parity gate (#144) 2026-05-07 14:42:50 -07:00
canary-staging.yml fix(workflows): preserve curl stderr in 8 status-capture sites 2026-05-04 18:54:50 -07:00
canary-verify.yml fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs 2026-05-07 01:00:10 -07:00
cascade-list-drift-gate.yml feat(ci): structural drift gate for cascade list vs manifest (RFC #388 PR-3) 2026-05-03 03:52:39 -07:00
check-merge-group-trigger.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
check-migration-collisions.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
ci.yml fix(ci): pin actions/upload-artifact + download-artifact to @v3 for Gitea compatibility 2026-05-07 16:54:44 -07:00
codeql.yml fix(ci): convert CodeQL workflow to no-op stub on Gitea (#156) 2026-05-07 14:26:57 -07:00
continuous-synth-e2e.yml ci(canary): bump timeout-minutes 12 → 20 to absorb apt tail latency 2026-05-04 07:02:12 -07:00
e2e-api.yml fix(ci): e2e-api — parallel-safe postgres/redis containers + provisioner setup 2026-05-07 18:59:56 -07:00
e2e-staging-canvas.yml fix(ci): pin actions/upload-artifact + download-artifact to @v3 for Gitea compatibility 2026-05-07 16:54:44 -07:00
e2e-staging-external.yml fix(workflows): preserve curl stderr in 8 status-capture sites 2026-05-04 18:54:50 -07:00
e2e-staging-saas.yml fix(workflows): preserve curl stderr in 8 status-capture sites 2026-05-04 18:54:50 -07:00
e2e-staging-sanity.yml fix(workflows): preserve curl stderr in 8 status-capture sites 2026-05-04 18:54:50 -07:00
handlers-postgres-integration.yml ci(handlers-postgres-integration): apply legacy *.sql migrations too 2026-05-05 22:02:24 -07:00
harness-replays.yml fix(ci): pre-clone manifest deps in harness-replays workflow (#173 followup) 2026-05-07 14:26:52 -07:00
lint-curl-status-capture.yml fix(workflows): rewrite curl status-capture to prevent exit-code pollution 2026-05-04 18:29:38 -07:00
pr-guards.yml fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs 2026-05-07 01:00:10 -07:00
promote-latest.yml chore(deps)(deps): bump imjasonh/setup-crane from 0.4 to 0.5 2026-05-02 19:23:13 +00:00
publish-canvas-image.yml Merge pull request #2521 from Molecule-AI/dependabot/github_actions/actions/checkout-6 2026-05-03 01:36:57 +00:00
publish-runtime.yml fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs 2026-05-07 01:00:10 -07:00
publish-workspace-server-image.yml chore(ci): retrigger publish-workspace-server-image after ECR repo create (#173) 2026-05-07 13:54:11 -07:00
railway-pin-audit.yml Merge pull request #2523 from Molecule-AI/dependabot/github_actions/actions/github-script-9.0.0 2026-05-03 01:37:00 +00:00
redeploy-tenants-on-main.yml fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs 2026-05-07 01:00:10 -07:00
redeploy-tenants-on-staging.yml fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs 2026-05-07 01:00:10 -07:00
retarget-main-to-staging.yml fix(ci): rewrite retarget-main-to-staging for Gitea REST API 2026-05-07 15:28:26 -07:00
runtime-pin-compat.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
runtime-prbuild-compat.yml fix(ci): include event_name in runtime-prbuild-compat concurrency group 2026-05-05 04:01:20 -07:00
secret-pattern-drift.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
secret-scan.yml fix(ci): lowercase 'molecule-ai/' in cross-repo workflow refs 2026-05-07 01:00:10 -07:00
sweep-aws-secrets.yml feat(ops): add sweep-aws-secrets janitor — orphan tenant bootstrap secrets 2026-05-03 02:38:08 -07:00
sweep-cf-orphans.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
sweep-cf-tunnels.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00
sweep-stale-e2e-orgs.yml chore(sweep): add orphan-tunnel cleanup step (#2987 / #340) 2026-05-05 19:36:20 -07:00
test-ops-scripts.yml chore(deps)(deps): bump actions/checkout from 4 to 6 2026-05-02 19:23:01 +00:00