fix(ci): e2e-api — parallel-safe postgres/redis containers (closes #94) #100
Summary
Class B Hongming-owned CICD red sweep, e2e-api leg. Mirrors PR #98 (handlers-postgres-integration) for the e2e-api workflow.
Root cause (verified, not just hypothesised)
Gitea act_runner is configured with `container.network: host` operator-wide. Parallel runs of `e2e-api` collide on TWO axes:
1. Container name: the workflow runs `docker rm -f molecule-ci-redis` then `docker run --name molecule-ci-redis ...`. Either the first job's still-running redis gets killed, OR the second `docker run` fails with `Conflict. The container name "/molecule-ci-redis" is already in use ...` (exit 125). VERIFIED in operator-host log `/opt/molecule/gitea/actions_log/molecule-ai/molecule-core/a7/2727.log`.
2. Host port: `-p 15432:5432` and `-p 16379:6379` are fixed; the second concurrent job's bind fails with `Address in use`.
Also confirmed: Issue #94 items #2 + #3 are independent failures that surface AT TEST TIME (provisioner needs `alpine:latest` and `molecule-monorepo-net`).
Fix
- Unique container names: `pg-e2e-api-${RUN_ID}-${RUN_ATTEMPT}`, `redis-e2e-api-${RUN_ID}-${RUN_ATTEMPT}`. Unique even across reruns of the same run_id (mirrors PR #98).
- Ephemeral host ports: `-p 0:5432` / `-p 0:6379`, `docker port` lookup, `DATABASE_URL` / `REDIS_URL` exported to `$GITHUB_ENV`. No fixed host port, so no collision.
- `127.0.0.1` (not `localhost`) in DB/cache URLs, so the IPv6 first-resolve flake (#92) stays fixed.
- `if: always()` cleanup so containers don't leak when test steps fail.
- Pre-pull `alpine:latest`: the provisioner needs it for ephemeral token-write containers (`internal/handlers/container_files.go`). Issue #94 item #2.
- `docker network create molecule-monorepo-net`: the provisioner attaches workspaces to it (`internal/provisioner/provisioner.go::DefaultNetwork`). Issue #94 item #3.
Issue #94 item #1 (timeouts) is NOT bumped: evidence on recent runs (77/3191, ae/4270, 0e/2318) shows postgres ready in 3s, redis in 1s, platform in 1s. Timeouts are not the bottleneck.
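The naming and URL scheme in the Fix list can be sketched roughly as below. This is an illustration, not the workflow's actual script: the helper names (`container_name`, `database_url`, `redis_url`), the credentials, and the database name are hypothetical placeholders, and the `docker` wiring is shown only as comments.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, credentials, and database name are hypothetical.

# Per-run container name: <prefix>-e2e-api-<run_id>-<run_attempt>.
# Including the attempt keeps names unique across reruns of the same run_id.
container_name() {
  printf '%s-e2e-api-%s-%s\n' "$1" "$2" "$3"
}

# Connection URLs pinned to 127.0.0.1 (not localhost). The argument is the
# host port Docker assigned for `-p 0:<container_port>`.
database_url() {
  # user, pass, and dbname are placeholders, not the project's real values
  printf 'postgres://user:pass@127.0.0.1:%s/dbname?sslmode=disable\n' "$1"
}

redis_url() {
  printf 'redis://127.0.0.1:%s\n' "$1"
}

# Example wiring inside a workflow step (not executed here):
#   PG_CONTAINER="$(container_name pg "$GITHUB_RUN_ID" "$GITHUB_RUN_ATTEMPT")"
#   docker run -d --name "$PG_CONTAINER" -p 0:5432 <postgres image>
#   PG_PORT="$(docker port "$PG_CONTAINER" 5432/tcp | head -n 1 | awk -F: '{print $NF}')"
#   echo "DATABASE_URL=$(database_url "$PG_PORT")" >> "$GITHUB_ENV"
```

Because the host port is only known after `docker run`, the URL has to be built after the `docker port` lookup and exported for later steps, rather than hard-coded in the workflow's `env:` block.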
Hostile self-review — 3 weakest spots
1. `docker port` parsing assumes an IPv4 line. The `awk -F: '/^0\.0\.0\.0:/ {print $2}'` filter could miss if Docker prints only `:::NNNN` (IPv6). Mitigation: the fallback `head -1 | awk -F: '{print $NF}'` handles either format. Empirically, `0.0.0.0:NNNN` is what `-p 0:5432` produces on the operator host (Linux + bridge daemon).
2. `docker pull alpine:latest` has no `|| true`. If the daemon is unreachable, the workflow fails before any other step. Counterargument: if the daemon is unreachable, NOTHING in this workflow works (every other step uses the same socket), so fail-fast is correct.
3. The `Run E2E API tests` step's `Status back online` failure (caused by `ghcr.io/molecule-ai/workspace-template-langgraph:latest` returning 403 Forbidden post-2026-05-06 GitHub suspension) is OUT OF SCOPE here. It is a template-registry resolution problem in `workspace-server`, not a workflow problem. This PR does NOT promise green tests on a fresh main; it promises parallel-safe service startup + ready provisioner setup.
Test plan
- `docker rm -f $PG_CONTAINER` cleanup runs on both success and failure paths.
Closes #94 (items #2 + #3; item #1 documented as not-bottleneck; langgraph-template-403 split out for separate follow-up).
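For the first self-review point, the IPv4-first lookup with a last-field fallback could look something like the sketch below. The function name `parse_host_port` is hypothetical; it reads `docker port <container> <port>/tcp` output on stdin so the parsing can be exercised without a Docker daemon.

```shell
#!/usr/bin/env bash
# Sketch only: parse_host_port is a hypothetical helper, not code from this PR.
# Reads `docker port <container> <port>/tcp` output on stdin, prints the host port.
parse_host_port() {
  local lines port
  lines="$(cat)"

  # Prefer the IPv4 binding, e.g. "0.0.0.0:49153" -> "49153".
  port="$(awk -F: '/^0\.0\.0\.0:/ {print $2; exit}' <<< "$lines")"

  # Fallback: take the last colon-separated field of the first line,
  # which also covers IPv6-only output such as ":::49153" or "[::]:49153".
  if [ -z "$port" ]; then
    port="$(head -n 1 <<< "$lines" | awk -F: '{print $NF}')"
  fi

  # Fail instead of exporting an empty port into DATABASE_URL / REDIS_URL.
  [ -n "$port" ] || return 1
  printf '%s\n' "$port"
}

# Example use in a workflow step (not executed here):
#   PG_PORT="$(docker port "$PG_CONTAINER" 5432/tcp | parse_host_port)"
```

Returning non-zero on empty output is a design choice of this sketch: it surfaces a missing port mapping at lookup time rather than as a confusing connection error in the test step.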
[Class B Hongming-owned CICD sweep]
E2E API parallel-safe postgres/redis containers fix. Mirrors PR #98 (Class B). Unique container names per run + ephemeral host port. Closes #94. By devops-engineer. Approved.
Ghost referenced this pull request 2026-05-08 02:16:10 +00:00