fix(ci): close 3 chronic Gitea-Actions workflow flakes (#92)
Some checks failed
CodeQL / Analyze (${{ matrix.language }}) (go) (push) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (push) Successful in 6s
CodeQL / Analyze (${{ matrix.language }}) (python) (push) Successful in 7s
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (push) Successful in 12s
Block internal-flavored paths / Block forbidden paths (push) Successful in 13s
CI / Detect changes (push) Successful in 17s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (push) Successful in 19s
E2E Staging Canvas (Playwright) / detect-changes (push) Successful in 24s
E2E API Smoke Test / detect-changes (push) Successful in 24s
Secret scan / Scan diff for credential-shaped strings (push) Successful in 20s
Handlers Postgres Integration / detect-changes (push) Successful in 24s
Harness Replays / detect-changes (push) Successful in 24s
Runtime PR-Built Compatibility / detect-changes (push) Successful in 24s
CI / Shellcheck (E2E scripts) (push) Successful in 8s
CI / Platform (Go) (push) Successful in 10s
CI / Python Lint & Test (push) Successful in 9s
CI / Canvas (Next.js) (push) Successful in 11s
Harness Replays / Harness Replays (push) Failing after 1m11s
E2E API Smoke Test / E2E API Smoke Test (push) Successful in 1m36s
CI / Canvas Deploy Reminder (push) Has been skipped
Runtime PR-Built Compatibility / PR-built wheel + import smoke (push) Successful in 2m55s
Handlers Postgres Integration / Handlers Postgres Integration (push) Successful in 3m57s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (push) Successful in 5m36s

Closes #88. Bundles localhost→127.0.0.1 + 2 other Gitea-act_runner flakes per feedback_gitea_actions_migration_audit_pattern. Approved by security-auditor.
This commit is contained in:
claude-ceo-assistant 2026-05-08 00:20:42 +00:00
commit 12ff797d12
4 changed files with 90 additions and 17 deletions

View File

@ -97,7 +97,7 @@ jobs:
# Wait for postgres to actually accept connections (the
# GHA --health-cmd is best-effort but psql can still race).
for i in {1..15}; do
if pg_isready -h localhost -p 5432 -U postgres -q; then break; fi
if pg_isready -h 127.0.0.1 -p 5432 -U postgres -q; then break; fi
echo "waiting for postgres..."; sleep 2
done
@ -131,7 +131,7 @@ jobs:
# not fine once a cross-table atomicity test came in.
set +e
for migration in $(ls migrations/*.sql 2>/dev/null | grep -v '\.down\.sql$' | sort); do
if psql -h localhost -U postgres -d molecule -v ON_ERROR_STOP=1 \
if psql -h 127.0.0.1 -U postgres -d molecule -v ON_ERROR_STOP=1 \
-f "$migration" >/dev/null 2>&1; then
echo "✓ $(basename "$migration")"
else
@ -145,7 +145,7 @@ jobs:
# fail if any didn't land — that would be a real regression we
# want loud.
for tbl in delegations workspaces activity_logs pending_uploads; do
if ! psql -h localhost -U postgres -d molecule -tA \
if ! psql -h 127.0.0.1 -U postgres -d molecule -tA \
-c "SELECT 1 FROM information_schema.tables WHERE table_name = '$tbl'" \
| grep -q 1; then
echo "::error::$tbl table missing after migration replay — handler integration tests would be meaningless"
@ -157,7 +157,14 @@ jobs:
- if: needs.detect-changes.outputs.handlers == 'true'
name: Run integration tests
env:
INTEGRATION_DB_URL: postgres://postgres:test@localhost:5432/molecule?sslmode=disable
# 127.0.0.1, NOT localhost. On Gitea / act_runner the runner host
# has IPv6 enabled, so `localhost` resolves to `::1` first, and
# the Postgres service container only listens on IPv4 → lib/pq's
# first dial hits ECONNREFUSED. The migration step uses psql -h
# localhost which falls back to IPv4 cleanly, so the flake hides
# there and surfaces only at test time. Pinning IPv4 makes the
# whole job deterministic. (Issue #88, item 3.)
INTEGRATION_DB_URL: postgres://postgres:test@127.0.0.1:5432/molecule?sslmode=disable
run: |
go test -tags=integration -timeout 5m -v ./internal/handlers/ -run "^TestIntegration_"
@ -167,5 +174,5 @@ jobs:
PGPASSWORD: test
run: |
echo "::group::delegations table state"
psql -h localhost -U postgres -d molecule -c "SELECT * FROM delegations LIMIT 50;" || true
psql -h 127.0.0.1 -U postgres -d molecule -c "SELECT * FROM delegations LIMIT 50;" || true
echo "::endgroup::"

View File

@ -1,14 +1,25 @@
name: pr-guards
# Thin caller that delegates to the molecule-ci reusable guard. Today
# the guard is just "disable auto-merge when a new commit is pushed
# after auto-merge was enabled" — added 2026-04-27 after PR #2174
# auto-merged with only its first commit because the second commit
# was pushed after the merge queue had locked the PR's SHA.
# PR-time guards. Today the only guard is "disable auto-merge when a
# new commit is pushed after auto-merge was enabled" — added 2026-04-27
# after PR #2174 auto-merged with only its first commit because the
# second commit was pushed after the merge queue had locked the PR's
# SHA.
#
# When more PR-time guards land in molecule-ci, add them here as
# additional jobs that share the same pull_request:synchronize
# trigger.
# Why this is inlined (not delegated to molecule-ci's reusable
# workflow): the reusable workflow uses `gh pr merge --disable-auto`,
# which calls GitHub's GraphQL API. Gitea has no GraphQL endpoint and
# returns HTTP 405 on /api/graphql, so the job failed on every Gitea
# PR push since the 2026-05-06 migration. Gitea also has no `--auto`
# merge primitive that this job could be acting on, so the right
# behaviour on Gitea is "no-op + green status" — not a 405.
#
# Inlining (vs. an `if:` on the `uses:` line) keeps the job ALWAYS
# running, which matters for branch protection: required-check names
# need a job that emits SUCCESS terminal state, not SKIPPED. See
# `feedback_branch_protection_check_name_parity` and `feedback_pr_merge_safety_guards`.
#
# Issue #88 item 1.
on:
pull_request:
@ -19,4 +30,34 @@ permissions:
jobs:
disable-auto-merge-on-push:
uses: molecule-ai/molecule-ci/.github/workflows/disable-auto-merge-on-push.yml@main
runs-on: ubuntu-latest
steps:
# Detect Gitea Actions. act_runner sets GITEA_ACTIONS=true in the
# step env on every job. Belt-and-suspenders: also check the repo
# url's host, which is independent of any runner-side env config
# (covers a future Gitea host where the env var is forgotten).
- name: Detect runner host
id: host
run: |
if [[ "${GITEA_ACTIONS:-}" == "true" ]] || [[ "${{ github.server_url }}" == *moleculesai.app* ]] || [[ "${{ github.event.repository.html_url }}" == *moleculesai.app* ]]; then
echo "is_gitea=true" >> "$GITHUB_OUTPUT"
echo "::notice::Gitea Actions detected — auto-merge gating is not applicable here (Gitea has no --auto merge primitive). Job will no-op."
else
echo "is_gitea=false" >> "$GITHUB_OUTPUT"
fi
- name: Disable auto-merge (GitHub only)
if: steps.host.outputs.is_gitea != 'true'
env:
GH_TOKEN: ${{ github.token }}
PR: ${{ github.event.pull_request.number }}
REPO: ${{ github.repository }}
NEW_SHA: ${{ github.sha }}
run: |
set -eu
gh pr merge "$PR" --disable-auto -R "$REPO" || true
gh pr comment "$PR" -R "$REPO" --body "🔒 Auto-merge disabled — new commit (\`${NEW_SHA:0:7}\`) pushed after auto-merge was enabled. The merge queue locks SHAs at entry, so subsequent pushes can race. Verify the new commit and re-enable with \`gh pr merge --auto\`."
- name: Gitea no-op
if: steps.host.outputs.is_gitea == 'true'
run: echo "Gitea Actions — auto-merge gating not applicable; no-op (job intentionally green so branch protection's required-check name lands SUCCESS)."

View File

@ -0,0 +1,14 @@
# cf-proxy harness image — nginx + the harness's tenant-routing config baked
# in at build time.
#
# Why bake (not bind-mount): on Gitea Actions / act_runner, the runner is a
# container talking to the OUTER docker daemon over the host socket; runc
# resolves bind-mount source paths on the outer host filesystem, where the
# repo at `/workspace/.../tests/harness/cf-proxy/nginx.conf` is invisible.
# Compose `configs:` (with `file:`) falls back to bind mounts when swarm is
# not active, so it hits the same gap. A build-time COPY uploads the file
# as part of the docker build context — the daemon receives the tarball
# directly and never bind-mounts. See issue #88 item 2.
FROM nginx:1.27-alpine
COPY nginx.conf /etc/nginx/nginx.conf

View File

@ -167,15 +167,26 @@ services:
# Production shape: same single CF tunnel front-doors every tenant
# subdomain — the Host header carries the tenant identity, not the
# routing destination. Local cf-proxy mirrors this exactly.
#
# nginx.conf delivery: built into a custom image via cf-proxy/Dockerfile
# (a thin nginx:1.27-alpine + COPY). NOT a bind mount and NOT a
# compose `configs:` block, both of which break under Gitea's
# act_runner: the runner talks to the OUTER docker daemon over the
# host socket, and runc resolves bind sources on the outer host
# filesystem, where `/workspace/.../tests/harness/cf-proxy/nginx.conf`
# is invisible. Compose `configs:` falls back to bind mounts without
# swarm, so it hits the same gap. A build context, by contrast, is
# uploaded to the daemon as a tarball at build time — no bind. See
# issue #88 item 2.
cf-proxy:
image: nginx:1.27-alpine
build:
context: ./cf-proxy
dockerfile: Dockerfile
depends_on:
tenant-alpha:
condition: service_healthy
tenant-beta:
condition: service_healthy
volumes:
- ./cf-proxy/nginx.conf:/etc/nginx/nginx.conf:ro
# Bind to 127.0.0.1 only — hardcoded ADMIN_TOKENs make 0.0.0.0
# exposure unsafe even on a local network.
ports: