fix(ci): force MOLECULE_IN_DOCKER=false in local-provision E2E (#2473) #2476

Closed
agent-dev-a wants to merge 3 commits from fix/main-red-e2e-act-runner-docker-detect into main
Member

act_runner executes the job inside a Docker container, so /.dockerenv exists and the platform auto-detects platformInDocker=true. The proxy then rewrites workspace URLs from http://127.0.0.1:<port> to http://ws-<shortid>:8000 (the Docker-internal form).

But the act_runner job container is NOT attached to molecule-core-net, so net.LookupHost('ws-...') falls through to the host's systemd-resolved (127.0.0.53), which cannot resolve Docker bridge hostnames. The proxy returns 502 'workspace URL is not publicly routable' and the E2E fails.
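
A minimal sketch of the two URL forms described above. The proxy code itself is not part of this PR, so `workspaceURL` and its parameters are hypothetical names chosen for illustration, and the short ID and port in `main` are made-up sample values.

```go
package main

import "fmt"

// workspaceURL is a hypothetical helper that mirrors the behaviour described
// in the PR text: it picks the address the proxy would dial for a workspace.
func workspaceURL(platformInDocker bool, shortID string, hostPort int) string {
	if platformInDocker {
		// Docker-internal form. The name ws-<shortid> only resolves from
		// containers attached to the same network (molecule-core-net).
		return fmt.Sprintf("http://ws-%s:8000", shortID)
	}
	// Host-mapped form, using the ephemeral port published on 127.0.0.1.
	return fmt.Sprintf("http://127.0.0.1:%d", hostPort)
}

func main() {
	// Sample values only; real IDs and ports come from the provisioner.
	fmt.Println(workspaceURL(true, "abc123", 49153))
	fmt.Println(workspaceURL(false, "abc123", 49153))
}
```

With the first form, the job container has to resolve `ws-abc123` through DNS, which is the step that fails when it is not on molecule-core-net.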

Force MOLECULE_IN_DOCKER=false so the platform treats itself as host-native and keeps using the host-mapped 127.0.0.1:<ephemeral_port> URL, which IS reachable from the job container.
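
The override can be pictured roughly as follows. This is a sketch under assumptions: the PR only states that /.dockerenv triggers auto-detection and that MOLECULE_IN_DOCKER=false forces host-native mode. The function name `detectInDocker` and the use of `strconv.ParseBool` to read the variable are illustrative guesses, not the platform's actual code.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// detectInDocker is a hypothetical stand-in for the platform's detection.
// An explicit, parseable override wins; otherwise the presence of
// /.dockerenv decides.
func detectInDocker(override string, dockerenvExists bool) bool {
	if v, err := strconv.ParseBool(override); err == nil {
		return v // MOLECULE_IN_DOCKER was set to a recognisable boolean
	}
	// Unset or unparseable: fall back to auto-detection.
	return dockerenvExists
}

func main() {
	_, err := os.Stat("/.dockerenv")
	inDocker := detectInDocker(os.Getenv("MOLECULE_IN_DOCKER"), err == nil)
	fmt.Println("platformInDocker =", inDocker)
}
```

Under this reading, an act_runner job that exports MOLECULE_IN_DOCKER=false gets `false` even though /.dockerenv exists in the job container.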

Fixes #2473

agent-dev-a force-pushed fix/main-red-e2e-act-runner-docker-detect from 2e8a6f026c to bcf7135825 2026-06-09 06:12:29 +00:00 Compare
agent-dev-a force-pushed fix/main-red-e2e-act-runner-docker-detect from bcf7135825 to 4bd4fafb34 2026-06-09 06:47:34 +00:00 Compare
agent-dev-a force-pushed fix/main-red-e2e-act-runner-docker-detect from 4bd4fafb34 to 316d2dbd6d 2026-06-09 07:09:30 +00:00 Compare
agent-dev-a force-pushed fix/main-red-e2e-act-runner-docker-detect from 316d2dbd6d to b94c7eb760 2026-06-09 07:10:34 +00:00 Compare
agent-dev-a added 1 commit 2026-06-09 07:13:38 +00:00
fix(ci): force MOLECULE_IN_DOCKER=false + discover PLATFORM_URL in local-provision E2E (#2473)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 8s
CI / Detect changes (pull_request) Successful in 13s
CI / Platform (Go) (pull_request) Successful in 4s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
CI / Canvas (Next.js) (pull_request) Successful in 4s
E2E API Smoke Test / detect-changes (pull_request) Successful in 16s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 5s
E2E Chat / detect-changes (pull_request) Successful in 13s
CI / Canvas Deploy Status (pull_request) Successful in 2s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 6s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 20s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 9s
CI / all-required (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 8s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 17s
sop-checklist / review-refire (pull_request_target) Has been skipped
E2E Chat / E2E Chat (pull_request) Successful in 36s
qa-review / approved (pull_request_target) Failing after 11s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
gate-check-v3 / gate-check (pull_request_target) Successful in 18s
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 11s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 21s
sop-checklist / all-items-acked (pull_request_target) Successful in 10s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m5s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m35s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m42s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m58s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 2m14s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 3m53s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Failing after 5m25s
697bb3e594
act_runner executes the job inside a Docker container, so /.dockerenv
exists and the platform auto-detects platformInDocker=true. The proxy then
rewrites workspace URLs from http://127.0.0.1:<port> to
http://ws-<shortid>:8000 (the Docker-internal form).

But the act_runner job container is NOT attached to molecule-core-net,
so net.LookupHost('ws-...') falls through to the host's systemd-resolved
(127.0.0.53), which cannot resolve Docker bridge hostnames. The proxy
returns 502 'workspace URL is not publicly routable' and the E2E fails.

Force MOLECULE_IN_DOCKER=false so the proxy treats itself as host-native
and keeps using the host-mapped 127.0.0.1:<ephemeral_port> URL, which IS
reachable from the job container.

Additionally, host.docker.internal is NOT reliably available on Linux
Docker (the act_runner environment), so workspace containers could not
resolve the platform URL to register/heartbeat. This left the workspace
stuck in 'provisioning' even though the container was running.

Discover the molecule-core-net gateway IP and explicitly set PLATFORM_URL
so workspace containers can reach the platform for registration.

Fixes applied:
- Create molecule-core-net explicitly before inspecting it; the provisioner
  lazily creates it on first workspace boot, but we need the gateway IP
  BEFORE starting the platform.
- Pass PLATFORM_URL explicitly on the platform-server command line because
  GITHUB_ENV propagation is flaky on act_runner (#2468 RCA).
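
A rough sketch of the gateway discovery, assuming the workflow reads the JSON
printed by `docker network inspect molecule-core-net`. The helper names
(gatewayFromInspect, platformURL), the sample subnet, and port 8080 are
illustrative assumptions; the workflow's actual commands and the platform's
real port are not shown in this PR.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"net"
	"strconv"
)

// network models only the fields of `docker network inspect` output
// that are needed to find the gateway.
type network struct {
	IPAM struct {
		Config []struct {
			Gateway string `json:"Gateway"`
		} `json:"Config"`
	} `json:"IPAM"`
}

// gatewayFromInspect returns the first non-empty gateway found in the
// JSON array that `docker network inspect` prints.
func gatewayFromInspect(raw []byte) (string, error) {
	var nets []network
	if err := json.Unmarshal(raw, &nets); err != nil {
		return "", err
	}
	for _, n := range nets {
		for _, c := range n.IPAM.Config {
			if c.Gateway != "" {
				return c.Gateway, nil
			}
		}
	}
	return "", errors.New("no gateway in inspect output")
}

// platformURL builds the value that would be passed as PLATFORM_URL.
func platformURL(gateway string, port int) string {
	return "http://" + net.JoinHostPort(gateway, strconv.Itoa(port))
}

func main() {
	// Sample inspect output; the subnet and gateway are made up.
	sample := []byte(`[{"Name":"molecule-core-net","IPAM":{"Config":[{"Subnet":"172.18.0.0/16","Gateway":"172.18.0.1"}]}}]`)
	gw, err := gatewayFromInspect(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(platformURL(gw, 8080)) // 8080 is an assumed port
}
```

The network has to exist before this runs, which is why the first fix creates
it explicitly instead of waiting for the provisioner.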

Fixes #2473
agent-dev-a force-pushed fix/main-red-e2e-act-runner-docker-detect from b94c7eb760 to 697bb3e594 2026-06-09 07:13:38 +00:00 Compare
agent-dev-a added 1 commit 2026-06-09 08:30:31 +00:00
fix(ci): bind platform to 0.0.0.0 + pass PLATFORM_URL in local-provision E2E (#2473 follow-up)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 4s
CI / Python Lint & Test (pull_request) Successful in 3s
CI / Detect changes (pull_request) Successful in 6s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 5s
CI / Platform (Go) (pull_request) Successful in 2s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 2s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
E2E Chat / detect-changes (pull_request) Successful in 12s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 5s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
sop-checklist / review-refire (pull_request_target) Has been skipped
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 10s
gate-check-v3 / gate-check (pull_request_target) Successful in 8s
qa-review / approved (pull_request_target) Failing after 6s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request_target) Successful in 7s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 14s
CI / Canvas (Next.js) (pull_request) Successful in 26s
security-review / approved (pull_request_target) Failing after 12s
CI / Canvas Deploy Status (pull_request) Successful in 3s
CI / all-required (pull_request) Successful in 9s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 54s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m3s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m6s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m19s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m17s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Failing after 4m11s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Waiting to run
abb3711800
The prior commit added MOLECULE_IN_DOCKER=false and PLATFORM_URL discovery,
but the platform still binds to 127.0.0.1 in dev mode (resolveBindHost).
Workspace containers on molecule-core-net cannot reach 127.0.0.1 inside the
act_runner container, so registration/heartbeat fails and the workspace stays
stuck in 'provisioning'.

Fixes applied:
- Add BIND_ADDR=0.0.0.0 to both platform-server start commands so the
  platform listens on all interfaces and is reachable from molecule-core-net.
- Also pass PLATFORM_URL explicitly in the real-image job (was missing;
  only the stub job had it).
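
For illustration, the bind-address choice might look like the sketch below.
Only two facts come from this commit: dev mode binds to 127.0.0.1, and
BIND_ADDR=0.0.0.0 overrides that. The function name bindHost, its signature,
and the non-dev default are assumptions; the real resolveBindHost is not
shown here.

```go
package main

import (
	"fmt"
	"os"
)

// bindHost is a hypothetical stand-in for resolveBindHost.
func bindHost(bindAddr string, devMode bool) string {
	if bindAddr != "" {
		// An explicit BIND_ADDR always wins.
		return bindAddr
	}
	if devMode {
		// Loopback only: not reachable from containers on molecule-core-net.
		return "127.0.0.1"
	}
	// Assumed default outside dev mode; the source does not state it.
	return "0.0.0.0"
}

func main() {
	// Dev mode is assumed for this demonstration.
	fmt.Println(bindHost(os.Getenv("BIND_ADDR"), true))
}
```

A listener on 127.0.0.1 accepts only connections that originate in the same
network namespace, so a workspace container dialing the gateway IP is refused
until the platform listens on all interfaces.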

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
agent-dev-a added 1 commit 2026-06-09 09:05:32 +00:00
debug(ci): add PLATFORM_URL echo + network reachability test + container logs (#2473)
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 10s
CI / Python Lint & Test (pull_request) Successful in 4s
CI / Detect changes (pull_request) Successful in 13s
E2E API Smoke Test / detect-changes (pull_request) Successful in 11s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 11s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 5s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Has started running
CI / Platform (Go) (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 4s
E2E Chat / detect-changes (pull_request) Successful in 24s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 21s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 6s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 3s
CI / Canvas Deploy Status (pull_request) Successful in 1s
E2E Chat / E2E Chat (pull_request) Successful in 5s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 5s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 11s
CI / all-required (pull_request) Successful in 5s
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 16s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
sop-checklist / review-refire (pull_request_target) Has been skipped
qa-review / approved (pull_request_target) Failing after 10s
gate-check-v3 / gate-check (pull_request_target) Successful in 15s
sop-checklist / all-items-acked (pull_request) acked: 0/7 — missing: comprehensive-testing, local-postgres-e2e, staging-smoke, +4 — body-unfilled: comprehensive-testing, local-postgres-e2
sop-checklist / na-declarations (pull_request) N/A: (none)
security-review / approved (pull_request_target) Failing after 22s
sop-checklist / all-items-acked (pull_request_target) Successful in 19s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Successful in 45s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Successful in 1m22s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m10s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m23s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m29s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m20s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Successful in 38s
audit-force-merge / audit (pull_request_target) Has been skipped
cad33e23c9
Adds debug output to find out why workspace containers still cannot reach
the platform for heartbeat/registration after the BIND_ADDR=0.0.0.0 fix.

- Echo PLATFORM_URL and PLATFORM_HOST_IP during setup
- Add alpine wget reachability test from molecule-core-net before E2E
- Dump workspace container logs on failure to see stub runtime errors

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
agent-dev-a closed this pull request 2026-06-09 10:12:35 +00:00
Author
Member

Superseded by #2478, which includes these fixes plus the localhost→127.0.0.1 hardening.


Pull request closed


Reference: molecule-ai/molecule-core#2476