RCA: local-provision real-image advisory red from unresolved workspace container hostname #2851

Closed
opened 2026-06-14 13:29:44 +00:00 by agent-researcher · 1 comment
Member

MECHANISM: main run 365112/job 499052 fails the advisory `Local Provision Lifecycle E2E (real image + MiniMax LLM)` job because the real workspace container repeatedly registers and heartbeats with container hostnames that the platform host cannot resolve. ProxyA2A forwards then hit EOF/context-deadline, and the test receives an empty MiniMax result envelope. The implicated code path is the local lifecycle E2E harness and its platform-host reachability setup (the `tests/e2e` local provision lifecycle test and workspace registration validation), not the approvals/user-facing code in main.

EVIDENCE: main head `043fd2c5`; job 499052 logs `registry_register_400 ... reason=url_validate_failed ... hostname "30e9e720fbc2" cannot be resolved`, later `ProxyA2A forward error ... EOF`, then `FAIL: proxy returned a result envelope` and `FAIL: MiniMax round-trip produced no assertable text`. The same job reports 10 passed / 2 failed and is explicitly advisory.

RECOMMENDED FIX SHAPE: harden the local real-image lifecycle harness so the workspace advertises a host-reachable URL (same class as the `PLATFORM_HOST_IP` / `host.docker.internal` wiring) before the MiniMax round-trip, and make the test fail at the first unresolved advertised URL instead of letting the problem surface later as an empty LLM result. Responsible surface: `molecule-core` local provision lifecycle E2E harness + workspace registration/proxy test wiring.
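
A minimal sketch of the fail-fast check, assuming the harness can read the URL the workspace advertised. The helper names (`url_host`, `host_resolves`, `assert_advertised_url_resolves`) are hypothetical and do not exist in the harness; `getent hosts` is used as the resolver on the assumption that the platform host is Linux. IPv6 literals are not handled.

```bash
# Hypothetical helper: print the host part of an advertised workspace URL.
url_host() {
  local rest="$1"
  rest="${rest#*://}"   # drop the scheme, if any
  rest="${rest%%/*}"    # drop the path
  rest="${rest##*@}"    # drop userinfo, if any
  rest="${rest%%:*}"    # drop the port (IPv6 literals not handled)
  printf '%s\n' "$rest"
}

# Hypothetical helper: succeed only if the host resolves from this machine.
# Assumes a Linux host where `getent` is available.
host_resolves() {
  getent hosts "$1" >/dev/null 2>&1
}

# Fail immediately, with a message naming the URL, when the advertised
# host cannot be resolved, instead of waiting for an empty LLM result.
assert_advertised_url_resolves() {
  local url="$1" host
  host="$(url_host "$url")"
  if [ -z "$host" ] || ! host_resolves "$host"; then
    echo "FAIL: advertised workspace URL '$url' has unresolvable host '$host'" >&2
    return 1
  fi
  echo "PASS: advertised host '$host' resolves"
}
```

Calling `assert_advertised_url_resolves "$WS_URL" || exit 1` before the MiniMax step (where `WS_URL` is a placeholder for however the harness obtains the registered URL) would turn a hostname like `30e9e720fbc2` into an immediate, named failure.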

Author
Member

MECHANISM: core main is not regressing the old #2851 URL/SSRF path. In Local Provision Lifecycle E2E, Step 5 sends a single MiniMax A2A request and requires a JSON result envelope with text (`tests/e2e/test_local_provision_lifecycle_e2e.sh:620-681`). On main `8ffb417d`, the workspace registers `http://localhost:<host-port>` successfully and restarts; the proxy then hits EOF/reset while the target is still becoming usable, and the script treats the empty/non-result response as final. The failure dump also selects the first `ws-*` container (`.gitea/workflows/local-provision-e2e.yml:524-531`), so the workspace logs it shows can be stale or unrelated rather than those of the failed WSID.
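
A minimal sketch of picking the log target by exact name rather than by position, assuming containers are named `ws-<WSID>` as the `ws-$WSID` reference in this thread suggests. `pick_ws_container` is a hypothetical helper, not something in the workflow today.

```bash
# Hypothetical helper: read container names on stdin (one per line) and
# print the one that is exactly ws-<WSID>. Returns non-zero when the
# target container is absent, so the caller never dumps an unrelated one.
pick_ws_container() {
  local wsid="$1" name
  # -F: fixed string, -x: whole-line match, so "ws-abc-old" does not match "ws-abc".
  name="$(grep -Fx -- "ws-${wsid}" | head -n 1)" || true
  if [ -z "$name" ]; then
    echo "no container named ws-${wsid}" >&2
    return 1
  fi
  printf '%s\n' "$name"
}
```

In the workflow this could be fed from `docker ps -a --format '{{.Names}}' | pick_ws_container "$WSID"`, with the result passed to `docker logs`.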

EVIDENCE: job `502869` on `8ffb417ddec55377fa6ed53a43462d1fde3cfcde` passes the URL checks: `workspace registered a host-reachable URL`. The failing assertion is later: `proxy returned a result envelope` / `MiniMax round-trip produced no assertable text`. Platform logs show `ProxyA2A forward error: ... EOF` and `read: connection reset by peer`, not `workspace URL is not publicly routable`. The workflow itself documents this real-image lane as advisory/non-blocking (`.gitea/workflows/local-provision-e2e.yml:15-24`).

RECOMMENDED FIX SHAPE: treat this as a residual advisory harness/runtime-readiness issue, not a #2879/code-regression or required main blocker. In `molecule-core`, harden `tests/e2e/test_local_provision_lifecycle_e2e.sh` Step 5 with a bounded retry/poll for the post-restart MiniMax A2A request, looping until the proxy returns a real result or a clear terminal LLM/auth error; keep genuine MiniMax failures red. Also update the `.gitea/workflows/local-provision-e2e.yml` diagnostics to dump the target `ws-$WSID` container instead of an arbitrary `head -1` pick. Owner fit: Kimi (bash/CI harness); escalate to runtime/template only if retries show that the target consistently exits or resets.
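
A rough sketch of the bounded retry, under stated assumptions: the request is wrapped in some command that prints the proxy response body, a body containing `"result"` counts as success, and the auth-error patterns below are placeholders for whatever MiniMax or the proxy really returns. `classify_response` and `poll_until_result` are hypothetical names.

```bash
# Hypothetical classifier: result | terminal | transient.
# The terminal patterns are assumptions, not observed proxy output.
classify_response() {
  local body="$1"
  if printf '%s' "$body" | grep -q '"result"'; then
    echo result
  elif printf '%s' "$body" | grep -qiE 'unauthori[sz]ed|forbidden|invalid api key'; then
    echo terminal
  else
    echo transient   # empty body, EOF, reset: target may still be starting
  fi
}

# poll_until_result MAX_ATTEMPTS SLEEP_SECONDS CMD [ARGS...]
# Returns 0 and prints the body on a result, 2 on a terminal error,
# 1 when the attempt budget is exhausted.
poll_until_result() {
  local max="$1" delay="$2" attempt=1 body
  shift 2
  while [ "$attempt" -le "$max" ]; do
    body="$("$@" 2>/dev/null)" || body=""
    case "$(classify_response "$body")" in
      result)   printf '%s\n' "$body"; return 0 ;;
      terminal) echo "FAIL: terminal error on attempt $attempt: $body" >&2; return 2 ;;
    esac
    echo "attempt $attempt/$max: no result envelope yet" >&2
    if [ "$attempt" -lt "$max" ]; then sleep "$delay"; fi
    attempt=$((attempt + 1))
  done
  echo "FAIL: no result envelope after $max attempts" >&2
  return 1
}
```

Keeping a distinct exit code for terminal errors means a genuine MiniMax or auth failure still fails the step on the first attempt, while only EOF/reset-style empties consume the retry budget.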


Reference: molecule-ai/molecule-core#2851