[main-red] molecule-ai/molecule-core: b3241aecf5 #2450

Closed
opened 2026-06-08 23:07:49 +00:00 by gitea-actions · 1 comment

Main is RED on molecule-ai/molecule-core at b3241aecf5

Commit: https://git.moleculesai.app/molecule-ai/molecule-core/commit/b3241aecf57545e493cf3ae999b431bdf0e22d78

Auto-filed by .gitea/workflows/main-red-watchdog.yml (Option C of the main-never-red directive, https://git.moleculesai.app/molecule-ai/molecule-core/issues/420). Per feedback_no_such_thing_as_flakes + feedback_fix_root_not_symptom: investigate the root cause; do NOT revert as a reflex. The watchdog itself never reverts.

Failed status contexts

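  • Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292964/jobs/390982)
    • Failing after 1m31s
  • E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292959/jobs/390971)
    • Failing after 2m40s
  • E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292959/jobs/390974)
    • Failing after 2m55s
  • Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292964/jobs/390983)
    • Failing after 31s
  • E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292959/jobs/390970)
    • Failing after 5m9s
  • E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292959/jobs/390972)
    • Failing after 5m42s
  • E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292959/jobs/390969)
    • Failing after 7m5s
  • E2E Chat / E2E Chat (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292957/jobs/390965)
    • Failing after 7m19s
  • publish-workspace-server-image / Production auto-deploy (push): failure (logs: /molecule-ai/molecule-core/actions/runs/292965/jobs/390985)
    • Failing after 4m50s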

Resolution path

  1. Read the failed logs (links above).
  2. If reproducible locally, fix forward in a PR targeting main.
  3. If the failure is a real flake — STOP. Per feedback_no_such_thing_as_flakes, intermittent failures are real bugs. Investigate to root cause; do not mark as flake.
  4. If the failure is blocking unrelated work for >1 hour, file a follow-up issue and assign someone. Do NOT revert without a human GO per feedback_prod_apply_needs_hongming_chat_go (branch protection is a prod surface).

Debug

{
  "all_contexts": [
    {
      "context": "E2E Chat / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "E2E Staging Canvas (Playwright) / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "E2E Staging SaaS (full lifecycle) / pr-validate (push)",
      "state": "success"
    },
    {
      "context": "Handlers Postgres Integration / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "CI / Python Lint & Test (push)",
      "state": "success"
    },
    {
      "context": "E2E API Smoke Test / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge (compile+skip) (push)",
      "state": "success"
    },
    {
      "context": "Harness Replays / detect-changes (push)",
      "state": "success"
    },
    {
      "context": "Secret scan / Scan diff for credential-shaped strings (push)",
      "state": "success"
    },
    {
      "context": "CI / Canvas (Next.js) (push)",
      "state": "success"
    },
    {
      "context": "CI / Shellcheck (E2E scripts) (push)",
      "state": "success"
    },
    {
      "context": "Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (push)",
      "state": "success"
    },
    {
      "context": "Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (push)",
      "state": "success"
    },
    {
      "context": "E2E Staging Canvas (Playwright) / Canvas tabs E2E (push)",
      "state": "success"
    },
    {
      "context": "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push)",
      "state": "failure"
    },
    {
      "context": "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push)",
      "state": "failure"
    },
    {
      "context": "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push)",
      "state": "failure"
    },
    {
      "context": "Harness Replays / Harness Replays (push)",
      "state": "success"
    },
    {
      "context": "CI / Canvas Deploy Status (push)",
      "state": "success"
    },
    {
      "context": "publish-workspace-server-image / build-and-push (push)",
      "state": "success"
    },
    {
      "context": "Handlers Postgres Integration / Handlers Postgres Integration (push)",
      "state": "success"
    },
    {
      "context": "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push)",
      "state": "failure"
    },
    {
      "context": "CI / Platform (Go) (push)",
      "state": "success"
    },
    {
      "context": "CI / all-required (push)",
      "state": "success"
    },
    {
      "context": "E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push)",
      "state": "failure"
    },
    {
      "context": "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push)",
      "state": "failure"
    },
    {
      "context": "E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push)",
      "state": "failure"
    },
    {
      "context": "E2E Chat / E2E Chat (push)",
      "state": "failure"
    },
    {
      "context": "publish-workspace-server-image / Production auto-deploy (push)",
      "state": "failure"
    },
    {
      "context": "E2E API Smoke Test / E2E API Smoke Test (push)",
      "state": "success"
    }
  ],
  "branch": "main",
  "combined_state": "failure",
  "failed_contexts": [
    "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push)",
    "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push)",
    "E2E Chat / E2E Chat (push)",
    "publish-workspace-server-image / Production auto-deploy (push)"
  ],
  "recheck_combined_state": "failure",
  "recheck_failed_contexts": [
    "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge user_tasks (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Platform Agent (push)",
    "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Concierge Creates Workspace (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push)",
    "E2E Chat / E2E Chat (push)",
    "publish-workspace-server-image / Production auto-deploy (push)"
  ],
  "sha": "b3241aecf57545e493cf3ae999b431bdf0e22d78"
}
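For triage, a payload shaped like the Debug block above can be summarized per workflow. The following is a minimal sketch, assuming the `failed_contexts` and `recheck_failed_contexts` keys shown above and a `workflow / job` naming convention; the helper names `group_by_workflow` and `persisted_on_recheck` are hypothetical and not part of the watchdog, and `SAMPLE` is a shortened excerpt of the payload.

```python
import json
from collections import defaultdict

# Shortened excerpt of the Debug payload above (not the full list).
SAMPLE = """
{
  "combined_state": "failure",
  "failed_contexts": [
    "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push)",
    "E2E Chat / E2E Chat (push)"
  ],
  "recheck_failed_contexts": [
    "Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging Platform Boot (push)",
    "E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (push)",
    "E2E Chat / E2E Chat (push)"
  ]
}
"""


def group_by_workflow(contexts):
    """Group status contexts by the workflow name before the first ' / '."""
    groups = defaultdict(list)
    for context in contexts:
        workflow, _, job = context.partition(" / ")
        groups[workflow].append(job)
    return dict(groups)


def persisted_on_recheck(payload):
    """True if the recheck reported exactly the same failed contexts."""
    return set(payload["failed_contexts"]) == set(payload["recheck_failed_contexts"])


payload = json.loads(SAMPLE)
failed = group_by_workflow(payload["failed_contexts"])
for workflow, jobs in failed.items():
    print(f"{workflow}: {len(jobs)} failed job(s)")
print("same failures on recheck:", persisted_on_recheck(payload))
```

Grouping this way makes it easier to see whether several red contexts belong to one workflow run, which is often a hint that they share a cause.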

This issue is idempotent: the watchdog runs hourly at :05 and edits this body in place. When main returns to green, the watchdog will close this issue automatically with a "main returned to green" comment.
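The idempotent behaviour described above can be pictured as a small decision table. This is only an illustrative sketch, not the code in `main-red-watchdog.yml`: the function name `watchdog_action`, the action labels, and the choice to do nothing for states other than `failure` and `success` are all assumptions.

```python
def watchdog_action(combined_state: str, has_open_issue: bool) -> str:
    """Pick one action per hourly run so that repeated runs converge.

    Hypothetical action labels:
      create        - main is red and no issue exists yet
      update_body   - main is red and the issue exists; edit it in place
      close_comment - main is green again; close with a comment
      noop          - nothing to do
    """
    if combined_state == "failure":
        return "update_body" if has_open_issue else "create"
    if combined_state == "success":
        return "close_comment" if has_open_issue else "noop"
    # Assumption: any other state (for example a still-running check)
    # is left alone until the next hourly run.
    return "noop"


for state, has_issue in [("failure", False), ("failure", True),
                         ("success", True), ("success", False)]:
    print(state, has_issue, "->", watchdog_action(state, has_issue))
```

Because the action depends only on the current state and on whether an issue is already open, running the check twice in a row never files a second issue.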


Initial triage of the failed jobs on this head:

  1. Local Provision Lifecycle E2E (stub): root cause is listen tcp 127.0.0.1:8080: bind: address already in use (runner port conflict). This suggests that a prior job's platform-server process is still holding port 8080 on the shared runner, or that cleanup is incomplete.
  2. E2E Chat: root cause is Playwright locator timeouts (locator.click: Test timeout of 30000ms exceeded). Seven tests failed; this appears to be a UI rendering / performance regression rather than an infra port issue.
  3. Staging SaaS: these failures likely share the same port-conflict or service-startup root cause as the local provision jobs.

Recommendation: the port-conflict class needs runner-level cleanup (or dynamic port allocation in the test harness). The Playwright timeout class needs a frontend/perf investigation. Per feedback_fix_root_not_symptom, these should be fixed forward, not reverted.
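As an illustration of the dynamic-port option mentioned above, a test harness can ask the operating system for a free port instead of hard-coding 8080. This is a minimal sketch using only the Python standard library; `find_free_port` and `port_is_in_use` are hypothetical helper names, and nothing here is taken from the actual harness.

```python
import errno
import socket


def find_free_port(host: str = "127.0.0.1") -> int:
    """Ask the OS for an unused TCP port by binding to port 0."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


def port_is_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if binding host:port fails with 'address already in use'."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError as exc:
            if exc.errno == errno.EADDRINUSE:
                return True
            raise  # Any other bind error is unexpected; surface it.
        return False


port = find_free_port()
print(f"harness could start the server on 127.0.0.1:{port}")
```

There is a short window between `find_free_port` returning and the server binding in which another process could take the port. Where the server supports it, letting the server itself bind to port 0 and report the chosen port avoids that window. A pre-flight `port_is_in_use(8080)` check would also turn a leftover process into an explicit, early failure message.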

I am continuing to monitor my own open PRs (#2438, #2426, #2451), which are queued behind this backlog.


Reference: molecule-ai/molecule-core#2450