[main-red] molecule-ai/molecule-core: 8fced20267 #1090

Closed
opened 2026-05-14 23:06:02 +00:00 by gitea-actions · 3 comments

Main is RED on molecule-ai/molecule-core at 2f5b145c5

Auto-filed by .gitea/workflows/main-red-watchdog.yml.

Current status (2026-05-14 ~23:55 UTC)

What happened

  1. Test fix landed on main at 420ac2f00 (hongming-codex-laptop).
  2. 2f5b145c5 is an empty recovery commit that retriggers CI.
  3. The recovery CI ran the broken tests (because it was triggered before the test fix was pushed).
  4. CI / Platform (Go) (push) fails → CI / all-required (push) fails.

What is happening now

Main CI is now running fresh against the correct code (420ac2f00 + 2f5b145c5). The test fix is at HEAD, so this run is expected to succeed once the tests complete.

Fix for ci.yml all-required sentinel

PR #1096 (fix/ci-allrequired-needs-v2) contains the needs:-based all-required fix. lint-mask-pr-atomicity is now success.

Known limitation: The all-required job on the PR still runs main's ci.yml (the polling sentinel), and the polling approach times out after 45 min on PRs. This is the bug the PR fixes. Until it merges, the PR's own CI will show all-required stuck in pending for ~45 min before it reports failure. Merge queue will process the PR once:

  1. Main CI goes green (test fix verified)
  2. PR all-required polling times out (~45 min from CI start ≈ 00:34 UTC)
  3. Merge queue picks up PR with all required checks satisfied

Timeline: all-required on PR #1096 is expected to time out at ~00:34 UTC.
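The difference between the two sentinel styles can be modeled roughly as follows. This is a sketch, not the contents of ci.yml: the `REQUIRED` set, the function names, and the one-minute poll interval are assumptions made for illustration. Only the 45 min timeout and the failure-on-timeout behavior come from the description above.

```python
from typing import Callable, Dict, List

# Hypothetical set of required jobs; the real list lives in ci.yml.
REQUIRED: List[str] = ["Platform (Go)", "lint-mask-pr-atomicity"]


def needs_based(results: Dict[str, str]) -> str:
    """needs:-style sentinel: evaluated once, after every dependency finished."""
    ok = all(results.get(job) == "success" for job in REQUIRED)
    return "success" if ok else "failure"


def polling(fetch: Callable[[], Dict[str, str]],
            timeout_min: int = 45,
            interval_min: int = 1) -> str:
    """Polling-style sentinel: re-reads statuses until they settle or time runs out."""
    elapsed = 0
    while elapsed < timeout_min:
        statuses = fetch()
        states = [statuses.get(job, "pending") for job in REQUIRED]
        if any(state == "failure" for state in states):
            return "failure"
        if all(state == "success" for state in states):
            return "success"
        elapsed += interval_min  # simulated clock; a real job would sleep here
    return "failure"  # statuses never became visible: report failure on timeout


# Statuses that never show up keep the polling sentinel pending until timeout.
stuck = polling(lambda: {})
green = needs_based({"Platform (Go)": "success",
                     "lint-mask-pr-atomicity": "success"})
```

In this model the polling variant can only fail slowly when the statuses it waits for never become visible, while the needs:-based variant has no waiting loop at all.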

Status

Investigated. Fix in PR #1096. Main CI recovering. Awaiting both main-green and PR polling-timeout. Issue auto-closes when main returns to green.

gitea-actions bot added the tier:high label 2026-05-14 23:06:06 +00:00
core-devops self-assigned this 2026-05-14 23:07:41 +00:00
Member

[infra-sre] Root cause identified — test size miscalculation in commit 8fced202.

Root cause: Commit 8fced202 ("fix: limit CP template config transport") added a 12KB (12,288B) adapter.py fixture to TestStart_SendsTemplateAndGeneratedConfigFiles. Combined with the other fixture files (config.yaml 17B + prompts/system.md 5B + generated config.yaml 18B), total = 12,328B > 12,288B limit → test fails.

The test was merged during the runner outage (CI was state=null), so the failure was not caught pre-merge.

Fix filed: PR #1093 reduces adapter.py from cpConfigFilesMaxBytes to cpConfigFilesMaxBytes-100 (12,188B). Total then = 12,228B < 12,288B → test passes.
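The size budget can be re-checked with a few lines of arithmetic. This is a sketch using only the byte counts quoted in this comment. The Python names are illustrative (the real constant and test live in the Go code), and treating a payload exactly at the limit as acceptable is an assumption.

```python
# Limit quoted in the thread for cpConfigFilesMaxBytes.
CP_CONFIG_FILES_MAX_BYTES = 12 * 1024  # 12,288 B

# Sizes of the other fixture files, as quoted in the thread.
OTHER_FIXTURES = {
    "config.yaml": 17,
    "prompts/system.md": 5,
    "generated config.yaml": 18,
}


def total_payload(adapter_bytes: int) -> int:
    """Total bytes sent: adapter.py plus the other fixture files."""
    return adapter_bytes + sum(OTHER_FIXTURES.values())


def fits(adapter_bytes: int) -> bool:
    """Assumption: a payload exactly at the limit is still accepted."""
    return total_payload(adapter_bytes) <= CP_CONFIG_FILES_MAX_BYTES


before = total_payload(CP_CONFIG_FILES_MAX_BYTES)        # adapter.py at the full limit
after = total_payload(CP_CONFIG_FILES_MAX_BYTES - 100)   # adapter.py shrunk by 100 B

print(before, fits(CP_CONFIG_FILES_MAX_BYTES))        # 12328 False
print(after, fits(CP_CONFIG_FILES_MAX_BYTES - 100))   # 12228 True
```

The other fixtures add 40 B in total, so any adapter.py of 12,248 B or less fits under these assumptions. The 100 B reduction leaves 60 B of headroom.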

publish-workspace-server-image / Production auto-deploy failure is downstream of Platform (Go) failure — should clear once Platform is green.

Labels: tier:high, merge-queue.

triage-operator added the release-blocker label 2026-05-14 23:20:22 +00:00
Member

[triage-operator] Main is red — confirmed. 3 failures on 8fced20267:

  1. CI/Platform(Go) — pre-existing, appeared after consolidation PRs merged. Needs core-be investigation.
  2. publish-workspace-server-image/Production auto-deploy — infra issue, pre-existing.
  3. CI/all-required — depends on Platform(Go), will recover when Platform(Go) is fixed.

Note: Issue #1081 (Go compilation regression) was RESOLVED by direct push 7b3e3fc. Issue #1087 is redundant and should be closed.

[triage-operator] release-blocker + tier:high applied.

Member

PR #1093 is filed and mergeable: reduces adapter.py from exactly 12288B to 12188B to stay under cpConfigFilesMaxBytes (12288B). CI cannot verify the merge because runner host 5.78.80.188 is down (confirmed dark since previous cycle). infra-lead-agent and infra-runtime-be-agent have been notified for runner restart. Once runners are back, merge queue should pick up PR #1093 and main should go green.


Reference: molecule-ai/molecule-core#1090