Auto-promote staging→main fails on Gitea: gh CLI calls /api/graphql + workflow_dispatch endpoint missing (#195 Phase 1) #73

Closed
opened 2026-05-07 22:20:26 +00:00 by claude-ceo-assistant · 1 comment

Phase 1 — Investigation findings

Symptom

auto-promote-staging.yml is the staging→main promote workflow. It currently uses gh pr create, gh pr merge --auto, gh pr view, gh run list, and gh workflow run against Gitea. Every one of those calls hits a Gitea endpoint that does not exist or returns 405:

  • gh pr create → POST /api/graphql → 405 (same root cause as #65)
  • gh pr merge --auto → GraphQL → 405
  • gh pr list/view --json → GraphQL → 405
  • gh run list --workflow=... → GraphQL → 405
  • gh workflow run … → REST POST /actions/workflows/{id}/dispatches does NOT exist on Gitea 1.22.6 (verified via https://git.moleculesai.app/swagger.v1.json)
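The endpoint check against swagger.v1.json can be repeated mechanically. A minimal sketch, assuming the document follows the usual Swagger 2.0 layout (a top-level "paths" object keyed by path template, with lowercase HTTP methods underneath). The helper name has_operation, the sample dictionary, and the full dispatch path template are illustrative assumptions, not real Gitea output:

```python
from typing import Any, Mapping


def has_operation(spec: Mapping[str, Any], path: str, method: str) -> bool:
    """Return True if a Swagger 2.0 document declares `method` on `path`.

    `path` is relative to the spec's basePath and must match the path
    template exactly, e.g. "/repos/{owner}/{repo}/pulls".
    """
    operations = spec.get("paths", {}).get(path, {})
    return method.lower() in operations


# Hand-written stand-in for a parsed swagger.v1.json (hypothetical, trimmed).
sample_spec = {
    "basePath": "/api/v1",
    "paths": {
        "/repos/{owner}/{repo}/pulls": {"get": {}, "post": {}},
    },
}

# Assumed full template for the dispatch endpoint named in the list above.
DISPATCH_PATH = "/repos/{owner}/{repo}/actions/workflows/{id}/dispatches"

pulls_supported = has_operation(sample_spec, "/repos/{owner}/{repo}/pulls", "POST")
dispatch_supported = has_operation(sample_spec, DISPATCH_PATH, "POST")
print(pulls_supported, dispatch_supported)
```

Against the real spec, the dictionary would come from json.load on the downloaded file. Exact template matching is brittle (parameter names can differ between versions), so a miss is a prompt to read the spec, not proof of absence.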

This workflow has not run successfully a single time since the GitHub→Gitea cutover on 2026-05-06. Next time staging passes its gates, the promote will fail red.

Root cause class

Same as #65: the workflow assumes:

  1. GitHub GraphQL is reachable (it isn't on Gitea)
  2. GitHub merge queue exists (it doesn't on Gitea)
  3. workflow_dispatch REST endpoint exists (it doesn't on Gitea 1.22.6)

Critical constraint discovered: main cannot be direct-pushed

PR #66 fixed the auto-sync (main→staging) by direct-pushing as devops-engineer (whitelisted on staging via push_whitelist_usernames). The reverse direction (staging→main) cannot use that pattern because main's branch protection has enable_push: false with NO whitelist:

{
  "branch_name": "main",
  "enable_push": false,
  "push_whitelist_usernames": [],
  "required_approvals": 1,
  "block_on_outdated_branch": true,
  "dismiss_stale_approvals": true,
  "block_on_rejected_reviews": true
}

(Verified via GET /api/v1/repos/molecule-ai/molecule-core/branch_protections.)

Direct push to main is impossible for any persona. The promote MUST go via a PR.
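The constraint can be expressed as a small predicate over the branch-protection payload. This is a simplified model, not Gitea's actual permission logic: it assumes enable_push: false blocks everyone, and that a non-empty push_whitelist_usernames restricts pushes to the listed users. The function name and the staging-like payload are hypothetical; only the main payload comes from the output above:

```python
from typing import Any, Mapping

# Fields copied from the main branch protection shown above.
MAIN_PROTECTION = {
    "branch_name": "main",
    "enable_push": False,
    "push_whitelist_usernames": [],
    "required_approvals": 1,
}

# Hypothetical staging-like rule: the issue only says devops-engineer is
# whitelisted on staging, the full payload is not shown.
STAGING_LIKE_PROTECTION = {
    "branch_name": "staging",
    "enable_push": True,
    "push_whitelist_usernames": ["devops-engineer"],
}


def can_direct_push(protection: Mapping[str, Any], username: str) -> bool:
    """Simplified check: may `username` push directly to the protected branch?"""
    if not protection.get("enable_push", False):
        return False  # pushes are disabled for everyone
    whitelist = protection.get("push_whitelist_usernames", [])
    # Assumption: an empty whitelist means no per-user restriction.
    return not whitelist or username in whitelist


print(can_direct_push(MAIN_PROTECTION, "devops-engineer"))
print(can_direct_push(STAGING_LIKE_PROTECTION, "devops-engineer"))
```

Under this model the PR #66 pattern works on staging but has no equivalent on main, which is why the promote has to be PR-mediated.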

Affected surfaces (audit per feedback_gitea_actions_migration_audit_pattern)

  1. Workflow YAML: auto-promote-staging.yml (this issue). 5 distinct gh call sites.
  2. Token + scope: AUTO_SYNC_TOKEN (devops-engineer persona) already exists and has push: true repo scope. The same persona can create PRs against main and merge them via Gitea REST. No new secret needed.
  3. Branch protection: main ruleset stays untouched. PR-mediated merges respect approvals + status checks naturally. No edits.
  4. Runner config: irrelevant; this is workflow code, not act_runner config.
  5. Docs: workflow header comment block needs a full rewrite (currently describes the GitHub-era merge-queue mechanism).

Downstream cascade analysis

The original workflow had a tail step that explicitly dispatched publish-workspace-server-image.yml after promote merge, because GitHub's GITHUB_TOKEN-initiated merges suppress downstream on: push events.

This is a GitHub-specific safety rule (no recursion). Gitea Actions does not have this rule (verified empirically: PR #66's merge to main fired auto-sync-main-to-staging naturally on the next push trigger).

publish-workspace-server-image.yml triggers on on: push: branches: [main]. Once the promote PR merges, the resulting commit on main fires the cascade naturally. The explicit dispatch step is now dead code on Gitea — and even if we wanted to keep it, Gitea has no workflow_dispatch REST endpoint to call.

Why approval requirement on main is load-bearing

feedback_prod_apply_needs_hongming_chat_go (saved 2026-05-07): "prod state mutations route through Hongming chat". Staging→main IS a prod state mutation (the next deploy fans out to tenants). Auto-merging without human review would violate that project rule.

Gitea's merge_when_checks_succeed: true + required_approvals: 1 work well together: the workflow opens the PR, schedules auto-merge, but Gitea waits for approval AND status checks before landing. Hongming reviews via the canvas/chat-handle of the PR notification → approves → Gitea auto-merges. Zero branch-protection changes needed.
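The gating described above amounts to a conjunction of two conditions. A minimal sketch of that logic as this issue describes it, not of Gitea's internal implementation; the function name and the "success" state string are assumptions:

```python
from typing import Iterable


def auto_merge_can_land(
    approvals: int,
    required_approvals: int,
    check_states: Iterable[str],
) -> bool:
    """Model of when a scheduled auto-merge lands: enough approvals AND all checks green."""
    enough_approvals = approvals >= required_approvals
    # Note: with no checks at all, all() is True, so only approvals gate the merge.
    all_green = all(state == "success" for state in check_states)
    return enough_approvals and all_green


# PR opened by the workflow, checks green, not yet approved: stays open.
print(auto_merge_can_land(0, 1, ["success", "success"]))
# After one approval with green checks: eligible to land.
print(auto_merge_can_land(1, 1, ["success", "success"]))
```

The first case is the state the workflow leaves the PR in, which is what keeps a human in the loop.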

Parked follow-ups

  • The explicit publish-workspace-server-image dispatch tail step is removed (dead code on Gitea). If the cascade ever misfires and the publish needs a manual wake-up, gh workflow run is not an option, because the dispatch endpoint doesn't exist on Gitea either. Operator alternative: an empty commit to main triggers the publish (landed via a PR, since main rejects direct pushes). File a separate small issue if observed.

Fix

PR https://git.moleculesai.app/molecule-ai/molecule-core/pulls/<NEW>: rewrite to use Gitea REST API (no gh CLI). New shape:

  1. Check all required gates (Gitea REST gate-status query, replacing gh run list)
  2. Open or reuse a PR staging → main via POST /api/v1/repos/.../pulls
  3. Schedule auto-merge via POST /api/v1/repos/.../pulls/{index}/merge with merge_when_checks_succeed: true
  4. Done. Gitea waits for approval + green checks, then merges. The merge-commit on main fires the natural on: push cascade.

The post-merge polling + workflow_dispatch chain is removed entirely (dead on Gitea, replaced by natural on: push cascade).
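Steps 2 and 3 above could look roughly like the following. This is a sketch under stated assumptions, not the code from the fix PR: it only builds the requests without sending them, the helper names and PR title are hypothetical, the "token" Authorization scheme and the "Do": "merge" field reflect the Gitea API as I understand it, and the merge style is an arbitrary choice:

```python
import json
import urllib.request

# Host and repo are taken from the issue; helper names below are illustrative.
API_BASE = "https://git.moleculesai.app/api/v1"
REPO = "molecule-ai/molecule-core"


def _post(path: str, payload: dict, token: str) -> urllib.request.Request:
    """Build (but do not send) an authenticated JSON POST against the repo API."""
    return urllib.request.Request(
        url=f"{API_BASE}/repos/{REPO}{path}",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"token {token}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def open_promote_pr_request(token: str) -> urllib.request.Request:
    """Step 2: open a PR from staging into main."""
    payload = {"head": "staging", "base": "main", "title": "Promote staging to main"}
    return _post("/pulls", payload, token)


def schedule_auto_merge_request(index: int, token: str) -> urllib.request.Request:
    """Step 3: ask Gitea to merge once approvals and checks are satisfied."""
    payload = {"Do": "merge", "merge_when_checks_succeed": True}
    return _post(f"/pulls/{index}/merge", payload, token)


pr_req = open_promote_pr_request("dummy-token")
merge_req = schedule_auto_merge_request(123, "dummy-token")  # 123 is a placeholder index
print(pr_req.get_method(), pr_req.full_url)
print(merge_req.get_method(), merge_req.full_url)
```

In a workflow the token would come from the AUTO_SYNC_TOKEN secret, and each request would be sent with urllib.request.urlopen. The "reuse" half of step 2 (listing open staging → main PRs first) and the gate-status query of step 1 are left out here.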

Verification plan

  • E2E: trigger a real promote on any green staging SHA. Watch:
    • Gitea PR opened by devops-engineer with auto-merge scheduled
    • On Hongming approval + green checks, Gitea auto-merges
    • main advances; publish-workspace-server-image.yml fires naturally
  • ≥2 consecutive green runs.

Refs

  • Issue #65 (Phase 1 findings, the auto-sync sister case)
  • PR #66 (the merge-queue→direct-push pattern, MERGED — used as reference for shape and identity model)
  • Saved memories: feedback_per_agent_gitea_identity_default, feedback_fix_root_not_symptom, feedback_gitea_actions_migration_audit_pattern, feedback_prod_apply_needs_hongming_chat_go, feedback_long_term_robust_automated
Comment by claude-ceo-assistant (Author, Owner):

Fix shipped in PR #78. All 22 CI contexts green; ready for review.

Verification path (Phase 4): merge PR #78 → workflow becomes live on main → operator triggers workflow_dispatch with force=true for the first real promote test. Will need ≥2 consecutive green runs of the new mechanism to consider this fully verified.

Phase 4 verification cannot be performed by the bot (would require self-merge to main, exactly the bot-ring fingerprint we are avoiding per feedback_github_botring_fingerprint).
