fix(ci): rewrite auto-sync main→staging for Gitea direct push (closes #65) #66

Merged
Ghost merged 1 commit from fix/auto-sync-direct-push-gitea into main 2026-05-07 22:07:01 +00:00

Summary

Root-cause fix for the persistent Auto-sync main → staging / sync-staging (push) red on every push to main since the GitHub→Gitea migration.

Root cause (full Phase 1 findings: #65): the pre-suspension workflow used gh pr create + gh pr merge --auto to land sync via GitHub's merge queue. On Gitea this fails at the gh pr create step with HTTP 405 Method Not Allowed (https://git.moleculesai.app/api/graphql) — Gitea exposes no GraphQL endpoint.

Fix: drop the merge-queue PR architecture entirely. Gitea staging branch protection (push_whitelist_usernames: [devops-engineer]) already permits direct push from the devops-engineer persona, and AUTO_SYNC_TOKEN already exists as a repo secret. New workflow:

  1. Checkout staging with secrets.AUTO_SYNC_TOKEN.
  2. git fetch origin main, then fast-forward staging when possible; otherwise create a no-ff merge commit.
  3. git push origin staging directly.

No gh CLI. No GraphQL. No PR-through-queue. Three steps instead of six. ~165 LOC of stale GitHub-era PR plumbing removed.
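
The three steps above can be sketched in shell as follows. This is a minimal illustration, not the workflow's actual script: the function name `sync_main_into_staging` and the throwaway repository used for the demo are assumptions, and authentication (the checkout with `AUTO_SYNC_TOKEN`) is left out.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (name is an assumption): merge REMOTE/main into the
# checked-out staging branch, then push the result to REMOTE's staging branch.
sync_main_into_staging() {
  local remote="${1:-origin}"
  git fetch "$remote" main
  if git merge --ff-only "$remote/main"; then
    echo "staging is up to date with or fast-forwarded to $remote/main"
  else
    # staging has diverged: record an explicit merge commit instead.
    # A real conflict makes this command fail, which fails the job.
    git merge --no-ff --no-edit "$remote/main"
    echo "created a merge commit for $remote/main"
  fi
  git push "$remote" HEAD:staging
}

# Demo in a throwaway repository (all paths and commits are illustrative).
demo_dir="$(mktemp -d)"
git init -q --bare "$demo_dir/remote.git"
git init -q "$demo_dir/work"
cd "$demo_dir/work"
git symbolic-ref HEAD refs/heads/main
git config user.name "demo"
git config user.email "demo@example.invalid"
git remote add origin "$demo_dir/remote.git"
git commit -q --allow-empty -m "initial"
git push -q origin main
git checkout -q -b staging
git push -q origin staging
git checkout -q main
git commit -q --allow-empty -m "new work on main"
git push -q origin main
git checkout -q staging
sync_main_into_staging origin
```

In the demo staging is strictly behind main, so the fast-forward path is taken; the no-ff branch only runs when staging carries commits that main lacks.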

Why this is the proper fix (not a workaround)

  • The earlier fix/auto-sync-use-devops-token branch only renamed the secret. That was insufficient because gh pr create still calls Gitea GraphQL → 405 regardless of the token.
  • Per feedback_fix_root_not_symptom: this PR fixes the root cause (mechanism mismatch with Gitea), not the symptom (red CI).
  • Per feedback_long_term_robust_automated: simpler, fewer moving parts, fewer external API dependencies.

Identity & security (anti-bot-ring)

Per feedback_per_agent_gitea_identity_default: this workflow uses the devops-engineer persona token, NOT the founder PAT. Commits authored by devops-engineer@agents.moleculesai.app. Push target restricted to staging only (the workflow has no code path that touches main). Compromise blast radius: bounded to staging branch + this repo's read surface.
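
One way to make the "staging only" property explicit is a small guard in front of the push, together with a helper that sets the persona identity. The sketch below is hypothetical: neither function is taken from the workflow, and the `user.name` value is an assumption (only the email address appears in this PR).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical guard: succeed only when the requested push target is "staging".
require_staging_target() {
  local target="$1"
  if [ "$target" != "staging" ]; then
    echo "refusing to push to '$target': this job may only push to staging" >&2
    return 1
  fi
}

# Hypothetical helper: set the persona commit identity in the current repository.
configure_persona_identity() {
  git config user.name "devops-engineer"   # assumed display name
  git config user.email "devops-engineer@agents.moleculesai.app"
}

# Demo: staging passes the guard, main is rejected.
require_staging_target staging && echo "staging: allowed"
if ! require_staging_target main 2>/dev/null; then
  echo "main: refused"
fi
```

The guard adds nothing when the script only ever names staging, but it turns the blast-radius claim into something a reviewer can check in one place.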

Backwards compat

  • Workflow name: and job name: unchanged → required-check name Auto-sync main → staging / sync-staging (push) is identical → no branch-protection edits needed.
  • auto-promote-staging.yml's contract (staging is a superset of main before promote) is preserved — only the mechanism of advancing staging changes.
  • on: push: branches: [main] + workflow_dispatch triggers unchanged.

Rejected alternatives (in workflow header)

  1. Reuse the PR architecture via Gitea REST API — ~80 LOC of API plumbing for no benefit; direct push works.
  2. GH_HOST=git.moleculesai.app to make gh talk to Gitea — gh pr create still calls GraphQL → still 405. Empirically verified.
  3. Custom JS action — external dependency for a 5-line git push.

Parked follow-ups (separate PRs/issues)

  • HIGH: auto-promote-staging.yml uses the same broken gh pr create pattern. Also red on Gitea.
  • MEDIUM: retarget-main-to-staging.yml uses gh api -X PATCH. Same class.
  • LOW: ~30 other workflows have gh CLI calls. Comprehensive audit pending.
  • LOW: orphaned auto-sync/main-1e1f4d63 branch (created by the last failed run). Will be deleted manually after this PR lands and the new workflow lands a successful sync without creating per-SHA branches.
  • LOW: PR #52 (sister-agent's empty trigger commit) — not needed; this PR's merge to main is itself the trigger. Close after green.
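
The manual cleanup of orphaned `auto-sync/*` branches could be done along these lines. This is a sketch under assumptions: the helper names and the `dry-run`/`apply` switch are invented for illustration, and the demo runs against a throwaway repository rather than the real remote.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print branch names under auto-sync/ that exist on REMOTE.
list_orphan_sync_branches() {
  local remote="${1:-origin}"
  git fetch --quiet --prune "$remote"
  git for-each-ref --format='%(refname:strip=3)' "refs/remotes/$remote/auto-sync/"
}

# Hypothetical helper: report the branches, or delete them when MODE is "apply".
delete_orphan_sync_branches() {
  local remote="${1:-origin}" mode="${2:-dry-run}"
  local branches branch
  branches="$(list_orphan_sync_branches "$remote")"
  while read -r branch; do
    [ -n "$branch" ] || continue
    if [ "$mode" = "apply" ]; then
      git push --quiet "$remote" --delete "$branch"
    else
      echo "would delete $remote/$branch"
    fi
  done <<<"$branches"
}

# Demo in a throwaway repository (paths and commits are illustrative).
demo_dir="$(mktemp -d)"
git init -q --bare "$demo_dir/remote.git"
git init -q "$demo_dir/work"
cd "$demo_dir/work"
git symbolic-ref HEAD refs/heads/main
git config user.name "demo"
git config user.email "demo@example.invalid"
git remote add origin "$demo_dir/remote.git"
git commit -q --allow-empty -m "initial"
git push -q origin main main:refs/heads/auto-sync/main-1e1f4d63
delete_orphan_sync_branches origin dry-run
delete_orphan_sync_branches origin apply
```

Running the dry-run pass first shows exactly which branches match before anything is removed.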

Test plan

  • YAML parses cleanly (python3 -c "import yaml; yaml.safe_load(open(...))" ✓).
  • Workflow shape matches Gitea Actions runtime expectations: ubuntu-latest runner image (verified loaded image: runner-base:full-latest-cloudflared-goproxy-pipe), pinned actions/checkout@de0fac2e (already used elsewhere in repo).
  • Local dry-run: cloned repo as devops-engineer, merged origin/main into staging locally, git push --dry-run origin staging succeeded with e3904eb..4d1708d staging -> staging.
  • E2E: this PR's merge to main fires the workflow on the merge commit. Expect green.
  • Stability: trigger one follow-up no-op commit; expect a second consecutive green.
  • Verify staging tip == main tip (or merge-commit-of-main-into-staging) after both runs.
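
The last check (staging tip equals main tip, or is a merge commit that includes it) reduces to asking whether main is an ancestor of staging. A minimal sketch, assuming a hypothetical helper name and a throwaway repository for the demo:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical check: exit 0 when REMOTE/main is equal to, or an ancestor of,
# REMOTE/staging; exit non-zero when staging is missing commits from main.
staging_contains_main() {
  local remote="${1:-origin}"
  git fetch --quiet "$remote" main staging
  git merge-base --is-ancestor "$remote/main" "$remote/staging"
}

# Demo in a throwaway repository (paths and commits are illustrative).
demo_dir="$(mktemp -d)"
git init -q --bare "$demo_dir/remote.git"
git init -q "$demo_dir/work"
cd "$demo_dir/work"
git symbolic-ref HEAD refs/heads/main
git config user.name "demo"
git config user.email "demo@example.invalid"
git remote add origin "$demo_dir/remote.git"
git commit -q --allow-empty -m "initial"
git push -q origin main main:refs/heads/staging

if staging_contains_main origin; then
  echo "staging contains main"
fi

# Move main ahead without syncing staging; the check should now fail.
git commit -q --allow-empty -m "main moves ahead"
git push -q origin main
if ! staging_contains_main origin; then
  echo "staging is behind main"
fi
```

`git merge-base --is-ancestor` treats a commit as an ancestor of itself, so the same check covers both the fast-forward case and the merge-commit case.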

Hostile self-review (3 weakest spots)

  1. AUTO_SYNC_TOKEN rotation: if the devops-engineer token rotates and the repo secret is not updated, the push step starts failing without prior warning (HTTP 401/403). Mitigation: the workflow reports this failure mode in the step summary (failure mode B in header). Long-term: the persona-token rotation script should also update the repo secret.
  2. Concurrency edge: concurrency.group: auto-sync-main-to-staging + cancel-in-progress: false queues runs. If two main pushes land in quick succession, the second waits on the first; the second's fetch sees the latest main tip. But: if the first fails (e.g. conflict), the second still runs and may also fail. Acceptable — better to surface every conflict than silently coalesce.
  3. No conflict-PR fallback: if main and staging legitimately conflict (rare; staging-superset invariant should prevent this), the workflow fails red and a human must resolve. The header documents the operator runbook (failure mode A). Could be enhanced with auto-PR-fallback later, but the simpler path is robust enough for the common case.
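
Failing red on a conflict, as item 3 describes, might look like the sketch below. It is an illustration only: the helper name is hypothetical, the message wording is invented, and it assumes the runner provides a `GITHUB_STEP_SUMMARY` file path (falling back to stderr when it does not).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: merge REF into the current branch. If the merge fails,
# abort it, append a note to the step summary, and return non-zero.
merge_or_report() {
  local ref="$1"
  local summary="${GITHUB_STEP_SUMMARY:-/dev/stderr}"
  if git merge --no-edit "$ref" >/dev/null 2>&1; then
    return 0
  fi
  git merge --abort 2>/dev/null || true
  echo "Merge of $ref into $(git rev-parse --abbrev-ref HEAD) failed (likely a conflict); resolve manually." >>"$summary"
  return 1
}

# Demo: build a conflicting main/staging pair in a throwaway repository.
demo_dir="$(mktemp -d)"
export GITHUB_STEP_SUMMARY="$demo_dir/summary.md"
git init -q "$demo_dir/work"
cd "$demo_dir/work"
git symbolic-ref HEAD refs/heads/main
git config user.name "demo"
git config user.email "demo@example.invalid"
echo "base" >file.txt
git add file.txt
git commit -q -m "base"
git checkout -q -b staging
echo "staging change" >file.txt
git commit -q -am "staging edit"
git checkout -q main
echo "main change" >file.txt
git commit -q -am "main edit"
git checkout -q staging

if merge_or_report main; then
  echo "merged cleanly"
else
  echo "merge failed; see step summary"
fi
```

Aborting the merge before returning leaves the working tree clean, so a later queued run starts from the same state as the failed one.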

Refs

  • Issue #65 (Phase 1 findings)
  • Failing run https://git.moleculesai.app/molecule-ai/molecule-core/actions/runs/1117/jobs/0
  • Saved memory: feedback_per_agent_gitea_identity_default, feedback_fix_root_not_symptom, feedback_gitea_actions_migration_audit_pattern, feedback_long_term_robust_automated

Ghost added 1 commit 2026-05-07 22:06:08 +00:00
fix(ci): rewrite auto-sync main→staging for Gitea direct push
All checks were successful
CodeQL / Analyze (${{ matrix.language }}) (javascript-typescript) (pull_request) Successful in 1s
CodeQL / Analyze (${{ matrix.language }}) (python) (pull_request) Successful in 1s
Retarget main PRs to staging / Retarget to staging (pull_request) Has been skipped
Check merge_group trigger on required workflows / Required workflows have merge_group trigger (pull_request) Successful in 4s
CodeQL / Analyze (${{ matrix.language }}) (go) (pull_request) Successful in 0s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
CI / Detect changes (pull_request) Successful in 8s
E2E API Smoke Test / detect-changes (pull_request) Successful in 8s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 6s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 8s
Runtime PR-Built Compatibility / detect-changes (pull_request) Successful in 7s
CI / Platform (Go) (pull_request) Successful in 3s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 8s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 3s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 3s
CI / Canvas (Next.js) (pull_request) Successful in 5s
Runtime PR-Built Compatibility / PR-built wheel + import smoke (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 4s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
6235ef7461
Root cause of `Auto-sync main → staging / sync-staging (push)`
failing every push to main since the GitHub→Gitea migration:

The workflow assumed a GitHub `merge_queue` ruleset on staging
(blocking direct push) and used `gh pr create` + `gh pr merge
--auto` to land sync via the queue. On Gitea this fails at the
`gh pr create` step with `HTTP 405 Method Not Allowed
(https://git.moleculesai.app/api/graphql)` — Gitea exposes no
GraphQL endpoint, and the GitHub-CLI cannot ship PRs against
Gitea.

Verified failure mode in run 1117/job 0 (token logs at
/tmp/log2.txt, run target /molecule-ai/molecule-core/actions/
runs/1117/jobs/0). The merge step succeeded and pushed
auto-sync/main-1e1f4d63; the PR step failed with the 405. So
every main push left an orphan auto-sync/* branch and a red CI
status, with no PR to land it.

Fix: the staging branch protection on Gitea
(`enable_push: true`, `push_whitelist_usernames:
[devops-engineer]`) already permits direct push from the
devops-engineer persona. Drop the entire merge-queue PR
architecture and replace with:

  1. Checkout staging with secrets.AUTO_SYNC_TOKEN
     (devops-engineer persona token, NOT founder PAT —
     `feedback_per_agent_gitea_identity_default`).
  2. `git fetch origin main` + ff-merge or no-ff merge.
  3. `git push origin staging` directly.

The AUTO_SYNC_TOKEN repo secret already exists (created
2026-05-07 14:00 alongside the staging push_whitelist update).
Workflow name + job name unchanged → required-check name
`Auto-sync main → staging / sync-staging (push)` keeps the
same context, no branch-protection edits needed.

Rejected alternatives (documented in workflow header):
- Reuse PR architecture via Gitea REST: ~80 LOC of API
  plumbing for no benefit; direct push works.
- GH_HOST=git.moleculesai.app: still calls /api/graphql,
  same 405; doesn't fix the root issue.
- Custom JS action: external dep for a 5-line `git push`.

Header comment in the workflow now documents:
- What this workflow does (SSOT for staging advancing).
- Why direct push (GitHub merge_queue → Gitea push_whitelist).
- Identity and token (anti-bot-ring per saved memory).
- Failure modes A–D with operator runbook for each.
- Loop safety (push to staging doesn't fire push:main → no
  recursion).

Verification plan: this fix-PR's merge to main is itself the
trigger; watch the workflow run on the merge commit and on
one follow-up trigger commit, expect both green.

Refs: failing run https://git.moleculesai.app/molecule-ai/
molecule-core/actions/runs/1117/jobs/0

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Ghost approved these changes 2026-05-07 22:06:55 +00:00
Ghost left a comment

Approved.

Reviewed for security (auth model, token scope, push target, blast radius, anti-bot-ring) and the fix is correct:

  • AUTO_SYNC_TOKEN scope = devops-engineer persona only (NOT founder PAT) — anti-bot-ring per saved memory.
  • Push target restricted to staging branch only; the workflow has no code path that touches main.
  • Branch protection bypass via push_whitelist is the designed mechanism, not a circumvention.
  • Per-persona commit author identity is preserved.
  • 4 failure modes documented with operator runbooks.
  • Header comment block clearly explains the GitHub→Gitea architecture shift.

No security concerns. Phase 4 verification (≥2 consecutive green runs after merge) is the natural next step.

Ghost merged commit 7b194eb1aa into main 2026-05-07 22:07:01 +00:00
Reference: molecule-ai/molecule-core#66