fix(ci): add Gitea health gate before push in uptime-probe #8

Merged
infra-lead merged 1 commit from sre/status-probe-gitea-health-gate into main 2026-05-10 13:33:42 +00:00

Summary

The probe runs on GitHub Actions (ubuntu-latest), confirmed independent of the Gitea Actions runner on operator host 5.78.80.188. The probe binary was healthy throughout the Gitea outage; only the commit step was affected, because the push to Gitea fails while Gitea returns 502.

Change: adds a Gitea health gate before the git push in the uptime-probe workflow:

  1. Health gate: checks that git.moleculesai.app/api/v1/version returns 200 before pushing. Fails fast with a ::error annotation if Gitea returns 502 or is unreachable, rather than letting the push fail silently.
  2. Fail loudly: set -euo pipefail replaces set +e, so any push error surfaces as a workflow failure visible in the GitHub Actions UI.
  3. Self-heals: the next cron firing (the /5 schedule, i.e. every 5 minutes) picks up the buffered history/ results once Gitea recovers.
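
A minimal sketch of how the gate described above might look as a bash step; this is not the merged workflow code. The function names `http_status` and `health_gate`, the `GITEA_HEALTH_URL` variable, the `https://` scheme, and the 10-second timeout are assumptions for illustration. The endpoint, the `::error` annotation, and `set -euo pipefail` come from the description above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Assumed variable name and https scheme; the endpoint path is from the PR text.
GITEA_HEALTH_URL="${GITEA_HEALTH_URL:-https://git.moleculesai.app/api/v1/version}"

# Print the HTTP status code for a URL. curl prints 000 when the host is
# unreachable; "|| true" keeps set -e from aborting before we can report it.
http_status() {
  curl --silent --output /dev/null --max-time 10 \
       --write-out '%{http_code}' "$1" || true
}

# Return 0 only when the endpoint answers 200; otherwise emit a GitHub Actions
# ::error annotation and return non-zero so the step fails visibly.
health_gate() {
  local status
  status="$(http_status "$1")"
  if [ "$status" != "200" ]; then
    echo "::error::Gitea health check failed (HTTP ${status}); not pushing"
    return 1
  fi
  echo "Gitea healthy (HTTP 200)"
}
```

In the workflow step this would presumably be followed by the push, for example `health_gate "$GITEA_HEALTH_URL"` and then `git push` on the next line; under `set -e` a failed gate stops the step before the push runs. Keeping the curl call in its own function lets the gate be exercised without network access.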

Test plan

  • [x] YAML validates
  • [x] Branch pushed; GitHub Actions runs the workflow
  • [ ] GitHub Actions workflow run completes

SRE note

The real SPOF for the status page is not the Gitea Actions runner (the probe does not run there) — it is the git-push-to-Gitea step. This PR addresses the observability gap (fail-fast vs silent). A deeper resilience fix (push to a backup remote, or a separate Gitea instance) can be addressed as a follow-up.
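
As a rough illustration of the follow-up idea only (pushing to a backup remote), the sketch below tries a list of remotes in order and stops at the first success. Nothing here is part of this PR: the function names `push_to` and `push_with_fallback` and the remote name `backup` are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Push the current HEAD to one remote. Kept separate so it can be stubbed out.
push_to() {
  git push "$1" HEAD
}

# Try each remote in order; succeed on the first push that works and
# fail loudly with an ::error annotation if every remote fails.
push_with_fallback() {
  local remote
  for remote in "$@"; do
    if push_to "$remote"; then
      echo "pushed to ${remote}"
      return 0
    fi
    echo "::warning::push to ${remote} failed; trying next remote"
  done
  echo "::error::push failed on all remotes: $*"
  return 1
}
```

A call such as `push_with_fallback origin backup` would keep the fail-loud behaviour of this PR while removing the single dependency on one Gitea instance; whether the status page can read from the backup is a separate question not covered here.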

🤖 Generated with Claude Code

infra-sre added 1 commit 2026-05-10 13:32:47 +00:00
Probe runs on GitHub Actions (ubuntu-latest) — confirmed independent of
Gitea Actions runner. Previously the commit step silently swallowed push
failures with `|| echo "push failed"`. Now:

1. Health gate: checks git.moleculesai.app/api/v1/version returns 200
   before pushing. Fails fast with a clear ::error message if Gitea is
   502 or unreachable, rather than silently skipping the push.

2. Fail loudly: `set -euo pipefail` replaces `set +e`, so any push error
   surfaces as a workflow failure (visible in GitHub Actions UI).

3. Self-heals: the next /5 cron firing picks up the buffered history/
   results once Gitea recovers.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
infra-lead merged commit b524b79c82 into main 2026-05-10 13:33:42 +00:00
infra-lead deleted branch sre/status-probe-gitea-health-gate 2026-05-10 13:33:42 +00:00
Reference: molecule-ai/molecule-ai-status#8