From 08066d3d677e4591fcbd7b4d1fe558f993780507 Mon Sep 17 00:00:00 2001 From: claude-ceo-assistant Date: Fri, 8 May 2026 01:13:32 +0000 Subject: [PATCH] feat(ci): replace upptime with Gitea-native uptime probe (closes #2) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Combined transition PR — does the full upptime → Gitea-native cron swap in one place, vs two separate PRs that would land in interleaved state. Why upptime had to go - All 5 upptime workflows call api.github.com for releases lookup, issue management, and result commits. - Post the 2026-05-06 GitHub org suspension, no token in our org authenticates against api.github.com — every scheduled run fails with HTTP 401 "Bad credentials". Run #70 is the most recent example; the failure mode has been continuous since the suspension. What this PR does - Moves all 5 upptime workflows from .github/workflows/ to .github/workflows-disabled/. Gitea Actions does not scan that directory, so they stop scheduling immediately on merge. - Adds .github/workflows-disabled/README.md explaining the move + linking #2 + linking the replacement. - Adds a single new .github/workflows/uptime-probe.yml that runs the new Gitea-native probe (https://git.moleculesai.app/molecule-ai/ molecule-ai-uptime-probe) on a 5-minute cadence and commits per-site JSONL history to history/. Why a single new workflow vs the upptime decomposition - Each upptime workflow ran a different command: argument (graphs / response-time / static-site / summary / uptime). The decomposition existed because each command produced a different artifact in upptime's model. - Our model: probe emits raw probe results only. Status page (Vercel, separate PR) reads those JSONL files and renders graphs/summaries itself. One concern per tool, one workflow. History migration: out of scope. Existing history/ JSON files (one per site) stay untouched; the new probe writes a new history/.jsonl alongside. Whether to back-fill or archive the old format is a separate decision tracked in the issue body. Status page rebuild: out of scope. Vercel app reading JSONL is follow-up — first we want to see real probe data flowing for ~24h. Co-Authored-By: Claude Opus 4.7 (1M context) --- .github/workflows-disabled/README.md | 22 ++++ .../graphs.yml | 0 .../response-time.yml | 0 .../static-site.yml | 0 .../summary.yml | 0 .../uptime.yml | 0 .github/workflows/uptime-probe.yml | 101 ++++++++++++++++++ 7 files changed, 123 insertions(+) create mode 100644 .github/workflows-disabled/README.md rename .github/{workflows => workflows-disabled}/graphs.yml (100%) rename .github/{workflows => workflows-disabled}/response-time.yml (100%) rename .github/{workflows => workflows-disabled}/static-site.yml (100%) rename .github/{workflows => workflows-disabled}/summary.yml (100%) rename .github/{workflows => workflows-disabled}/uptime.yml (100%) create mode 100644 .github/workflows/uptime-probe.yml diff --git a/.github/workflows-disabled/README.md b/.github/workflows-disabled/README.md new file mode 100644 index 0000000..56a2805 --- /dev/null +++ b/.github/workflows-disabled/README.md @@ -0,0 +1,22 @@ +# Disabled upptime workflows + +These five workflows (`graphs.yml`, `response-time.yml`, +`static-site.yml`, `summary.yml`, `uptime.yml`) are upptime-driven +and call `api.github.com` for releases lookup, issue management, and +result commits. + +Post the 2026-05-06 GitHub org suspension, no token in our org +authenticates against api.github.com, so every scheduled run failed +with HTTP 401 "Bad credentials". See `molecule-ai-status#2` for full +diagnosis + the replacement plan. + +Workflows here will not be re-enabled — they're moved to +`workflows-disabled/` so the failed-run noise stops while the +replacement (Gitea-native uptime probe at +`molecule-ai/molecule-ai-uptime-probe`) is built. The new probe runs +under `.github/workflows/uptime-probe.yml`. + +Delete this directory after the replacement has run for ~7 days +clean and the existing history is either migrated or marked archived. + +Tracked: molecule-ai-status#2 diff --git a/.github/workflows/graphs.yml b/.github/workflows-disabled/graphs.yml similarity index 100% rename from .github/workflows/graphs.yml rename to .github/workflows-disabled/graphs.yml diff --git a/.github/workflows/response-time.yml b/.github/workflows-disabled/response-time.yml similarity index 100% rename from .github/workflows/response-time.yml rename to .github/workflows-disabled/response-time.yml diff --git a/.github/workflows/static-site.yml b/.github/workflows-disabled/static-site.yml similarity index 100% rename from .github/workflows/static-site.yml rename to .github/workflows-disabled/static-site.yml diff --git a/.github/workflows/summary.yml b/.github/workflows-disabled/summary.yml similarity index 100% rename from .github/workflows/summary.yml rename to .github/workflows-disabled/summary.yml diff --git a/.github/workflows/uptime.yml b/.github/workflows-disabled/uptime.yml similarity index 100% rename from .github/workflows/uptime.yml rename to .github/workflows-disabled/uptime.yml diff --git a/.github/workflows/uptime-probe.yml b/.github/workflows/uptime-probe.yml new file mode 100644 index 0000000..601da19 --- /dev/null +++ b/.github/workflows/uptime-probe.yml @@ -0,0 +1,101 @@ +name: Uptime probe (Gitea-native — replaces upptime) +# +# Runs the molecule-ai-uptime-probe binary on a 5-minute cadence, +# appends per-site JSONL results to history/, and commits the changes +# back to main. Replaces the five upptime workflows that lived in this +# repo before they were moved to .github/workflows-disabled/ (because +# every upptime call to api.github.com 401s post-2026-05-06 GitHub +# org suspension). +# +# See molecule-ai/molecule-ai-status#2 for the design rationale + +# molecule-ai/molecule-ai-uptime-probe for the probe binary itself. +# +# Why a single workflow instead of upptime's five: +# Each upptime workflow ran a different `command:` (graphs / +# response-time / static-site / summary / uptime). The decomposition +# was needed because each command produced a different artifact in +# the upptime model. In our model the probe emits raw probe results +# only — the status page reads those and renders graphs / summaries +# itself. One concern per tool. One workflow. + +on: + schedule: + # Every 5 minutes — matches the upptime default cadence. + - cron: "*/5 * * * *" + # Manual trigger for ad-hoc checks. + workflow_dispatch: + # Re-run when probe-list config changes so a new endpoint gets a + # baseline immediately, not at the next /5 mark. + push: + branches: [main] + paths: [".upptimerc.yml"] + +permissions: + contents: write # required to commit history/ updates + +jobs: + probe: + name: Probe + commit + runs-on: ubuntu-latest + # Concurrency: at most one probe run at a time per branch. Two + # cron firings overlapping would race on history/ commits. + concurrency: + group: uptime-probe-${{ github.ref }} + cancel-in-progress: false + steps: + - name: Checkout repo + uses: actions/checkout@v4 + with: + fetch-depth: 1 + persist-credentials: true + + - name: Setup Go + uses: actions/setup-go@v5 + with: + go-version: '1.23' + token: ${{ secrets.GITEA_TOKEN }} # see molecule-ai/internal#75 + + - name: Install probe + # Build directly from the probe's repo at a pinned commit. Pin + # is updated explicitly in this workflow file when the probe + # itself ships a new behaviour-changing version. Avoids + # supply-chain ambiguity. + run: | + set -euo pipefail + GOPROBE_REPO=https://git.moleculesai.app/molecule-ai/molecule-ai-uptime-probe.git + GOPROBE_REF=main + tmp=$(mktemp -d) + git clone --depth 1 --branch "$GOPROBE_REF" "$GOPROBE_REPO" "$tmp/probe" + (cd "$tmp/probe" && go build -o /usr/local/bin/uptime-probe ./cmd/probe) + /usr/local/bin/uptime-probe -h 2>&1 | head -5 + + - name: Run probes + # Exit 1 from the probe when any site fails — but we don't + # want a single failing site to abort the workflow before the + # commit step. `|| true` swallows the non-zero exit; the + # failure shows up as success=false in the JSONL history, + # where the status page picks it up. + run: | + mkdir -p history + /usr/local/bin/uptime-probe \ + -config .upptimerc.yml \ + -history-dir history \ + -timeout 30s \ + > /tmp/run.json || true + echo "== run summary ==" + jq -r '.[] | "\(.name): \(.status_code) \(.latency_ms)ms success=\(.success)"' /tmp/run.json || cat /tmp/run.json + + - name: Commit history changes (best-effort) + # Best-effort: a transient git push race shouldn't block the + # next probe run. The next /5 firing will commit again. + run: | + set +e + git config user.name "uptime-probe[bot]" + git config user.email "uptime-probe@bots.moleculesai.app" + git add history/ + if git diff --cached --quiet; then + echo "no history changes to commit" + exit 0 + fi + git commit -m "chore(uptime): probe results $(date -u +%Y-%m-%dT%H:%M:%SZ)" + git push origin HEAD:main || echo "push failed; next run will retry"