molecule-core/.github/workflows
Hongming Wang 184f8256cd ci(redeploy): fire post-main tenant fleet redeploy via CP admin endpoint
Closes the "main merged but prod tenants still on old image" gap.

## Trigger chain

  main merge
   └─> publish-workspace-server-image (builds + pushes :latest + :<sha>)
        └─> redeploy-tenants-on-main (this workflow)
             └─> POST https://api.moleculesai.app/cp/admin/tenants/redeploy-fleet
                  └─> Canary hongmingwang + 60s soak, then batches of 3
                       with SSM Run Command redeploying each tenant EC2

## Features

- Auto-fires on every successful publish-workspace-server-image run.
- Manual dispatch with optional target_tag (for rollback to an older
  SHA), canary_slug override, batch_size, dry_run.
- 30s delay before calling CP so GHCR edge cache serves the new
  :latest consistently to every tenant's docker pull.
- Skips when publish job failed (workflow_run fires on any completion).
- Job summary renders per-tenant results as a markdown table so ops
  can see which tenant, if any, broke the chain.
- Exits non-zero on HTTP != 200 or ok=false so a broken rollout marks
  the commit status red.

## Secrets + vars required

- secret CP_ADMIN_API_TOKEN  — Railway prod molecule-platform / CP_ADMIN_API_TOKEN
                               Mirrored into this repo's secrets.
- var    CP_URL (optional)   — defaults to https://api.moleculesai.app

## Paired with

- Molecule-AI/molecule-controlplane branch feat/tenant-auto-redeploy
  which adds the /cp/admin/tenants/redeploy-fleet endpoint + the SSM
  orchestration. This workflow is a no-op until that lands on prod CP.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-24 14:34:28 -07:00
..
auto-promote-staging.yml ci: canary-verify graceful-skip + draft auto-promote staging→main 2026-04-22 22:39:23 +00:00
block-internal-paths.yml ci(block-paths): fetch PR base SHA to fix shallow-clone diff failure 2026-04-24 12:01:53 +00:00
canary-staging.yml fix(e2e): CP DELETE /cp/admin/tenants body uses 'confirm', not 'confirm_token' 2026-04-21 04:50:28 -07:00
canary-verify.yml ci: canary-verify graceful-skip + draft auto-promote staging→main 2026-04-22 22:39:23 +00:00
check-merge-group-trigger.yml ci: add linter that fails when required workflow lacks merge_group trigger 2026-04-24 00:33:05 -07:00
ci.yml ci: add merge_group trigger to ci + codeql 2026-04-23 21:24:53 -07:00
codeql.yml ci: add merge_group trigger to ci + codeql 2026-04-23 21:24:53 -07:00
e2e-api.yml feat(ci): run E2E API smoke test on staging branch 2026-04-23 17:47:47 -07:00
e2e-staging-canvas.yml feat(ci): run E2E Staging Canvas on staging branch pushes 2026-04-23 17:47:51 -07:00
e2e-staging-saas.yml fix(e2e): increase hermes workspace wait from 20 to 30 min 2026-04-24 17:11:37 +00:00
e2e-staging-sanity.yml fix(e2e): CP DELETE /cp/admin/tenants body uses 'confirm', not 'confirm_token' 2026-04-21 04:50:28 -07:00
promote-latest.yml perf(ci): move all public-repo workflows to ubuntu-latest 2026-04-22 12:56:49 -07:00
publish-canvas-image.yml perf(ci): move all public-repo workflows to ubuntu-latest 2026-04-22 12:56:49 -07:00
publish-workspace-server-image.yml ci(publish-image): also tag :staging-latest so CP auto-picks up new builds 2026-04-24 00:29:55 -07:00
redeploy-tenants-on-main.yml ci(redeploy): fire post-main tenant fleet redeploy via CP admin endpoint 2026-04-24 14:34:28 -07:00
retarget-main-to-staging.yml ci: auto-retarget bot PRs opened against main → staging (#1853) 2026-04-23 19:20:40 +00:00