molecule-core/workspace-server/internal/buildinfo/buildinfo.go
Hongming Wang 998e13c4bd feat(deploy): verify each tenant /buildinfo matches published SHA after redeploy
Closes the gap that let issue #2395 ship: redeploy-fleet workflows reported
ssm_status=Success based on SSM RPC return code alone, while EC2 tenants
silently kept serving the previous :latest digest because docker compose up
without an explicit pull is a no-op when the local tag already exists.

Wire:
  - new buildinfo package exposes GitSHA, set at link time via -ldflags from
    the GIT_SHA build-arg (default "dev", so a binary built without ldflags
    never matches a published SHA and the check fails closed)
  - router exposes GET /buildinfo returning {git_sha} — public, no auth,
    cheap enough to curl from CI for every tenant
  - both Dockerfiles thread GIT_SHA into the Go build
  - publish-workspace-server-image.yml passes GIT_SHA=github.sha for both
    images
  - redeploy-tenants-on-main.yml + redeploy-tenants-on-staging.yml curl each
    tenant's /buildinfo after the redeploy SSM RPC and fail the workflow on
    a git SHA mismatch. Staging treats both :latest and :staging-latest as
    moving tags. Verification is skipped only when an operator pinned a
    specific tag via workflow_dispatch
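
The post-redeploy check in the last item can be sketched in Go. This is only an illustration under stated assumptions: the workflows themselves use curl and jq, and verifyTenant, fakeTenant, and the SHA value below are hypothetical names introduced here, not code from this change. Only the /buildinfo path and the git_sha field come from the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// verifyTenant (hypothetical helper) fetches baseURL/buildinfo and returns
// an error unless the reported git_sha equals want.
func verifyTenant(client *http.Client, baseURL, want string) error {
	resp, err := client.Get(baseURL + "/buildinfo")
	if err != nil {
		return fmt.Errorf("%s: %w", baseURL, err)
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("%s: unexpected status %d", baseURL, resp.StatusCode)
	}
	var body struct {
		GitSHA string `json:"git_sha"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&body); err != nil {
		return fmt.Errorf("%s: decode: %w", baseURL, err)
	}
	if body.GitSHA != want {
		return fmt.Errorf("%s: running %q, want %q", baseURL, body.GitSHA, want)
	}
	return nil
}

// fakeTenant (hypothetical) starts a local test server that reports sha
// for every request, standing in for a tenant's /buildinfo endpoint.
func fakeTenant(sha string) *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		json.NewEncoder(w).Encode(map[string]string{"git_sha": sha})
	}))
}

func main() {
	const published = "998e13c4bd" // illustrative value only

	fresh, stale := fakeTenant(published), fakeTenant("dev")
	defer fresh.Close()
	defer stale.Close()

	// A tenant still reporting "dev" (or any older SHA) is a failure.
	for _, t := range []*httptest.Server{fresh, stale} {
		if err := verifyTenant(t.Client(), t.URL, published); err != nil {
			fmt.Println("FAIL:", err)
			continue
		}
		fmt.Println("ok:", t.URL)
	}
}
```

Comparing against the exact published SHA, rather than only checking that the endpoint answers, is what lets a stale tenant be told apart from a healthy one.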

Tests:
  - TestGitSHA_DefaultDevSentinel pins the dev default
  - TestBuildInfoEndpoint_ReturnsGitSHA pins the wire shape that the
    workflow's jq lookup depends on

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-30 10:55:08 -07:00


// Package buildinfo exposes the git SHA the binary was built from.
//
// Set at link time:
//
//	go build -ldflags "-X github.com/Molecule-AI/molecule-monorepo/platform/internal/buildinfo.GitSHA=<sha>"
//
// CI passes ${{ github.sha }} via Dockerfile.tenant ARG GIT_SHA; local
// dev builds default to "dev" so unset never reads as success.
//
// Why this package exists: redeploy-fleet (CP) returns ssm_status=Success
// when the SSM RPC didn't error. That means "the deploy command ran",
// NOT "the new code is running on every tenant." A moving image tag
// (`:latest`) stays cached in the local Docker daemon, so
// `docker compose up -d` without an explicit `docker pull` is a no-op
// when the local tag already exists. Both were observed on 2026-04-30:
// the user's tenant kept serving pre-501a42d7 chat_files even after main
// published the lazy-heal fix (#2395). Exposing GitSHA at /buildinfo
// lets the redeploy workflow verify that EVERY tenant is actually
// running the published SHA before reporting success.
package buildinfo

// GitSHA is overwritten at build time via -ldflags. The default catches
// dev builds and any deploy that forgot to wire the build-arg through.
// "dev" is intentional: comparing it to a real SHA always fails,
// which is what we want for an unconfigured deploy.
var GitSHA = "dev"