molecule-ai-workspace-templ…/OPS_NOTES.md
infra-sre 29fc94d682
chore(codex): republish 283f371 after flaked ECR-login (no functional change)
PR #6 (283f371) merged to main but its publish-image run (action_run
78342, task 125567) failed at "Log in to ECR" with "Failed to
initialize: protocol not available" — a transient act_runner/AWS-CLI
flake, NOT a code/workflow defect (ci.yml passed on the same SHA;
publish-image.yml is byte-identical to the prior 858b093 which built
successfully). No sha-283f371 image was ever pushed to ECR, so the
deployed codex prod pin remains the stale 741b29b (codex ~0.57, no
~/.codex/auth.json Mode C, dead gpt-5 default) — the root cause of
prod-Reviewer/prod-Researcher not draining their A2A inbox.

Gitea 1.22.6 has no REST rerun / workflow_dispatch endpoint, so a
fresh main commit is the canonical rerun mechanism. This change is a
build-inert OPS_NOTES.md only (not COPYed into the image) — it exists
solely to re-trip on:push:branches:[main] -> publish-image and produce
the image of the already-merged, already-reviewed 283f371 fix.

Refs: local task #219; MEMORY.md "codex pin no-autopromoter".

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-18 02:27:05 -07:00


OPS_NOTES — codex workspace template

Operational/release notes for this template. Build-inert (this file is not COPYed into the image and does not affect the Docker build).

2026-05-18 — republish of 283f371 (PR #6) after a flaked ECR-login

Why this commit exists: PR #6 (283f371, "bump CLI to 0.130.0 + consume CODEX_AUTH_JSON subscription OAuth") merged to main at 2026-05-18T04:32:57Z. The publish-image run that the merge triggered via on: push: branches: [main] (Gitea action_run 78342, job "Build & push workspace-template-codex image", task 125567) failed at the "Log in to ECR" step with:

Run set -euo pipefail
aws ecr get-login-password --region us-east-2 | docker login --username AWS --password-stdin "${ECR_REGISTRY}"
Failed to initialize: protocol not available
##[error]Process completed with exit code 1.

This is a transient act_runner/AWS-CLI environment flake at the ECR-login step, NOT a code or workflow defect:

  • ci.yml on the same commit (283f371, run 78341) passed.
  • publish-image.yml is byte-identical between this commit and the immediately-prior 858b093 (git diff 858b093 283f371 -- .gitea/workflows/publish-image.yml is empty), and 858b093's publish-image (run 78300) succeeded and pushed sha-858b093.
  • The fix diff 858b093..283f371 only touches Dockerfile/adapter.py/start.sh/requirements/tests.
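
The diff-based checks in the list above can be reproduced from a clone. The sketch below is illustrative only: `file_unchanged_between` and `files_changed_between` are hypothetical helper names (not part of this repo), and it assumes both commits are present locally.

```bash
#!/usr/bin/env bash

# Hypothetical helper: exit 0 when <path> is identical in both commits,
# non-zero otherwise. `git diff --quiet` implies --exit-code, so the exit
# status alone carries the answer and nothing is printed.
file_unchanged_between() {
  local old="$1" new="$2" path="$3"
  git diff --quiet "$old" "$new" -- "$path"
}

# Hypothetical helper: list the files that differ between two commits,
# one path per line.
files_changed_between() {
  local old="$1" new="$2"
  git diff --name-only "$old" "$new"
}

# Example invocation with the SHAs from this note (run inside the repo):
#   file_unchanged_between 858b093 283f371 .gitea/workflows/publish-image.yml \
#     && echo "publish-image.yml unchanged"
#   files_changed_between 858b093 283f371
```

An exit status of 0 from the first helper corresponds to the "empty diff" observation; the second helper shows which paths the fix touched.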

Net effect of the flake: no sha-283f371 image was ever pushed to ECR. The molecule-ai/workspace-template-codex repository holds only sha-741b29b, sha-a051e18, and sha-858b093, with latest pointing at 858b093. The deployed prod codex CP pin is still git_sha 741b29b, which ships codex CLI ~0.57, lacks the CODEX_AUTH_JSON -> ~/.codex/auth.json Mode C path, and defaults to the dead model gpt-5. As a result, the codex-runtime prod workspaces (prod-Reviewer, prod-Researcher) cannot authenticate codex, never start the app-server, and never drain their A2A inbox.
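
One way to confirm which tags the repository holds is to pull the tag list from ECR and check it for the expected sha- tag. This is a minimal sketch, assuming the AWS CLI is configured with read access in us-east-2; `has_tag` is a hypothetical helper that reads whitespace-separated tags on stdin.

```bash
#!/usr/bin/env bash

# Hypothetical helper: exit 0 if the wanted tag appears (exact match) among
# the whitespace-separated tags read from stdin, non-zero otherwise.
has_tag() {
  local wanted="$1"
  # Put one tag per line, then match the whole line as a fixed string.
  tr -s '[:space:]' '\n' | grep -Fx -- "$wanted" >/dev/null
}

# Example invocation (needs AWS credentials; not run here):
#   aws ecr describe-images --region us-east-2 \
#     --repository-name molecule-ai/workspace-template-codex \
#     --query 'imageDetails[].imageTags[]' --output text \
#     | has_tag sha-283f371 || echo "sha-283f371 not published"
```

The exact-match test matters here: a substring match would let a prefix of a longer tag pass as present.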

Gitea 1.22.6 exposes no REST endpoint for workflow rerun or workflow_dispatch.inputs (requests return 404), so the canonical rerun mechanism is a fresh commit to main. This PR carries no functional change: it exists solely to re-trip publish-image and produce the sha-<merge> image of the already-merged, already-reviewed 283f371 fix.
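
Because a rerun costs a fresh commit to main, it may be worth letting the login step absorb transient failures itself. The following is a sketch under stated assumptions, not the current contents of publish-image.yml: `retry` and `ecr_login` are hypothetical helpers, and the attempt count and delay are placeholder values. Whether a retry would have cleared this particular "protocol not available" error is not known.

```bash
#!/usr/bin/env bash

# Hypothetical helper: run a command up to <max_attempts> times, sleeping
# <delay> seconds between attempts and doubling the delay each time.
# <delay> must be a whole number of seconds.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  while true; do
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "retry: giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt ${attempt} failed, sleeping ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
    delay=$((delay * 2))
  done
}

# The login pipeline from the failing step, wrapped so it retries as a unit.
# The body runs in a subshell so pipefail does not leak into the caller.
ecr_login() (
  set -o pipefail
  aws ecr get-login-password --region us-east-2 \
    | docker login --username AWS --password-stdin "${ECR_REGISTRY}"
)

# Example use inside the workflow step (assumed values):
#   retry 3 5 ecr_login
```

Wrapping the whole pipeline, rather than only the `aws` call, means a fresh password is fetched on every attempt.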

After this lands and publish-image succeeds, the new image digest must be promoted onto the codex CP runtime-image pin. There is no auto-promoter, so this is a manual psql/CP pin bump (same class as MEMORY.md "codex pin auto-promote gap"). The two codex prod workspaces must then be restarted or re-provisioned to pull the new image.