tier:low and tier:high are OR gates — any one positive verdict
is sufficient. The previous implementation required ALL groups to have
positive verdicts, causing INCOMPLETE even when core-devops APPROVED
and core-lead was absent.
Now uses tier-specific logic:
- tier:low / tier:high (OR): any positive = CLEAR
- tier:medium (AND): all positive = CLEAR
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Paginate all list endpoints (comments, reviews) to handle PRs with
many comments without missing entries. Uses per_page=100 with page
increment loop, safety-capped at 20 pages.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Gitea reviews use "submitted_at" not "created_at" for when the review
was submitted. The earlier signal_1_comment_scan fix (inherited from
sop-tier-check investigation) already handled this; signal_2 and
signal_3 were missing the same correction.
Fixes KeyError: 'created_at' on PRs with no comments/reviews.
Includes the individual-check-status fix (use "status" not "state").
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Gitea Actions API uses "status" (pending/success/failure) not "state"
for individual status entries. The "state" field is null for pending
runs. This caused all_check_statuses to show Python null instead of
"pending" for queued jobs.
Also verified on PR #391 and PR #393 — individual checks now correctly
display "pending" while combined_state is "pending" (CI_PENDING verdict).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
SOP-6 + CI gate checker for Gitea PRs. Detects:
- Signal 1: Author-aware agent-tag comment scan (tier-aware)
- Signal 2: REQUEST_CHANGES reviews state machine
- Signal 3: Staleness detection (SOP-12)
- Signal 6: CI required-checks awareness
Post `[gate-check-v3] STATUS:` comment on PRs. CLI + Gitea Actions
workflow (cron hourly + PR-triggered).
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Audit finding: every workflow that emits a required-status-check name
on molecule-core's branch protection (apply.sh's STAGING_CHECKS +
MAIN_CHECKS) ALREADY uses the safe always-runs-with-conditional-steps
shape — Platform/Canvas/Python/Shellcheck in ci.yml, Canvas tabs E2E
in e2e-staging-canvas.yml, E2E API Smoke in e2e-api.yml, PR-built
wheel in runtime-prbuild-compat.yml, the codeql Analyze matrix, and
the always-on Secret scan + Detect changes. No production drift to
fix today.
Adds a regression-guard so the next path-filter / matrix refactor /
workflow rename can't silently re-introduce the bug shape called out
in saved memory feedback_branch_protection_check_name_parity:
"Path filters … silently break branch protection because no job
emits the protected sentinel status when path-filter returns false."
New tools:
- tools/branch-protection/check_name_parity.sh — extracts every
required check name from apply.sh's heredocs, then for each name
classifies the owning workflow as safe (no top-level paths:) /
safe (per-step if-gates without top-level paths:) / unsafe
(top-level paths: without per-step if-gates) / unsafe-mix
(top-level paths: WITH per-step if-gates — the workflow may still
skip entirely on path exclusion, leaving the gates dormant) /
missing (no emitter at all). Special-cases codeql.yml's matrix-
expanded `Analyze (${{ matrix.language }})`.
- tools/branch-protection/test_check_name_parity.sh — 6 unit tests
covering each classification: safe, unsafe-path-filter, missing,
safe-with-per-step-gates, unsafe-mix, matrix-expansion. Each test
builds a synthetic apply.sh + workflow file in a tmpdir, invokes
the script, and asserts on exit code + stderr substring. Per
feedback_assert_exact_not_substring the assertions pin specific
classifications, not just non-zero exit.
Wired into branch-protection-drift.yml so every PR touching
.github/workflows/** runs the parity check; the existing daily
schedule covers between-PR drift. The check is cheap (~1s) and runs
without the admin token — only reads files in the checkout. Self-
test step runs the unit tests on every invocation, so a regression
in the script can't false-pass on production.
Per BSD-vs-GNU portability hygiene: heredoc-marker extraction stays
in plain awk + sed (no gawk-only `match()` array form), grep regex
avoids `^` anchor for `if:` lines because real workflows use
` - if:` with the `-` step-marker between leading spaces and
`if:` (the original anchor missed every workflow's per-step gates).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Per documentation-specialist's grep agent (2026-05-07T07:30, see
internal#46): runtime-breaking ghcr.io references in shell scripts +
docker-compose + the slip-past-workflow lint_secret_pattern_drift.py
all need migration. These were missed by security-auditor's
workflow-only audit.
Files (6):
- .github/scripts/lint_secret_pattern_drift.py:40 — workspace-runtime
pre-commit-checks.sh consumer URL: raw.githubusercontent.com →
Gitea raw URL (https://git.moleculesai.app/molecule-ai/.../raw/
branch/main/...). The lint job runs in CI and would 404 today.
- scripts/refresh-workspace-images.sh:54 — workspace-template image
pull URL: ghcr.io → ECR (153263036946.dkr.ecr.us-east-2.amazonaws.com).
- scripts/rollback-latest.sh — full rewrite of header + auth flow:
* ghcr.io/molecule-ai/{platform,platform-tenant} → ECR
* GITHUB_TOKEN with write:packages → AWS ECR auth
(aws ecr get-login-password). Per saved memory
reference_post_suspension_pipeline, prod cutover is to ECR.
* Updated header docs to match new auth flow + prereqs.
- scripts/demo-freeze.sh:13,17 — comment-only ghcr → ECR
(the script doesn't currently exec these URLs, but the comments
describe the cascade and need to match reality).
- docker-compose.yml:215-216 — canvas image: ghcr.io → ECR + updated
the auth comment to describe `aws ecr get-login-password` flow.
- tools/check-template-parity.sh:21 — inline curl install instructions:
raw.githubusercontent.com → Gitea raw URL.
Hostile self-review:
1. rollback-latest.sh's GITHUB_TOKEN→aws-cli auth swap is a behavior
change. Operators using this script now need aws CLI
authenticated for region us-east-2 with ECR pull/push perms.
Documented in updated header. Operators who don't have aws CLI
will get 'aws: command not installed' which is a clear failure
mode (not silent).
2. The Gitea raw URL shape (/raw/branch/main/) differs from GitHub's
raw.githubusercontent.com structure. Verified pattern by
inspecting other Gitea raw URLs in the codebase. If Gitea's URL
changes (1.23+), update via the same one-line edit.
3. Doesn't touch packer/scripts/install-base.sh which has a similar
ghcr.io ref per the grep agent's findings — that's bigger-scope
(packer build pipeline) and lives in molecule-controlplane-ish
territory; filing as parked follow-up under #46 if not already.
Refs: molecule-ai/internal#46, molecule-ai/internal#37,
molecule-ai/internal#38, saved memory reference_post_suspension_pipeline
Multi-model review of #2827 caught: the script as-shipped would have
silently weakened branch protection on EVERY non-checks dimension
the moment anyone ran it. Live staging had
enforce_admins=true, dismiss_stale_reviews=false, strict=true,
allow_fork_syncing=false, bypass_pull_request_allowances={
HongmingWang-Rabbit + molecule-ai app
}
Script wrote the opposite for all five. Per memory
feedback_dismiss_stale_reviews_blocks_promote.md, the
dismiss_stale_reviews flip alone is the load-bearing one — would
silently re-block every auto-promote PR (cost user 2.5h once).
This PR:
1. apply.sh: per-branch payloads (build_staging_payload /
build_main_payload) that codify the deliberate per-branch policy
already on the repo, with the script's net contribution being
ONLY the new check names (Canvas tabs E2E + E2E API Smoke on
staging, Canvas tabs E2E on main).
2. apply.sh: R3 preflight that hits /commits/{sha}/check-runs and
asserts every desired check name has at least one historical run
on the branch tip. Catches typos like "Canvas Tabs E2E" vs
"Canvas tabs E2E" — pre-fix a typo would silently block every PR
forever waiting for a context that never emits. Skip via
--skip-preflight for genuinely-new workflows whose first run
hasn't fired.
3. drift_check.sh: compares the FULL normalised payload (admin,
review, lock, conversation, fork-syncing, deletion, force-push)
not just the checks list. Pre-fix the drift gate would have
missed a UI click that flipped enforce_admins or
dismiss_stale_reviews. Drops app_id from the comparison since
GH auto-resolves -1 to a specific app id post-write.
4. branch-protection-drift.yml: per memory
feedback_schedule_vs_dispatch_secrets_hardening.md — schedule +
pull_request triggers HARD-FAIL when GH_TOKEN_FOR_ADMIN_API is
missing (silent skip masks the gate disappearing).
workflow_dispatch keeps soft-skip for one-off operator runs.
Verified by running drift_check against live state: pre-fix would
have shown 5 destructive drifts on staging + 5 on main. Post-fix
shows ONLY the 2 intended additions on staging + 1 on main, which
go away after `apply.sh` runs.
Closes#9.
Three pieces, all small:
1. **docs/e2e-coverage.md** — source of truth for which E2E suites
guard which surfaces. Today three were running but informational
only on staging; that's how the org-import silent-drop bug shipped
without a test catching it pre-merge. Now the matrix shows what's
required where + a follow-up note for the two suites that need an
always-emit refactor before they can be required.
2. **tools/branch-protection/apply.sh** — branch protection as code.
Lets `staging` and `main` required-checks live in a reviewable
shell script instead of UI clicks that get lost between admins.
This PR's net change: add `E2E API Smoke Test` and `Canvas tabs E2E`
as required on staging. Both already use the always-emit path-filter
pattern (no-op step emits SUCCESS when the workflow's paths weren't
touched), so making them required can't deadlock unrelated PRs.
3. **branch-protection-drift.yml** — daily cron + drift_check.sh
that compares live protection against apply.sh's desired state.
Catches out-of-band UI edits before they drift further. Fails the
workflow on mismatch; ops re-runs apply.sh or updates the script.
Out of scope (filed as follow-ups):
- e2e-staging-saas + e2e-staging-external use plain `paths:` filters
and never trigger when paths are unchanged. They need refactoring
to the always-emit shape (same as e2e-api / e2e-staging-canvas)
before they can be required.
- main branch protection mirrors staging here; if main wants the
E2E SaaS / External added later, do it in apply.sh and rerun.
Operator must apply once after merge:
bash tools/branch-protection/apply.sh
The drift check picks it up from there.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three complementary regression tests for the chain of P0s fixed today.
Each targets a specific bug class that reached production, and will
fire loud if any of them regress.
## 1. E2E A2A assertion enhancements (tests/e2e/test_staging_full_saas.sh)
The existing A2A check looked for "error|exception" in the response text,
which was too broad and missed the actual error patterns we hit. Now
matches each known error class individually with a diagnostic fail
message pointing at the exact bug:
- "[hermes-agent error 401]" → hermes #12 (API_SERVER_KEY)
- "hermes-agent unreachable" → gateway process died
- "model_not_found" → hermes #13 (model prefix)
- "Encrypted content is not supported" → hermes #14 (api_mode)
- "Unknown provider" → bridge PROVIDER misconfig
Also asserts the response contains the PONG token the prompt asked for —
catches silent-truncation/echo regressions.
## 2. Hermes install.sh bridge shell harness (tools/test-hermes-bridge.sh)
4 scenarios × 16 assertions, all offline (no docker, no network):
- openai-bridge-happy: OPENAI_API_KEY + openai/gpt-4o →
provider=custom, model="gpt-4o" (prefix stripped),
api_mode=chat_completions
- operator-custom-wins: explicit HERMES_CUSTOM_* → bridge skipped
- openrouter-not-touched: OPENROUTER_API_KEY → provider=openrouter,
slug kept
- non-prefixed-model: bare "gpt-4o" → prefix-strip is a no-op
Runs in <1s, can be wired into template-hermes CI. Pins the exact
config.yaml shape — any drift in derive-provider.sh or the bridge
if-block breaks a test.
## 3. Canvas ConfigTab hermes tests (ConfigTab.hermes.test.tsx)
5 vitest cases covering the #1894 bugs:
- Runtime loads from workspace metadata when config.yaml missing
- "No config.yaml found" red error hidden for hermes
- Hermes info banner shown instead
- Langgraph workspace still sees the red error (regression-guard the
other way)
- config.yaml runtime wins over workspace metadata when present
## Running
bash tools/test-hermes-bridge.sh # 16 assertions
cd canvas && npx vitest run src/components/tabs/__tests__/ConfigTab.hermes.test.tsx # 5 cases
# E2E enhancements ride on the existing staging E2E workflow
## Not yet covered (tracked in #1900)
CP admin delete-tenant EC2 cascade, cp-provisioner instance_id
lookup (#1738), purge audit SQL mismatch (#241), and pq prepared-
statement cache collision (#242). These are in-controlplane-repo
concerns — separate PR with CP-side sqlmock + integration tests.
Closes items in #1900.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>