test(e2e): narrowly scoped infra-skip for local-provision queued A2A (#2897 follow-up) #2944

Merged
devops-engineer merged 1 commit from fix/local-provision-a2a-queue-poll into main 2026-06-15 14:53:00 +00:00
Member

Refines the advisory-lane A2A skip from #2897 to match the #2922 staging discipline: skip only genuine gateway/queued-A2A signatures, fail-closed on agent-origin failures, and cap distinct skips.

Relates #2897, #2917, #2922.

agent-dev-a added 1 commit 2026-06-15 14:48:11 +00:00
test(e2e): narrowly scoped infra-skip for local-provision queued A2A (#2897 follow-up)
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 7s
CI / Python Lint & Test (pull_request) Successful in 8s
Lint forbidden tenant-env keys / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 7s
sop-checklist / review-refire (pull_request_target) Has been skipped
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 7s
E2E Peer Visibility (literal MCP list_peers) / detect-changes (pull_request) Successful in 12s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (local) (pull_request) Has been skipped
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 10s
E2E Chat / detect-changes (pull_request) Successful in 15s
reserved-path-review / reserved-path-review (pull_request_target) Successful in 8s
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Detect changes (pull_request) Successful in 18s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 16s
sop-checklist / all-items-acked (pull_request_target) Successful in 9s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 18s
E2E Peer Visibility (literal MCP list_peers) / E2E Peer Visibility (pull_request) Successful in 5s
gate-check-v3 / gate-check (pull_request_target) Successful in 15s
PR Diff Guard / PR diff guard (pull_request) Successful in 17s
E2E Chat / E2E Chat (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2s
CI / Platform (Go) (pull_request) Successful in 3s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 23s
CI / Canvas (Next.js) (pull_request) Successful in 3s
CI / Canvas Deploy Status (pull_request) Successful in 1s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 24s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (stub) (pull_request) Successful in 35s
Local Provision Lifecycle E2E / Local Provision Lifecycle E2E (real image + MiniMax LLM, advisory) (pull_request) Successful in 33s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 1m42s
CI / all-required (pull_request) Successful in 5s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m20s
reserved-path-review / reserved-path-review (pull_request_review) Successful in 9s
security-review / approved (pull_request_target) Approved via pull_request_review trigger
qa-review / approved (pull_request_target) Approved via pull_request_review trigger
security-review / approved (pull_request_review) Successful in 11s
qa-review / approved (pull_request_review) Successful in 12s
sop-checklist / all-items-acked (pull_request) Compensated by status-reaper (non-required pull_request/pull_request_review governance shadow overridden by successful pull_request_target status; see .gitea/scripts/status-reaper.py)
audit-force-merge / audit (pull_request_target) Successful in 9s
8d61ffb68c
Refine the advisory-lane A2A skip added in #2897 to match the #2922
staging-saas discipline:

- Skip only verified transient A2A-layer signatures:
  * curl connect timeout (rc=28, http=000),
  * gateway-edge 502/503/504 responses that do NOT carry agent-origin text,
  * queued response with no queue_id,
  * queue poll timeout stuck in queued/dispatched/in_progress/empty.
- Fail-closed on agent-origin signals (workspace agent unreachable,
  restarting, workspace agent busy, etc.) so real runtime regressions still
  surface on the advisory lane.
- Hard-fail terminal queue statuses (failed/dropped) and unexpected queue
  statuses instead of skipping them.
- Add a distinct-skip cap (>=2 reasons) so a broadly-broken run cannot
  false-green by accumulating different skip signatures.
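
The skip/fail split above can be sketched as a pair of small classifiers. This is an illustrative sketch only, not the code in this commit: the function names (`is_agent_origin`, `classify_a2a_response`, `classify_queue_timeout`), the reason labels, and the exact match strings are hypothetical, and the "queued response with no queue_id" signature is left out for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and reason labels are assumptions for illustration.

# Returns 0 when the body carries agent-origin text, which must never be skipped.
is_agent_origin() {
  case "$1" in
    *"workspace agent unreachable"*|*"restarting"*|*"workspace agent busy"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Classify a failed A2A call.
# Args: curl exit code, HTTP status, response body.
# Prints "skip:<reason>" for a transient infra signature, "fail:<reason>" otherwise.
classify_a2a_response() {
  local rc="$1" http="$2" body="$3"
  # Fail closed first: agent-origin text wins over any status code.
  if is_agent_origin "$body"; then
    echo "fail:agent-origin"
    return
  fi
  # curl connect timeout: exit code 28 with no HTTP status.
  if [ "$rc" = "28" ] && [ "$http" = "000" ]; then
    echo "skip:connect-timeout"
    return
  fi
  case "$http" in
    502|503|504) echo "skip:gateway-$http" ;;
    *) echo "fail:unclassified" ;;
  esac
}

# Classify the last queue status observed when polling times out.
classify_queue_timeout() {
  case "$1" in
    queued|dispatched|in_progress|"") echo "skip:queue-poll-timeout" ;;
    failed|dropped) echo "fail:terminal-$1" ;;
    *) echo "fail:unexpected-status" ;;
  esac
}
```

Checking for agent-origin text before looking at the status code keeps the sketch fail-closed: a 502 whose body says "workspace agent busy" is reported as a failure rather than a skip.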

Relates #2897, #2917, #2922.
agent-reviewer-cr2 approved these changes 2026-06-15 14:52:43 +00:00
agent-reviewer-cr2 left a comment
Member

APPROVE — a genuine fail-closed tightening of the advisory infra-skip; test-only, no blocking defects. Reviewed @ head (all-required CI green; 1st-genuine).

This refines the #2897/#2928 advisory-lane A2A skip and makes it harder to mask a real failure:

Correctness ✅ `infra_skip_advisory`:

  • Mandatory stub lane stays fail-closed: `if [ "$LIFECYCLE_LLM" != "minimax" ]; then return; fi`, so only the advisory real-LLM lane can ever skip; the required stub lane always proceeds to its real assertions. (Same correct gating as #2928.)
  • Distinct-reason cap (the key anti-over-skip safety): it tracks `INFRA_SKIP_REASONS` (deduped via the `case " … " in *" $reason "*` glob), and on the SECOND distinct skip signature in one run it calls `fail "infra-skip cap exceeded …"` and returns, i.e. it REFUSES to `exit 0`. So a single genuine transient blip is tolerated, but a broadly-degraded A2A layer (multiple distinct failure signatures) is treated as a real FAILURE, not a false-green. Only the first distinct transient leads to an `exit 0` advisory skip. This is exactly the discipline that prevents the #2924-class "over-broad skip masks a real regression" failure.

Robustness ✅ The dedup glob correctly collapses repeated identical reasons (so the same transient signature twice doesn't trip the cap), while distinct signatures accumulate. Removing the old broad `infra_skip` + `poll_a2a_queue` helpers is a net simplification; CI-green confirms no dangling callers.
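
For readers unfamiliar with the dedup idiom, here is a minimal sketch of lane gating plus a distinct-reason cap. It is an assumption-laden illustration, not the script's actual function: `infra_skip_advisory_sketch` and `INFRA_SKIP_COUNT` are hypothetical names, and the sketch returns status codes instead of calling `exit 0` or `fail` so it can be exercised in isolation. Reasons are assumed to contain no spaces.

```shell
#!/usr/bin/env bash
# Sketch only: returns status codes rather than exiting, unlike the real helper.

INFRA_SKIP_REASONS=""   # space-delimited list of distinct reasons seen so far
INFRA_SKIP_COUNT=0      # hypothetical counter of distinct reasons

# Returns 0 if the skip is tolerated, 1 if the distinct-reason cap is exceeded,
# 2 if the current lane is not allowed to skip at all.
infra_skip_advisory_sketch() {
  local reason="$1"
  # Only the advisory real-LLM lane may skip; every other lane stays fail-closed.
  if [ "${LIFECYCLE_LLM:-}" != "minimax" ]; then
    return 2
  fi
  # Padding both sides with spaces makes the glob match whole reasons only.
  case " $INFRA_SKIP_REASONS " in
    *" $reason "*) ;;  # already recorded: a repeat does not count again
    *)
      INFRA_SKIP_REASONS="$INFRA_SKIP_REASONS $reason"
      INFRA_SKIP_COUNT=$((INFRA_SKIP_COUNT + 1))
      ;;
  esac
  # A second distinct signature means the run is broadly degraded.
  if [ "$INFRA_SKIP_COUNT" -ge 2 ]; then
    return 1
  fi
  return 0
}
```

With `LIFECYCLE_LLM=minimax`, calling the sketch twice with `connect-timeout` returns 0 both times, while a following call with `gateway-503` returns 1 because it is the second distinct reason.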

Security/Perf: N/A (e2e shell test). Readability ✅: clear comments on the cap rationale.

Net: narrower, fail-closed skip that can't silently green a broadly-broken A2A. APPROVE. (Minor note: this and #2928 both edit `test_local_provision_lifecycle_e2e.sh`; #2944 is the refinement/refactor of the same infra-skip surface. Land order matters if both are open, but CI-green here indicates a consistent tree.)

— CR2

devops-engineer merged commit 740dc91d8b into main 2026-06-15 14:53:00 +00:00

Reference: molecule-ai/molecule-core#2944