3269e93216
ci-arm64-advisory / fast-checks (pull_request) Waiting to run
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Successful in 11s
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 22s
CI / Python Lint & Test (pull_request) Successful in 16s
CI / Detect changes (pull_request) Successful in 38s
E2E API Smoke Test / detect-changes (pull_request) Successful in 22s
E2E Chat / detect-changes (pull_request) Successful in 13s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 10s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 7s
Harness Replays / detect-changes (pull_request) Successful in 5s
Lint curl status-code capture / Scan workflows for curl status-capture pollution (pull_request) Successful in 12s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
E2E Staging SaaS (full lifecycle) / pr-validate (pull_request) Successful in 34s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 4s
lint-continue-on-error-tracking / lint-continue-on-error-tracking (pull_request) Failing after 1m7s
lint-mask-pr-atomicity / lint-mask-pr-atomicity (pull_request) Successful in 1m14s
Lint pre-flip continue-on-error / Verify continue-on-error flips have run-log proof (pull_request) Successful in 1m11s
lint-required-workflows-docker-host-pinned / Lint docker-host pin on docker-touching workflows (pull_request) Successful in 8s
lint-required-context-exists-in-bp / lint-required-context-exists-in-bp (pull_request) Successful in 1m24s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 4s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m3s
gate-check-v3 / gate-check (pull_request) Successful in 5s
qa-review / approved (pull_request) Failing after 8s
security-review / approved (pull_request) Failing after 12s
verify-providers-gen / Regenerate providers artifact and fail on drift (pull_request) Successful in 27s
sop-checklist / na-declarations (pull_request) N/A: (none)
sop-checklist / all-items-acked (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 6s
Lint workflow YAML (Gitea-1.22.6-hostile shapes) / Lint workflow YAML for Gitea-1.22.6-hostile shapes (pull_request) Successful in 1m25s
E2E Staging SaaS (full lifecycle) / E2E Staging SaaS (pull_request) Failing after 4m31s
CI / Canvas (Next.js) (pull_request) Successful in 6s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 17s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 8s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 2m12s
Harness Replays / Harness Replays (pull_request) Successful in 4s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 2m16s
CI / Platform (Go) (pull_request) Successful in 6m10s
CI / Canvas Deploy Reminder (pull_request) Has been skipped
CI / all-required (pull_request) Successful in 13m28s
The A2A e2e historically asserted only response SHAPE (test_a2a_e2e.sh
checked '"kind":"text"' only). A broken agent returns its error AS a
text part -- {"kind":"text","text":"Agent error (Exception) ..."} --
which STILL matches the shape check, so it PASSED on a fully broken
agent. That is why the 2026-05-2x drained-key / byok-misroute failures
(agents-team PM + reno marketing erroring on every LLM call) sailed
through CI. "Channel returns text shape" is not "agent completed an LLM
round-trip."
Adds, ADDITIVELY (no existing assertion weakened or removed):
- tests/e2e/lib/completion_assert.sh -- reusable gates:
* a2a_assert_real_completion: deterministic known-answer round-trip;
asserts CONTAINS the expected token AND NOT an error-as-text marker
(Agent error / Exception / error result / MISSING_BYOK_CREDENTIAL).
* provider_liveness_matrix + offered_platform_models_for_runtime:
per-offered-provider cheap (max_tokens:4) probe; the offered set is
read from the providers.yaml SSOT (runtimes.<rt>.providers[platform]
.models) -- not a hardcoded list -- so the matrix tracks the SSOT.
* assert_byok_not_platform_proxy: #1994 regression guard -- a
byok-resolving workspace must NOT resolve platform_managed (reads the
same derived resolver GET /admin/workspaces/:id/llm-billing-mode the
provision strip gate uses).
- tests/e2e/test_staging_full_saas.sh (the live-agent lane, MiniMax
primary): new stanzas 8b (PINEAPPLE known-answer, the core gate),
8c (byok-routing guard), 8d (SSOT-driven per-provider liveness matrix).
- tests/e2e/test_a2a_e2e.sh: added check_no_error_as_text on Echo + SEO
replies so the brief's literal shape-only example now FAILS on an
error-as-text payload.
- tests/e2e/test_completion_assert_unit.sh: offline fail-direction proof
(16 cases) that the negative gates are load-bearing -- error-as-text
MUST fail, platform_managed MUST trip the #1994 guard. Wired into
ci.yml "Run E2E bash unit tests (no live infra)" (required, per-PR +
main). e2e-staging-saas.yml paths filter extended to re-trigger the
live lane on lib changes.
No #1994 fix code touched -- tests/e2e + workflow wiring only.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>