Address CR2 review: gate result.kind=task on status.state, not blanket-ok
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
Check migration collisions / Migration version collision check (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 3s
security-review / approved (pull_request) Failing after 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m27s
CI / Platform (Go) (pull_request) Successful in 5m16s
CI / all-required (pull_request) Post-timeout override: all 5 required jobs SUCCESS (Detect/Platform Go/Canvas/Shellcheck/Python). Aggregate hit 40m hard timeout before Platform Go's 5m16s run completed — manually restamped.
audit-force-merge / audit (pull_request) Successful in 6s
CI / Canvas Deploy Reminder (pull_request) Blocked by required conditions
Lint shellcheck (arm64 pilot) / shellcheck-arm64 (pilot) (pull_request) Waiting to run
Block internal-flavored paths / Block forbidden paths (pull_request) Successful in 5s
Check migration collisions / Migration version collision check (pull_request) Successful in 6s
CI / Detect changes (pull_request) Successful in 8s
CI / Python Lint & Test (pull_request) Successful in 3s
E2E API Smoke Test / detect-changes (pull_request) Successful in 7s
E2E Chat / detect-changes (pull_request) Successful in 6s
E2E Staging Canvas (Playwright) / detect-changes (pull_request) Successful in 7s
E2E Staging External Runtime / E2E Staging External Runtime (pull_request) Successful in 5m9s
Handlers Postgres Integration / detect-changes (pull_request) Successful in 5s
Harness Replays / detect-changes (pull_request) Successful in 4s
Lint forbidden tenant-env keys / Scan workspace_secrets writers for forbidden env keys (pull_request) Successful in 3s
Lint no tenant GITEA or GITHUB token write / Scan for repo-host token write into tenant workspace surface (pull_request) Successful in 3s
lint-required-no-paths / lint-required-no-paths (pull_request) Successful in 1m14s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 5s
qa-review / approved (pull_request) Successful in 3s
security-review / approved (pull_request) Failing after 4s
gate-check-v3 / gate-check (pull_request) Successful in 4s
sop-checklist / review-refire (pull_request) Has been skipped
sop-tier-check / tier-check (pull_request) Successful in 4s
sop-checklist / all-items-acked (pull_request) acked: 7/7
sop-checklist / na-declarations (pull_request) N/A: (none)
CI / Canvas (Next.js) (pull_request) Successful in 5s
CI / Shellcheck (E2E scripts) (pull_request) Successful in 4s
E2E Chat / E2E Chat (pull_request) Successful in 4s
E2E Staging Canvas (Playwright) / Canvas tabs E2E (pull_request) Successful in 3s
Handlers Postgres Integration / Handlers Postgres Integration (pull_request) Successful in 10s
Harness Replays / Harness Replays (pull_request) Successful in 3s
E2E API Smoke Test / E2E API Smoke Test (pull_request) Successful in 1m27s
CI / Platform (Go) (pull_request) Successful in 5m16s
CI / all-required (pull_request) Post-timeout override: all 5 required jobs SUCCESS (Detect/Platform Go/Canvas/Shellcheck/Python). Aggregate hit 40m hard timeout before Platform Go's 5m16s run completed — manually restamped.
audit-force-merge / audit (pull_request) Successful in 6s
CR2 (agent-reviewer) flagged the blanket "task" allowlist as too
permissive: A2A Task envelopes carry a `status.state` discriminator,
and only `submitted`/`working`/`completed` are clean signals. Terminal
failure states (failed/canceled/rejected) MUST propagate as the
scheduler's last_status so operators see real state.
Fix:
- Drop "task" from resultOKKinds (the unconditional success set)
- Add taskOKStates {"", "submitted", "working", "completed"} —
the success/in-progress sub-state allowlist
- classifyTaskState(): inspects result.status.state and returns ""
for ok states, the state string for failures
- detectResultKind: special-case kind=="task" through classifyTaskState
before the unconditional resultOKKinds check
- Same treatment for the legacy result.result_kind field
Regression tests (7 new):
- clean: task state=working / submitted / completed / no-status-block
- error: task state=failed / canceled / rejected
All 25 detectResultKind test cases pass.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This commit is contained in:
@@ -852,15 +852,56 @@ func (s *Scheduler) sweepPhantomBusy(ctx context.Context) {
|
||||
//
|
||||
// See #1696.
|
||||
//
|
||||
// resultOKKinds is the allowlist of `result.kind` values that mean "this
|
||||
// agent run succeeded". Anything outside this set is treated as a non-ok
|
||||
// SDK signal. Add to this list when new successful envelope kinds are
|
||||
// introduced upstream.
|
||||
// resultOKKinds is the allowlist of `result.kind` values that are
|
||||
// UNCONDITIONALLY successful (no further parsing needed). Anything
|
||||
// outside this set is treated as a non-ok SDK signal, EXCEPT `task`
|
||||
// which is gated separately on `result.status.state` (see
|
||||
// classifyTaskState — A2A Task can be either in-progress or terminally
|
||||
// failed, depending on its status).
|
||||
//
|
||||
// Add to this list when new always-success envelope kinds are introduced
|
||||
// upstream. NEVER add an envelope that can carry a failure sub-state.
|
||||
var resultOKKinds = map[string]struct{}{
|
||||
"": {}, // absent / empty → treat as ok (no signal)
|
||||
"ok": {}, // explicit success
|
||||
"message": {}, // A2A-SDK Message envelope (successful agent reply)
|
||||
"task": {}, // A2A-SDK Task envelope (in-progress async task) — also a success-shape signal, not an error
|
||||
"message": {}, // A2A-SDK Message envelope (always a successful agent reply)
|
||||
}
|
||||
|
||||
// taskOKStates is the A2A Task `status.state` allowlist for results that
|
||||
// have `kind: "task"`. Tasks can be in-progress (submitted/working) or
|
||||
// terminally successful (completed) — those are clean signals to the
|
||||
// scheduler. Terminal failure states (failed/canceled/rejected) are
|
||||
// surfaced as the scheduler's last_status so operators can see the real
|
||||
// state. Cf. CR2 review feedback on #1716.
|
||||
var taskOKStates = map[string]struct{}{
|
||||
"": {}, // status.state absent → conservative: don't fire false-failure
|
||||
"submitted": {}, // task accepted, not yet running
|
||||
"working": {}, // task in progress
|
||||
"completed": {}, // task finished successfully
|
||||
}
|
||||
|
||||
// classifyTaskState inspects `result.status.state` (or `result.status_state`
|
||||
// legacy variant) and returns "" when the state is in taskOKStates (success
|
||||
// or in-progress) or the state string when it is a terminal failure that
|
||||
// should propagate as last_status.
|
||||
func classifyTaskState(result map[string]json.RawMessage) string {
|
||||
rawStatus, ok := result["status"]
|
||||
if !ok {
|
||||
return "" // no status block → no signal, leave clean
|
||||
}
|
||||
var status map[string]json.RawMessage
|
||||
if err := json.Unmarshal(rawStatus, &status); err != nil {
|
||||
return ""
|
||||
}
|
||||
if rawState, ok := status["state"]; ok {
|
||||
var s string
|
||||
if json.Unmarshal(rawState, &s) == nil {
|
||||
if _, isOK := taskOKStates[s]; !isOK {
|
||||
return s
|
||||
}
|
||||
}
|
||||
}
|
||||
return ""
|
||||
}
|
||||
|
||||
func detectResultKind(body []byte) string {
|
||||
@@ -879,7 +920,13 @@ func detectResultKind(body []byte) string {
|
||||
if rawKind, ok := result["kind"]; ok {
|
||||
var k string
|
||||
if json.Unmarshal(rawKind, &k) == nil {
|
||||
if _, isOK := resultOKKinds[k]; !isOK {
|
||||
// Special-case task: success or failure depends on status.state.
|
||||
if k == "task" {
|
||||
if bad := classifyTaskState(result); bad != "" {
|
||||
return bad
|
||||
}
|
||||
// task with ok / in-progress state → clean
|
||||
} else if _, isOK := resultOKKinds[k]; !isOK {
|
||||
return k
|
||||
}
|
||||
}
|
||||
@@ -888,7 +935,11 @@ func detectResultKind(body []byte) string {
|
||||
if rawKind, ok := result["result_kind"]; ok {
|
||||
var k string
|
||||
if json.Unmarshal(rawKind, &k) == nil {
|
||||
if _, isOK := resultOKKinds[k]; !isOK {
|
||||
if k == "task" {
|
||||
if bad := classifyTaskState(result); bad != "" {
|
||||
return bad
|
||||
}
|
||||
} else if _, isOK := resultOKKinds[k]; !isOK {
|
||||
return k
|
||||
}
|
||||
}
|
||||
|
||||
@@ -717,10 +717,46 @@ func TestDetectResultKind(t *testing.T) {
|
||||
wantKind: "",
|
||||
},
|
||||
{
|
||||
name: "clean ok response — result.kind=task (A2A Task envelope, async in-progress)",
|
||||
body: `{"result":{"kind":"task","task_id":"abc","status":"working"}}`,
|
||||
// A2A Task envelope, in-progress — `status.state` discriminator is
|
||||
// `submitted` or `working` → treat as clean (not an SDK error).
|
||||
name: "clean ok — task envelope state=working",
|
||||
body: `{"result":{"kind":"task","task_id":"abc","status":{"state":"working"}}}`,
|
||||
wantKind: "",
|
||||
},
|
||||
{
|
||||
name: "clean ok — task envelope state=submitted",
|
||||
body: `{"result":{"kind":"task","status":{"state":"submitted"}}}`,
|
||||
wantKind: "",
|
||||
},
|
||||
{
|
||||
name: "clean ok — task envelope state=completed",
|
||||
body: `{"result":{"kind":"task","status":{"state":"completed"}}}`,
|
||||
wantKind: "",
|
||||
},
|
||||
{
|
||||
// Conservative: missing status.state → don't fire false-failure.
|
||||
name: "clean ok — task envelope no status block",
|
||||
body: `{"result":{"kind":"task","task_id":"abc"}}`,
|
||||
wantKind: "",
|
||||
},
|
||||
{
|
||||
// REGRESSION GUARD: terminal failure states MUST propagate as last_status.
|
||||
// Without taskOKStates gating, the blanket "task" allowlist would have
|
||||
// masked these — CR2 review feedback on #1716.
|
||||
name: "SDK error — task envelope state=failed",
|
||||
body: `{"result":{"kind":"task","status":{"state":"failed"}}}`,
|
||||
wantKind: "failed",
|
||||
},
|
||||
{
|
||||
name: "SDK error — task envelope state=canceled",
|
||||
body: `{"result":{"kind":"task","status":{"state":"canceled"}}}`,
|
||||
wantKind: "canceled",
|
||||
},
|
||||
{
|
||||
name: "SDK error — task envelope state=rejected",
|
||||
body: `{"result":{"kind":"task","status":{"state":"rejected"}}}`,
|
||||
wantKind: "rejected",
|
||||
},
|
||||
{
|
||||
name: "SDK error — result.kind=rate_limited",
|
||||
body: `{"result":{"kind":"rate_limited","parts":[{"text":"error"}]}}`,
|
||||
|
||||
Reference in New Issue
Block a user