chore: third-pass review polish — empty-stream gate test + Callable type

Pass 3 review came back Approve with two optional polish items. Both taken to fully converge the loop: 1. Regression test for the empty-stream wedge-clear gate (added in 3c4eef49). A degenerate stream that iterates without raising but emits NEITHER an AssistantMessage NOR a ResultMessage must NOT clear the wedge flag — pre-set wedge persists, the next heartbeat still reports runtime_state="wedged". Pins the gate against future regression. 2. Replaced the type annotation `"dict[str, callable[[dict], str]]"` (lowercase `callable`, string-quoted) with the proper `dict[str, Callable[[dict], str]]` using `Callable` from `collections.abc`. Benign before (`from __future__ import annotations` makes the annotation a string Python never evaluates), but pyright/mypy may flag the lowercase form. 65 Python tests pass. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-25 08:52:32 -07:00 · 2026-04-25 08:52:32 -07:00 · 2ee4b67cab
commit 2ee4b67cab
parent 3c4eef49aa
2 changed files with 37 additions and 2 deletions
--- a/workspace/claude_sdk_executor.py
+++ b/workspace/claude_sdk_executor.py
@ -29,7 +29,7 @@ import asyncio
 import logging
 import os
 import sys
-from collections.abc import AsyncIterator
+from collections.abc import AsyncIterator, Callable
 from dataclasses import dataclass
 from typing import TYPE_CHECKING, Any

@ -160,7 +160,7 @@ def _reset_sdk_wedge_for_test() -> None:
 # table falls through to a generic "🛠 <tool>(…)" line. Order keys by
 # tool frequency so a future contributor can see the high-traffic
 # tools first.
-_TOOL_USE_SUMMARIZERS: "dict[str, callable[[dict], str]]" = {
+_TOOL_USE_SUMMARIZERS: dict[str, Callable[[dict], str]] = {
    "Read":  lambda i: f"📄 Read {i.get('file_path', '?')}",
    "Write": lambda i: f"✍️  Write {i.get('file_path', '?')}",
    "Edit":  lambda i: f"✏️  Edit {i.get('file_path', '?')}",
--- a/workspace/tests/test_claude_sdk_executor.py
+++ b/workspace/tests/test_claude_sdk_executor.py
@ -1354,3 +1354,38 @@ async def test_execute_clears_wedge_on_successful_query():
            assert _executor_mod.wedge_reason() == ""
        finally:
            _executor_mod._reset_sdk_wedge_for_test()
+
+
+@pytest.mark.asyncio
+async def test_execute_does_not_clear_wedge_on_empty_stream():
+    """Regression for the gate added in 3c4eef49: a stream that
+    iterates without raising but emits NEITHER an AssistantMessage
+    NOR a ResultMessage (degenerate or stub-driven shape) must NOT
+    clear the wedge flag. A real successful query yields at least
+    one of those; treating an empty stream as "recovered" would
+    falsely flip the workspace back to online without any evidence
+    the SDK is actually working."""
+    _executor_mod._reset_sdk_wedge_for_test()
+    _executor_mod._mark_sdk_wedged("pre-existing wedge — must not clear on empty stream")
+    assert _executor_mod.is_wedged() is True
+
+    e = _make_executor()
+    ctx = _make_context(["test prompt"])
+    eq = _make_event_queue()
+
+    async def empty_query(prompt, options):
+        # Iterator returns without yielding — the degenerate case.
+        if False:
+            yield  # pragma: no cover
+
+    with patch("claude_sdk_executor.recall_memories", new=AsyncMock(return_value="")), \
+         patch("claude_sdk_executor.read_delegation_results", return_value=""), \
+         patch("claude_sdk_executor.commit_memory", new=AsyncMock()), \
+         patch("claude_sdk_executor.set_current_task", new=AsyncMock()), \
+         patch("claude_agent_sdk.query", new=empty_query):
+        try:
+            await e.execute(ctx, eq)
+            assert _executor_mod.is_wedged() is True, \
+                "wedge must persist when the stream emitted no content"
+        finally:
+            _executor_mod._reset_sdk_wedge_for_test()