docs(workspace-runtime): correct smoke-gate caveat factual errors

Two errors in the merged caveat (#107):

1. Claimed the stub RequestContext "carries an empty user message". It actually carries "smoke test" text (smoke_mode.py:76 calls `new_text_message("smoke test")`, with the explicit comment that it's "enough that extract_message_text(context) returns non-empty input"). Adapter authors gating smoke-mode behavior on extract_message_text(ctx) == "" would end up with a branch that never fires.

2. Described only the timeout-pass path. The harness also returns 0 on ANY non-import exception (smoke_mode.py:135-143): the broad `except Exception` block treats RuntimeError, auth errors, validation errors etc. as "downstream of the import gate" and exits clean. Spelling out all three pass cases (clean return, timeout, non-import exception) is the honest description.

Caught while re-reading smoke_mode.py to verify claims for a review pass: I had asserted both behaviors from memory without checking, exactly the failure mode my e2e-test memory just got a worked-example update about.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-01 00:00:02 -07:00 · 2026-05-01 00:00:02 -07:00 · 28600d7956
commit 28600d7956
parent 36ab08129c
1 changed files with 7 additions and 1 deletions
--- a/content/docs/agent-runtime/workspace-runtime.md
+++ b/content/docs/agent-runtime/workspace-runtime.md
@@ -117,7 +117,13 @@ fi

 A green gate means **"imports are healthy enough that `executor.execute()` reaches its body"** — that's the regression class the gate exists to catch (lazy `from x import y` inside an `if`-branch, or `importlib.import_module()` on a path that breaks after a wheel bump).

-It does **not** prove that `execute()` produces the right output for real input. Adapters that make real I/O calls inside `execute()` (subprocess to a gateway, httpx call to an upstream LLM) will time out under the harness's default 5s window, and the gate treats a clean timeout as success. The stub `RequestContext` carries an empty user message and the harness never inspects what `execute()` writes back.
+It does **not** prove that `execute()` produces the right output for real input. The harness reports PASS in three distinct cases:
+
+1. **Clean return** — execute() ran to completion within the timeout.
+2. **Timeout** — execute() was still running when the timer fired (typical for adapters that do real I/O inside execute(): subprocess to a gateway, httpx call to an upstream LLM).
+3. **Any non-import exception** — execute() raised `RuntimeError`, auth errors, validation errors, etc. The harness only fails on `ImportError`/`ModuleNotFoundError`.
+
+The stub `RequestContext` carries a non-empty `"smoke test"` text message (so adapters relying on `extract_message_text(ctx)` returning input still work), and the harness never drains the `EventQueue` — what `execute()` writes back is ignored.

 If you need correctness coverage, write a separate integration test that runs the workspace against real or mocked infrastructure — the smoke gate is a strict subset.
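
Review note: the three PASS cases in the new caveat can be illustrated with a small stand-in for the harness. This is only a sketch under stated assumptions, not the code in smoke_mode.py. The names `run_smoke`, `PASS`, `FAIL`, and the four sample coroutines are hypothetical, and the use of `asyncio.wait_for` for the timeout is an assumption about how such a gate might be built. Only the 5s default and the pass/fail rule come from the text above.

```python
import asyncio

# Hypothetical exit codes for this sketch: 0 = gate passes, 1 = gate fails.
PASS, FAIL = 0, 1


async def run_smoke(execute, timeout: float = 5.0) -> int:
    """Apply the pass/fail rule from the caveat to one execute() call."""
    try:
        await asyncio.wait_for(execute(), timeout=timeout)
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so both land here.
        return FAIL
    except asyncio.TimeoutError:
        # Case 2: execute() was still running when the timer fired.
        return PASS
    except Exception:
        # Case 3: any other exception counts as downstream of the import gate.
        return PASS
    # Case 1: execute() returned cleanly within the timeout.
    return PASS


# Hypothetical stand-ins for an adapter's execute() method.
async def clean_return():
    return None


async def slow_io():
    await asyncio.sleep(10)  # simulates real I/O that outlives the timeout


async def auth_failure():
    raise RuntimeError("auth failed")


async def lazy_import():
    import module_that_does_not_exist_for_this_sketch  # noqa: F401


if __name__ == "__main__":
    for fn in (clean_return, slow_io, auth_failure, lazy_import):
        code = asyncio.run(run_smoke(fn, timeout=0.05))
        print(f"{fn.__name__}: exit {code}")
```

Under these assumptions, only `lazy_import` produces a non-zero exit code. `auth_failure` passes even though it never did useful work, which is the reason the caveat lists the third case explicitly.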