fix(workspace): sanitize trust-boundary markers in read_delegation_results (closes #361) #384

Closed
fullstack-engineer wants to merge 1 commit from fix/361-sanitize-delegation-results into staging

1 Commits

Author SHA1 Message Date
8da86ac0f7 fix(workspace): sanitize trust-boundary markers in read_delegation_results (closes #361)
Some checks failed
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 8s
sop-tier-check / tier-check (pull_request) Failing after 6s
audit-force-merge / audit (pull_request) Has been skipped
OFFSEC-003 follow-up: a malicious peer could inject fake [A2A_ERROR],
[SYSTEM], [A2A_QUEUED] or [IGNORE] blocks via the response_preview and
summary fields of delegation results. These markers reach the agent's
prompt unescaped, and the agent may interpret them as system commands.

Add _sanitize_a2a.py, a dedicated sanitizer that inserts a ZERO WIDTH SPACE
(U+200B) immediately after the opening bracket of each known marker (e.g.
"[A2A_ERROR]" → "[​A2A_ERROR]"). The sanitized text no longer matches
substring or regex checks for the raw marker, yet reads the same to a human.
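
A minimal sketch of the escaping described above, not the actual contents
of _sanitize_a2a.py. The marker list is taken from this commit message; the
regex, the constant names, and the function signature are assumptions made
for illustration.

```python
import re

ZWSP = "\u200b"  # ZERO WIDTH SPACE

# Marker names as listed in the commit message; the real list may differ.
_MARKERS = ("A2A_ERROR", "SYSTEM", "A2A_QUEUED", "IGNORE")

# Match "[" only when it is immediately followed by a known marker and "]".
_MARKER_RE = re.compile(r"\[(?=(?:" + "|".join(_MARKERS) + r")\])")


def sanitize_a2a_result(text: str) -> str:
    """Insert U+200B after the opening bracket of each known marker."""
    return _MARKER_RE.sub("[" + ZWSP, text)
```

Because the pattern requires the marker name to follow the bracket directly,
an already-escaped marker does not match again, so a second pass leaves the
text unchanged. Brackets around unknown names such as "[NOTE]" are untouched.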

Apply sanitize_a2a_result() to both the summary and response_preview fields
in read_delegation_results() before they are formatted into the agent's prompt.
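
A hedged sketch of how both fields might be sanitized before formatting.
format_delegation_result() and the dict-shaped record are hypothetical
stand-ins; the real read_delegation_results() signature and output format
are not shown in this commit message. The sanitizer is repeated in compact
form so the snippet runs on its own.

```python
import re

# Compact stand-in for the sanitizer; marker list from the commit message.
_MARKER_RE = re.compile(r"\[(?=(?:A2A_ERROR|SYSTEM|A2A_QUEUED|IGNORE)\])")


def sanitize_a2a_result(text: str) -> str:
    """Insert U+200B after the opening bracket of each known marker."""
    return _MARKER_RE.sub("[\u200b", text)


def format_delegation_result(record: dict) -> str:
    """Hypothetical formatter: sanitize both peer-controlled fields first."""
    summary = sanitize_a2a_result(str(record.get("summary") or ""))
    preview = sanitize_a2a_result(str(record.get("response_preview") or ""))
    return f"summary: {summary}\nresponse_preview: {preview}"
```

Sanitizing at the point where the fields are read, rather than where the
peer writes them, means the stored data stays unchanged and only the text
that reaches the prompt is escaped.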

Tests: 18 new cases in test_sanitize_a2a.py covering marker escaping,
idempotency, injection scenarios, and edge cases. 1 integration case in
test_executor_helpers.py.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 04:21:42 +00:00