fix(executor): sanitize peer delegation content in read_delegation_results (OFFSEC-003) #13

Merged
infra-runtime-be merged 1 commit from runtime/offsec-003-delegation-only into main 2026-05-11 03:41:19 +00:00

1 Commit

Author SHA1 Message Date
ac8108a1a7 fix(executor): sanitize peer delegation content in read_delegation_results (OFFSEC-003)
Some checks failed
ci / mirror-guard (pull_request) Failing after 9s
Secret scan / Scan diff for credential-shaped strings (pull_request) Successful in 9s
Peer-supplied `summary` and `response_preview` fields written to
DELEGATION_RESULTS_FILE by the heartbeat loop were injected into the
agent prompt without sanitization, a direct OFFSEC-003 injection path.

New `_detect_injection_safe()` helper wraps
`builtin_tools.compliance.detect_prompt_injection()` with lazy import
and fail-open behaviour. When injection patterns are detected in either
`summary` or `response_preview`, the field is replaced with "" before
formatting. The delegation metadata (status, task line) is preserved so
the agent still knows a delegation completed; only the malicious content
is stripped.
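
As a rough illustration of the per-field replacement described above, here is a minimal sketch. It is not the code from this commit: the record layout, `format_delegation_result`, `toy_detector`, and `PREVIEW_LIMIT` are all hypothetical names chosen for the example, and the detector is passed in as a plain callable so the snippet runs on its own.

```python
from typing import Callable, Dict

# Hypothetical truncation limit; the commit does not state the real value.
PREVIEW_LIMIT = 200


def format_delegation_result(
    record: Dict[str, str],
    is_injection: Callable[[str], bool],
) -> str:
    """Format one delegation record, blanking any field flagged as injection."""
    cleaned = {}
    for field in ("summary", "response_preview"):
        text = record.get(field, "")
        # Only the flagged field is replaced; status and task stay intact.
        cleaned[field] = "" if is_injection(text) else text

    # Clean previews are still truncated before they reach the prompt.
    preview = cleaned["response_preview"][:PREVIEW_LIMIT]
    return (
        f"[{record.get('status', 'unknown')}] {record.get('task', '')}\n"
        f"summary: {cleaned['summary']}\n"
        f"preview: {preview}"
    )


def toy_detector(text: str) -> bool:
    # Stand-in for the real detector; it matches one obvious phrase only.
    return "ignore previous instructions" in text.lower()
```

With this shape, a record whose `summary` trips the detector still yields its status and task line, so the agent can see that the delegation completed.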

Fail-open: if builtin_tools.compliance is unavailable (e.g. minimal
test environment), the function logs a warning and passes text through.
This is acceptable because builtin_tools is always present in production
containers; the fail-open only affects degenerate test environments.
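
The lazy-import, fail-open pattern described above might look roughly like the sketch below. This is an assumption-laden illustration rather than the helper from this commit: the name `detect_injection_safe`, the `module_name` parameter, and the assumption that `detect_prompt_injection()` returns a truthy value on detection are all made up for the example.

```python
import importlib
import logging

logger = logging.getLogger(__name__)


def detect_injection_safe(
    text: str,
    module_name: str = "builtin_tools.compliance",
) -> bool:
    """Return True when injection is detected; fail open if the detector is missing."""
    try:
        # Lazy import: resolved at call time rather than at module load,
        # so environments without the module can still import this file.
        compliance = importlib.import_module(module_name)
    except ImportError:
        # Fail open: warn and treat the text as clean.
        logger.warning("%s unavailable; passing text through unchecked", module_name)
        return False
    # Assumption: the detector returns a truthy value when it finds a pattern.
    return bool(compliance.detect_prompt_injection(text))
```

Failing open trades safety for availability: a missing detector never blocks the agent, but it also means unchecked text passes through, which is why the warning log matters.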

Six new tests cover: clean pass-through, injection in summary,
injection in preview, truncation of clean preview, no-file path, and
fail-open when compliance is unavailable.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-05-11 03:38:14 +00:00