From 2c8792d3e019811bfbbe63b858862f9ad9149c1d Mon Sep 17 00:00:00 2001 From: Hongming Wang Date: Tue, 28 Apr 2026 14:02:50 -0700 Subject: [PATCH] fix(ci): printf format-string sink + filename word-split in secret-scan MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Two latent bash bugs in the canonical secret-scan workflow caught during the post-merge review of molecule-controlplane #301 (a private consumer that inlined this workflow's logic and got both fixes there). Same bugs apply here; fixing in canonical means every public consumer (gh-identity, github-app-auth, the 8 workspace template repos) inherits the fix on their next workflow_call. Bug 1: `printf "$OFFENDING"` is a format-string sink. OFFENDING is built from filenames: `${f} (matched: ${pattern})\n`. When passed to printf as the first argument, `%` characters in a filename are interpreted as conversion specifiers — corrupting the error message or printing `%(missing)` artifacts. No filename in the current tree triggers it, but a future test fixture, build artifact, or contributor-supplied path could. Fix: `printf '%b' "$OFFENDING"` interprets the literal `\n` we appended without treating OFFENDING as a format string. Bug 2: `for f in $CHANGED` word-splits on whitespace. Filenames containing spaces would split into multiple tokens. The self-exclude check (`[ "$f" = "$SELF" ] && continue`) and the diff lookup would both operate on partial-path tokens. No filename in the current tree has whitespace, but the failure would be silent if one ever did. Fix: `while IFS= read -r f; do ... done <<< "$CHANGED"` reads whole lines as filenames. Added `[ -z "$f" ] && continue` to match the original `for` loop's implicit empty-input skip. Both fixes are mechanically straightforward (~16 lines net diff, mostly comments documenting the why). No behavior change for filenames in the current tree; strictly better for the edge cases. The same fixes already shipped in molecule-controlplane via #301 which inlined a copy of this workflow. The runtime's bundled pre-commit hook (molecule-ai-workspace-runtime: molecule_runtime/scripts/pre-commit-checks.sh) likely has the same bugs — flagged as a follow-up. Co-Authored-By: Claude Opus 4.7 (1M context) --- .github/workflows/secret-scan.yml | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-) diff --git a/.github/workflows/secret-scan.yml b/.github/workflows/secret-scan.yml index 2d1e557e..cebf89e9 100644 --- a/.github/workflows/secret-scan.yml +++ b/.github/workflows/secret-scan.yml @@ -148,7 +148,13 @@ jobs: SELF=".github/workflows/secret-scan.yml" OFFENDING="" - for f in $CHANGED; do + # `while IFS= read -r` (not `for f in $CHANGED`) so filenames + # containing whitespace don't word-split silently — a path + # with a space would otherwise produce two iterations on + # tokens that aren't real filenames, breaking the + # self-exclude + diff lookup. + while IFS= read -r f; do + [ -z "$f" ] && continue [ "$f" = "$SELF" ] && continue if [ -n "$DIFF_RANGE" ]; then ADDED=$(git diff --no-color --unified=0 "$BASE" "$HEAD" -- "$f" 2>/dev/null | grep -E '^\+[^+]' || true) @@ -164,11 +170,18 @@ jobs: break fi done - done + done <<< "$CHANGED" if [ -n "$OFFENDING" ]; then echo "::error::Credential-shaped strings detected in diff additions:" - printf "$OFFENDING" + # `printf '%b' "$OFFENDING"` interprets backslash escapes + # (the literal `\n` we appended above becomes a newline) + # WITHOUT treating OFFENDING as a format string. Plain + # `printf "$OFFENDING"` is a format-string sink: a filename + # containing `%` would be interpreted as a conversion + # specifier, corrupting the error message (or printing + # `%(missing)` artifacts). + printf '%b' "$OFFENDING" echo "" echo "The actual matched values are NOT echoed here, deliberately —" echo "round-tripping a leaked credential into CI logs widens the blast"