fix(workspace): credential helper security hardening (#1797)

Four findings from security audit (internal/security/credential-token-backlog.md):

1. STDERR LEAK — molecule-git-token-helper.sh:146,153 logged ${response}
   on platform errors. The response body MAY contain the token in some
   failure modes (alternate JSON key shape on partial success). Now:
   - capture curl's stderr to a tmp file (not $response) so we can log
     the curl error message without ever interpolating the response body
   - on empty-token branch, log only response size (bytes) for debug
2. CHMOD 600 — already in place at lines 116, 124, 223 (verified, no change)
3. RESPAWN SUPERVISION — entrypoint.sh wrapped daemon launch in a
   while-true bash loop with 30s back-off. Without this, a daemon crash
   silently leaves the workspace stuck on an expired token until the
   container restarts. Logs to /home/agent/.gh-token-refresh.log
   (agent-writable; /var/log is root-owned).
4. JITTER — molecule-gh-token-refresh.sh: added 0..120s random offset to
   each sleep so 39 containers don't synchronize their refresh requests
   against the platform endpoint.

Also:
- Daemon now sends helper output to /dev/null instead of merging stderr,
  belt-and-suspenders against any future helper change that might write
  the token to stdout.
- Daemon log lines include rc=$? on failure for actionable triage.

Inherent risks (org-wide token blast, prompt-injection theft, bearer
in volume, no audit log) tracked in internal/security/credential-token-backlog.md
as separate roadmap items.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com>
This commit is contained in:
Hongming Wang 2026-04-23 11:14:55 -07:00 committed by GitHub
parent cfdaefe5bc
commit 925a71887d
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
3 changed files with 55 additions and 30 deletions

View File

@ -71,11 +71,20 @@ fi
# Now running as agent (uid 1000)
# --- Start background token refresh daemon ---
# --- Start background token refresh daemon (with respawn supervision) ---
# Keeps gh CLI and git credentials fresh across the 60-min token TTL.
# Runs in the background; entrypoint continues to exec molecule-runtime.
# Wrapped in a respawn loop so a daemon crash doesn't silently leave the
# workspace stuck on an expired token. Runs in the background; entrypoint
# continues to exec molecule-runtime.
if [ -x /app/scripts/molecule-gh-token-refresh.sh ]; then
nohup /app/scripts/molecule-gh-token-refresh.sh > /dev/null 2>&1 &
nohup bash -c '
while true; do
/app/scripts/molecule-gh-token-refresh.sh
rc=$?
echo "[molecule-gh-token-refresh] daemon exited rc=$rc — respawning in 30s" >&2
sleep 30
done
' > /home/agent/.gh-token-refresh.log 2>&1 &
fi
# --- Initial gh auth setup ---

View File

@ -2,53 +2,56 @@
# molecule-gh-token-refresh.sh — background daemon that keeps GitHub
# credentials fresh inside Molecule AI workspace containers.
#
# Runs as a background process started by entrypoint.sh. Every
# REFRESH_INTERVAL_SEC (default 45 min = 2700s) it calls the credential
# helper's _refresh_gh action which:
# 1. Fetches a fresh installation token from the platform API
# 2. Updates the local cache (used by git credential helper)
# 3. Runs `gh auth login --with-token` so `gh` CLI stays authenticated
# 4. Writes ~/.gh_token for any scripts that read it
# Started by entrypoint.sh under a respawn wrapper. Every
# REFRESH_INTERVAL_SEC + jitter (default 45 min ± 2 min) it calls the
# credential helper's _refresh_gh action.
#
# The daemon logs to stderr (captured by Docker) and is designed to be
# fire-and-forget — if a single refresh fails, it logs the error and
# retries on the next interval. The credential helper itself has a
# fallback chain (cache > API > env var) so a missed refresh is not
# immediately fatal.
# # Jitter
# A 0..120s random offset prevents 39 containers from synchronizing
# their refresh requests against /workspaces/:id/github-installation-token.
#
# Usage (from entrypoint.sh):
# nohup /app/scripts/molecule-gh-token-refresh.sh &
# # Security
# - This daemon NEVER prints token values. Failures log the helper's
# exit code only, not its stderr, so token bytes can't leak via the
# docker log pipeline.
# - The helper script is responsible for chmod 600 on cache files.
#
set -uo pipefail
HELPER_SCRIPT="/app/scripts/molecule-git-token-helper.sh"
HELPER_SCRIPT="${TOKEN_HELPER_SCRIPT:-/app/scripts/molecule-git-token-helper.sh}"
REFRESH_INTERVAL_SEC="${TOKEN_REFRESH_INTERVAL_SEC:-2700}" # 45 min
JITTER_MAX_SEC="${TOKEN_REFRESH_JITTER_SEC:-120}"
INITIAL_DELAY_SEC="${TOKEN_REFRESH_INITIAL_DELAY_SEC:-60}"
log() {
echo "[molecule-gh-token-refresh] $(date -u '+%Y-%m-%dT%H:%M:%SZ') $*" >&2
}
# Wait a short time before the first refresh to let the container finish
# booting and .auth_token to be written by the runtime's register call.
INITIAL_DELAY_SEC="${TOKEN_REFRESH_INITIAL_DELAY_SEC:-60}"
log "starting (interval=${REFRESH_INTERVAL_SEC}s, initial_delay=${INITIAL_DELAY_SEC}s)"
jittered_sleep() {
local base="$1"
local jitter=$((RANDOM % (JITTER_MAX_SEC + 1)))
sleep $((base + jitter))
}
log "starting (interval=${REFRESH_INTERVAL_SEC}s ± ${JITTER_MAX_SEC}s, initial_delay=${INITIAL_DELAY_SEC}s)"
sleep "${INITIAL_DELAY_SEC}"
# Initial refresh — prime the cache + gh auth immediately after boot.
# Discard helper output to /dev/null so token can't leak via docker logs.
log "initial token refresh"
if bash "${HELPER_SCRIPT}" _refresh_gh 2>&1; then
if bash "${HELPER_SCRIPT}" _refresh_gh >/dev/null 2>&1; then
log "initial refresh succeeded"
else
log "initial refresh failed (will retry in ${REFRESH_INTERVAL_SEC}s)"
log "initial refresh failed (rc=$?) — will retry in ~${REFRESH_INTERVAL_SEC}s"
fi
# Steady-state loop.
while true; do
sleep "${REFRESH_INTERVAL_SEC}"
jittered_sleep "${REFRESH_INTERVAL_SEC}"
log "periodic token refresh"
if bash "${HELPER_SCRIPT}" _refresh_gh 2>&1; then
if bash "${HELPER_SCRIPT}" _refresh_gh >/dev/null 2>&1; then
log "refresh succeeded"
else
log "refresh failed (will retry in ${REFRESH_INTERVAL_SEC}s)"
log "refresh failed (rc=$?) — will retry in ~${REFRESH_INTERVAL_SEC}s"
fi
done

View File

@ -138,19 +138,32 @@ _fetch_token_from_api() {
return 1
fi
# NOTE: capture stderr to a tmp file (NOT $response) so the response
# body — which contains the token on success — never lands in error
# log lines via $response interpolation.
local _err_file
_err_file=$(mktemp)
response=$(curl -sf \
-H "Authorization: Bearer ${bearer}" \
-H "Accept: application/json" \
--max-time 10 \
"${ENDPOINT}" 2>&1) || {
echo "[molecule-git-token-helper] platform request failed: ${response}" >&2
"${ENDPOINT}" 2>"${_err_file}") || {
local _curl_rc=$?
local _err_msg
_err_msg=$(cat "${_err_file}")
rm -f "${_err_file}"
echo "[molecule-git-token-helper] platform request failed (curl rc=${_curl_rc}): ${_err_msg}" >&2
return 1
}
rm -f "${_err_file}"
# Parse {"token":"ghs_...","expires_at":"..."} with sed (no jq dependency).
token=$(echo "${response}" | sed -n 's/.*"token":"\([^"]*\)".*/\1/p')
if [ -z "${token}" ]; then
echo "[molecule-git-token-helper] empty token in platform response: ${response}" >&2
# SECURITY: the response body MAY contain a token under a different
# JSON key name. Never include $response in this error message —
# log only the size as a coarse debugging signal.
echo "[molecule-git-token-helper] empty token in platform response (body=${#response} bytes)" >&2
return 1
fi