From f9826bbade232a6e74f9dc8e2b985a3bb89c9da4 Mon Sep 17 00:00:00 2001 From: "molecule-ai[bot]" <276602405+molecule-ai[bot]@users.noreply.github.com> Date: Tue, 21 Apr 2026 03:01:07 +0000 Subject: [PATCH] docs(marketing): snapshot secret scrubber working demo (PR #977) (#63) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds working demo for snapshot_scrub.py — strip secrets from workspace memory snapshots before hibernation serialization. Co-authored-by: molecule-ai[bot] <276602405+molecule-ai[bot]@users.noreply.github.com> Co-authored-by: Claude Sonnet 4.6 --- marketing/demos/snapshot-scrub/README.md | 74 ++++++++++++++++++++++++ 1 file changed, 74 insertions(+) create mode 100644 marketing/demos/snapshot-scrub/README.md diff --git a/marketing/demos/snapshot-scrub/README.md b/marketing/demos/snapshot-scrub/README.md new file mode 100644 index 0000000..5f3e052 --- /dev/null +++ b/marketing/demos/snapshot-scrub/README.md @@ -0,0 +1,74 @@ +# Snapshot Secret Scrubber — Working Demo + +> **PR:** #977 — `feat(workspace): snapshot secret scrubber (closes #823)` +> **Source module:** `workspace/lib/snapshot_scrub.py` +> **What it does:** Strips API keys, auth tokens, and arbitrary subprocess output from workspace memory snapshots before they are serialized for hibernation + +--- + +## What This Demo Shows + +Before a workspace serializes its memory for hibernation, every memory entry passes through the scrubber. API keys, bearer tokens, env-var assignments, and high-entropy base64 blobs are redacted. Sandbox-sourced entries (arbitrary subprocess output from `run_code`) are dropped entirely. + +This prevents an attacker who obtains a snapshot blob from recovering credentials or secrets that were processed during the agent session. + +--- + +## Runnable Snippet + +```python +from workspace.lib.snapshot_scrub import ( + scrub_memory_entry, + scrub_snapshot, + scrub_content, + is_sandbox_content, +) + +# 1. Scrub a single memory entry +entry = { + "id": "mem-001", + "source": "agent", + "content": "ANTHROPIC_API_KEY=sk-ant-xxxx configured for claude-3-5" +} +cleaned = scrub_memory_entry(entry) +print(cleaned["content"]) +# Output: ANTHROPIC_API_KEY=API_KEY [redacted] + +# 2. Scrub an entire snapshot before serialization +snapshot = { + "workspace_id": "ws-abc", + "memories": [ + {"id": "mem-002", "source": "agent", "content": "GitHub token: ghp_AbCdEfGhIjKlMnOpQrStUvW"}, + {"id": "mem-003", "source": "sandbox", "content": "source=sandbox: echo $SECRET"}, + {"id": "mem-004", "source": "agent", "content": "Bearer 0xdeadbeef used for /api endpoint"}, + ] +} +scrubbed = scrub_snapshot(snapshot) +print(scrubbed["memories"]) +# Output: 2 entries (sandbox entry dropped entirely, other two scrubbed) + +# 3. Just the scrub function directly +redacted = scrub_content("OPENAI_API_KEY=sk-proj-1234567890abcdef") +print(redacted) +# Output: OPENAI_API_KEY=SK_TOKEN [redacted] +``` + +--- + +## Context + +**Why it matters:** Memory snapshots are the workspace's serialized state — saved to disk or transmitted for cross-workspace delegation. If the workspace processed an API key during its session, that key must not survive in the snapshot. `snapshot_scrub.py` is the gatekeeper. + +**What gets scrubbed:** API key patterns (`sk-ant-`, `sk-proj-`, `ghp_`, `ghs_`, `AKIA…`, `cfut_`, `mol_pk_`, `ctx7_`), bearer token header values, env-var assignments, and high-entropy base64 blobs (33+ chars). Sandbox-sourced entries (`source=sandbox`, `tool=run_code`, `[sandbox_output]`) are dropped wholesale — they can contain arbitrary subprocess output that can't be safely pattern-matched. + +**What doesn't change:** Workspace metadata, config fields, and non-secret memory entries pass through unchanged. The scrubber is a pure function — it's unit-tested independently and has no side effects. + +--- + +## Code Reference + +| File | What it does | +|---|---| +| `workspace/lib/snapshot_scrub.py` | Core module: `scrub_content()`, `scrub_memory_entry()`, `scrub_snapshot()`, `is_sandbox_content()` | +| `workspace/tests/test_snapshot_scrub.py` | 21 unit tests covering all pattern classes | +| `workspace/lib/__init__.py` | Module exports |