docs(blog): add audit chain verification explainer

HMAC-SHA256 immutable ledger architecture + PR #1339 panic fix. Companion to org-scoped API keys post. Enterprise/compliance audience. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-21 11:08:01 +00:00 · 2026-04-21 11:08:01 +00:00 · e20ec33d33
commit e20ec33d33
parent 9ef87a4f1e
1 changed files with 129 additions and 0 deletions
--- a/docs/blog/2026-04-21-audit-chain-verification/index.md
+++ b/docs/blog/2026-04-21-audit-chain-verification/index.md
@ -0,0 +1,129 @@
+---
+title: "How Molecule AI's Audit Ledger Works: HMAC Chains and the Fix That Made It Production-Ready"
+date: 2026-04-21
+slug: audit-chain-verification
+description: "Every agent decision logged, chained with HMAC-SHA256, and verified tamper-evident. Here's the architecture behind Molecule AI's audit trail — and the panic bug fix that shipped in PR #1339."
+tags: [security, audit, HMAC, enterprise, compliance]
+---
+
+# How Molecule AI's Audit Ledger Works: HMAC Chains and the Fix That Made It Production-Ready
+
+Every time an agent in your Molecule AI org does something — delegates a task, calls a tool, reads a secret, or makes an external API call — that event is written to an append-only audit log. That log is chained with HMAC-SHA256 so that any tampering with past entries is detectable, provable, and logged.
+
+This post explains how that system works and what changed in PR #1339.
+
+---
+
+## The problem with plain audit logs
+
+A standard audit log is a list of events with timestamps. It's useful for debugging, but it has a structural weakness: nothing stops someone with database access from editing past rows. A malicious actor — or a buggy cleanup script — can remove or modify entries, and the log looks perfectly fine.
+
+For production multi-agent systems, that matters. Your compliance team needs to know: *did that agent actually call the API it was supposed to call, or did it skip the approval step?* A plain log can't answer that with confidence.
+
+Molecule AI's audit ledger is built to answer that question.
+
+---
+
+## HMAC-SHA256 chain architecture
+
+The audit ledger is an **append-only, chain-verified log**. Each entry contains:
+
+- The event data (who did what, when, what the result was)
+- An HMAC-SHA256 of the current entry, signed with a server-side secret
+- The HMAC of the *previous* entry embedded as part of the signing context
+
+This creates a chain — like a blockchain, but not distributed. Every entry's HMAC depends on the previous entry's HMAC, which depends on the one before that, and so on back to the genesis entry.
+
+```
+Entry 0: HMAC₀ = HMAC(genesis_payload + genesis_secret)
+Entry 1: HMAC₁ = HMAC(event₁ + HMAC₀ + secret)
+Entry 2: HMAC₂ = HMAC(event₂ + HMAC₁ + secret)
+...
+```
+
+If you change *any* past entry, its HMAC changes. That breaks the chain at the next verification step. The tampered entry is detectable.
+
+---
+
+## Verifying the chain
+
+`verifyAuditChain` walks the log from the beginning, recomputing each HMAC and comparing it against the stored value. If every entry verifies, the chain is intact — no tampering.
+
+If an entry fails to verify, the function returns `false`. Your observability stack picks this up and can alert, halt, or log the discrepancy. The audit trail isn't just a record of what happened — it's a proof that the record hasn't been altered.
+
+This is what compliance auditors want: not a log, but a **tamper-evident log with cryptographic guarantees**.
+
+---
+
+## What org-scoped keys add
+
+Org-scoped API keys are the attribution layer on top of the integrity layer.
+
+Each org key carries a name, a hash, and a prefix. Every authenticated call carries that prefix in the audit row:
+
+```
+org-token:mole_a1b2 POST /workspaces/ws_abc123/secrets 200 3ms
+```
+
+Combined with the HMAC chain, you get two guarantees simultaneously:
+1. **Integrity** — the audit log hasn't been tampered with (HMAC chain)
+2. **Attribution** — you know exactly which named key (and therefore which integration) made each call (org API keys)
+
+For teams running SOC 2 or ISO 27001, this is the difference between "here's a log" and "here's a cryptographically verifiable, attributable record of everything that happened."
+
+---
+
+## The bug PR #1339 fixed
+
+In Go, slicing a string beyond its length causes a panic:
+
+```go
+// This panics if len(ev.HMAC) < 12
+log.Printf("expected: %q  got: %q", ev.HMAC[:12], expected[:12])
+```
+
+`verifyAuditChain` was using `[:12]` to truncate HMACs for log readability — 12 characters is enough to identify a key without printing the full hash. But if an audit row had been corrupted (a database write failure, a migration bug, manual intervention), the stored HMAC could be shorter than 12 bytes. When that row was processed, the verification pass would panic and crash.
+
+A tamper attempt wouldn't just fail verification — it would take down the verification process.
+
+**The fix (PR #1339):** add a length check before truncation.
+
+```go
+storedPrefix := ev.HMAC
+if len(storedPrefix) > 12 {
+    storedPrefix = storedPrefix[:12]
+}
+computedPrefix := expected
+if len(computedPrefix) > 12 {
+    computedPrefix = computedPrefix[:12]
+}
+log.Printf("expected: %q  got: %q", storedPrefix, computedPrefix)
+```
+
+The logic is unchanged — if the HMAC is long enough, the same 12-char prefix is logged. If it's short or missing, a shorter prefix (or none) is logged. Either way, the chain verification still runs, and mismatches still fail correctly.
+
+The panic is gone. The integrity guarantee holds.
+
+---
+
+## What this means for production deployments
+
+If you're running Molecule AI in a production environment:
+
+- **The audit log is tamper-evident by construction.** You can verify the chain integrity programmatically at any point and alert on failures.
+- **Org-scoped keys give you per-integration attribution.** A compromised CI key is identifiable, revocable, and its entire call history is reconstructable.
+- **PR #1339 ensures the verification pass itself is hardened.** Corrupt rows — whether from a bug, a migration, or an attack — are handled gracefully, not catastrophically.
+
+The combination of HMAC chain + org-scoped key attribution + immediate revocation is the foundation of Molecule AI's production trust model for enterprise teams.
+
+---
+
+## Next steps
+
+- [Org-scoped API keys guide](/docs/guides/org-api-keys) — mint your first named key
+- [Architecture: Org API Keys](/docs/architecture/org-api-keys) — the full design
+- [Platform API Reference](/docs/api-reference) — audit log endpoints
+
+---
+
+*HMAC-SHA256 audit ledger shipped in PR #594. HMAC truncation guard shipped in PR #1339. Org-scoped API keys shipped in PRs #1105, #1107, #1109, #1110.*