docs(blog): add audit chain verification explainer
HMAC-SHA256 immutable ledger architecture + PR #1339 panic fix. Companion to org-scoped API keys post. Enterprise/compliance audience. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
This commit is contained in:
parent
9ef87a4f1e
commit
e20ec33d33
129
docs/blog/2026-04-21-audit-chain-verification/index.md
Normal file
129
docs/blog/2026-04-21-audit-chain-verification/index.md
Normal file
@ -0,0 +1,129 @@
|
||||
---
|
||||
title: "How Molecule AI's Audit Ledger Works: HMAC Chains and the Fix That Made It Production-Ready"
|
||||
date: 2026-04-21
|
||||
slug: audit-chain-verification
|
||||
description: "Every agent decision logged, chained with HMAC-SHA256, and verified tamper-evident. Here's the architecture behind Molecule AI's audit trail — and the panic bug fix that shipped in PR #1339."
|
||||
tags: [security, audit, HMAC, enterprise, compliance]
|
||||
---
|
||||
|
||||
# How Molecule AI's Audit Ledger Works: HMAC Chains and the Fix That Made It Production-Ready
|
||||
|
||||
Every time an agent in your Molecule AI org does something — delegates a task, calls a tool, reads a secret, or makes an external API call — that event is written to an append-only audit log. That log is chained with HMAC-SHA256 so that any tampering with past entries is detectable, provable, and logged.
|
||||
|
||||
This post explains how that system works and what changed in PR #1339.
|
||||
|
||||
---
|
||||
|
||||
## The problem with plain audit logs
|
||||
|
||||
A standard audit log is a list of events with timestamps. It's useful for debugging, but it has a structural weakness: nothing stops someone with database access from editing past rows. A malicious actor — or a buggy cleanup script — can remove or modify entries, and the log looks perfectly fine.
|
||||
|
||||
For production multi-agent systems, that matters. Your compliance team needs to know: *did that agent actually call the API it was supposed to call, or did it skip the approval step?* A plain log can't answer that with confidence.
|
||||
|
||||
Molecule AI's audit ledger is built to answer that question.
|
||||
|
||||
---
|
||||
|
||||
## HMAC-SHA256 chain architecture
|
||||
|
||||
The audit ledger is an **append-only, chain-verified log**. Each entry contains:
|
||||
|
||||
- The event data (who did what, when, what the result was)
|
||||
- An HMAC-SHA256 of the current entry, signed with a server-side secret
|
||||
- The HMAC of the *previous* entry embedded as part of the signing context
|
||||
|
||||
This creates a chain — like a blockchain, but not distributed. Every entry's HMAC depends on the previous entry's HMAC, which depends on the one before that, and so on back to the genesis entry.
|
||||
|
||||
```
|
||||
Entry 0: HMAC₀ = HMAC(genesis_payload + genesis_secret)
|
||||
Entry 1: HMAC₁ = HMAC(event₁ + HMAC₀ + secret)
|
||||
Entry 2: HMAC₂ = HMAC(event₂ + HMAC₁ + secret)
|
||||
...
|
||||
```
|
||||
|
||||
If you change *any* past entry, its HMAC changes. That breaks the chain at the next verification step. The tampered entry is detectable.
|
||||
|
||||
---
|
||||
|
||||
## Verifying the chain
|
||||
|
||||
`verifyAuditChain` walks the log from the beginning, recomputing each HMAC and comparing it against the stored value. If every entry verifies, the chain is intact — no tampering.
|
||||
|
||||
If an entry fails to verify, the function returns `false`. Your observability stack picks this up and can alert, halt, or log the discrepancy. The audit trail isn't just a record of what happened — it's a proof that the record hasn't been altered.
|
||||
|
||||
This is what compliance auditors want: not a log, but a **tamper-evident log with cryptographic guarantees**.
|
||||
|
||||
---
|
||||
|
||||
## What org-scoped keys add
|
||||
|
||||
Org-scoped API keys are the attribution layer on top of the integrity layer.
|
||||
|
||||
Each org key carries a name, a hash, and a prefix. Every authenticated call carries that prefix in the audit row:
|
||||
|
||||
```
|
||||
org-token:mole_a1b2 POST /workspaces/ws_abc123/secrets 200 3ms
|
||||
```
|
||||
|
||||
Combined with the HMAC chain, you get two guarantees simultaneously:
|
||||
1. **Integrity** — the audit log hasn't been tampered with (HMAC chain)
|
||||
2. **Attribution** — you know exactly which named key (and therefore which integration) made each call (org API keys)
|
||||
|
||||
For teams running SOC 2 or ISO 27001, this is the difference between "here's a log" and "here's a cryptographically verifiable, attributable record of everything that happened."
|
||||
|
||||
---
|
||||
|
||||
## The bug PR #1339 fixed
|
||||
|
||||
In Go, slicing a string beyond its length causes a panic:
|
||||
|
||||
```go
|
||||
// This panics if len(ev.HMAC) < 12
|
||||
log.Printf("expected: %q got: %q", ev.HMAC[:12], expected[:12])
|
||||
```
|
||||
|
||||
`verifyAuditChain` was using `[:12]` to truncate HMACs for log readability — 12 characters is enough to identify a key without printing the full hash. But if an audit row had been corrupted (a database write failure, a migration bug, manual intervention), the stored HMAC could be shorter than 12 bytes. When that row was processed, the verification pass would panic and crash.
|
||||
|
||||
A tamper attempt wouldn't just fail verification — it would take down the verification process.
|
||||
|
||||
**The fix (PR #1339):** add a length check before truncation.
|
||||
|
||||
```go
|
||||
storedPrefix := ev.HMAC
|
||||
if len(storedPrefix) > 12 {
|
||||
storedPrefix = storedPrefix[:12]
|
||||
}
|
||||
computedPrefix := expected
|
||||
if len(computedPrefix) > 12 {
|
||||
computedPrefix = computedPrefix[:12]
|
||||
}
|
||||
log.Printf("expected: %q got: %q", storedPrefix, computedPrefix)
|
||||
```
|
||||
|
||||
The logic is unchanged — if the HMAC is long enough, the same 12-char prefix is logged. If it's short or missing, a shorter prefix (or none) is logged. Either way, the chain verification still runs, and mismatches still fail correctly.
|
||||
|
||||
The panic is gone. The integrity guarantee holds.
|
||||
|
||||
---
|
||||
|
||||
## What this means for production deployments
|
||||
|
||||
If you're running Molecule AI in a production environment:
|
||||
|
||||
- **The audit log is tamper-evident by construction.** You can verify the chain integrity programmatically at any point and alert on failures.
|
||||
- **Org-scoped keys give you per-integration attribution.** A compromised CI key is identifiable, revocable, and its entire call history is reconstructable.
|
||||
- **PR #1339 ensures the verification pass itself is hardened.** Corrupt rows — whether from a bug, a migration, or an attack — are handled gracefully, not catastrophically.
|
||||
|
||||
The combination of HMAC chain + org-scoped key attribution + immediate revocation is the foundation of Molecule AI's production trust model for enterprise teams.
|
||||
|
||||
---
|
||||
|
||||
## Next steps
|
||||
|
||||
- [Org-scoped API keys guide](/docs/guides/org-api-keys) — mint your first named key
|
||||
- [Architecture: Org API Keys](/docs/architecture/org-api-keys) — the full design
|
||||
- [Platform API Reference](/docs/api-reference) — audit log endpoints
|
||||
|
||||
---
|
||||
|
||||
*HMAC-SHA256 audit ledger shipped in PR #594. HMAC truncation guard shipped in PR #1339. Org-scoped API keys shipped in PRs #1105, #1107, #1109, #1110.*
|
||||
Loading…
Reference in New Issue
Block a user