audit: phase 1 structured audit-log — emit pkg + secrets wire-in #1572
Reference in New Issue
Block a user
Delete Branch "infra-sre/audit-log-phase1-emit-secrets"
Deleting a branch is permanent. Although the deleted branch may continue to exist for a short time before it actually gets removed, it CANNOT be undone in most cases. Continue?
Phase 1 of the structured audit-log to Loki (see internal RFC
rfcs/audit-log-to-loki.md).What
New
workspace-server/internal/auditpackage:Emit(ctx, event_type, fields)— single entrypoint, never errors, never blocks.HashValuePrefix(v, n)— sha256 prefix for value identification without exposure.WithUserID/WithActorKind/WithCorrelationID/WithWorkspaceID.audit:-prefixed stdout (ships via existing tenant Vector docker-logs source → Loki, no obs-stack change), (2) best-effort append to/var/log/molecule-audit.jsonl(durable forensic copy, Phase-2 file-source target).Wired into
internal/handlers/secrets.go:Set(PUT/POST/workspaces/:id/secrets) →secret.setevent, scope=workspace.Delete(DELETE/workspaces/:id/secrets/:key) →secret.deleteevent, scope=workspace (only on rows>0).SetGlobal→secret.setevent, scope=global, actor=admin.DeleteGlobal→secret.deleteevent, scope=global, actor=admin.Why these handlers first
Secret rotation is the highest-security-value handler on the surface — "who rotated
STRIPE_LIVE_KEYat 03:00Z" is unanswerable today.Schema (v1, stable)
Cardinality
Fleet active streams ≈ tenants(~10) × event_types(~20) = ~200.
workspace_id/user_id/correlation_idstay INSIDE the JSON body (queryable via| json), not labels.Security contract
TestEmit_NeverIncludesSecretValues_OnlyHashasserts raw values never reach the audit line. Reviewers checking new event_types must verify this.Tests
TestEmit_WritesAuditPrefixedLineToStdoutTestEmit_AppendsToJSONLFileTestEmit_DefaultsActorToUserWhenUnsetTestEmit_FieldsWorkspaceIDOverridesContextTestEmit_NeverIncludesSecretValues_OnlyHashTestEmit_FileAppendFailureDoesNotBlockStdoutTestEmit_Concurrent_NoInterleavedLinesTestHashValuePrefix_StableAndBoundedgo vetclean. Existing secret-handler tests pass unchanged.Reviewers
docker_logssource capturesmolecule-tenantstdout — no obs-stack PR needed for Phase 1)Phase 1 only — DO NOT MERGE
Orchestrator 30-min cap. Surfacing for genuine non-author review per CTO
feedback_never_admin_merge_bypass.Add internal/audit with single Emit(ctx, event_type, fields) entrypoint that ships JSON-encoded records via two transports: 1. audit:-prefixed stdout line — tenant Vector docker-logs source already ships this to Loki. No obs-stack change required. 2. Best-effort append to /var/log/molecule-audit.jsonl — durable forensic copy, target for the dedicated Vector file source in Phase 2. Schema is stable v1 (ts, event_type, workspace_id, user_id, actor_kind, correlation_id, fields). Cardinality budget keeps workspace_id + user_id + correlation_id OUT of Loki labels (JSON body only) — fleet active-stream count ~200, well within Loki headroom. Phase 1 wires secret.set and secret.delete on the workspace-scoped (POST/PUT/DELETE /workspaces/:id/secrets) and admin-scoped (POST/DELETE /admin/secrets, /settings/secrets) handlers. value_hash is the first 8 hex chars of sha256(value) — never the raw value. Tests cover: stdout emit, JSONL append, file-failure fallback, concurrent integrity, hash bounds, raw-value-never-emitted contract. Vet + handler-secret tests pass. See: rfc internal/rfcs/audit-log-to-loki.md/qa-recheck
/security-recheck
5-axis (core-security):
Secrets: This is the entire POINT of the PR — structured audit emit MUST NOT log raw secret values. Verified:
secrets.goSet/SetGlobal callaudit.HashValuePrefix(body.Value, 8)→ sha256 prefix only.key(the name) but not any value.TestEmit_NeverIncludesSecretValues_OnlyHashis a contract test guarding against regression.log.Printf("audit: %s", payload)— no separate transport that bypasses the hash.Privilege: New
auditpackage: no new I/O endpoints, no DB writes, no auth-related changes. Best-effort emission (never errors out of request path) — emit cannot become a DoS vector if Loki/filesystem are unreachable. mode 0o640 on the JSONL file.Input handling:
Fieldsmap carries event-specific payload; the per-event fields go viajson.Marshal(no string interpolation).marshal_errfallback emits a degraded marker so the event boundary stays visible even on malformed input. No reflection-via-untrusted-data.Tenant isolation: All four wired sites (Set / Delete / SetGlobal / DeleteGlobal) emit workspace_id (workspace scope) or admin actor_kind (global scope). Loki labels stay LOW-cardinality (event_type top-20); workspace_id + user_id are inside JSON body, NOT labels (avoids the 100k-stream chunk limit class incident). Aligned with
reference_obs_system_accesslabel posture.Supply-chain: New imports:
crypto/sha256,encoding/hex(stdlib) and the package itself. No third-party additions, no go.mod changes (verified — no go.mod diff in PR).Staticcheck SA4000 fixup landed at
b5b95devia core-devops (test/var rebinding, behavior-preserving).APPROVE.
5-axis (core-qa, test/verification lens):
Test coverage: 8 tests in emit_test.go covering:
Schema stability: Wire fields documented as a stable contract ("extend by appending; never rename"). RFC3339Nano UTC timestamps. record struct keeps field order stable.
Failure-mode tests:
TestEmit_FileAppendFailureDoesNotBlockStdoutpoints at/proc/this/is/not/writableand asserts stdout still fires. This is critical because the package documents "MUST NOT fail the user request".Concurrency:
TestEmit_Concurrent_NoInterleavedLines(50-goroutine fan-out) validates thefileMulock against half-line interleave. Asserts every line is valid JSON post-write — direct test of the JSONL invariant.Staticcheck SA4000: Original
HashValuePrefix("x", 8) != HashValuePrefix("x", 8)was flagged tautological. Fixup commitb5b95de1binds both calls to vars (a,b) before comparing — test intent (call-stability) preserved, lint silenced. CI/Platform (Go) green atb5b95de.APPROVE.