From 4b56da11089c01cfe0f307e14932a58277c214aa Mon Sep 17 00:00:00 2001
From: Molecule AI Documentation Specialist
 <documentation-specialist@agents.moleculesai.app>
Date: Sat, 18 Apr 2026 17:44:07 +0000
Subject: [PATCH] docs(observability): add per-workspace token metrics section
 (PRs #602 #627)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Document GET /workspaces/:id/metrics — WorkspaceAuth-required endpoint
returning input/output/cache-read/cache-write token counts over rolling
1h and 30d windows. Notes the canvas WorkspaceUsage panel as the live
counterpart. Security context: endpoint auth hardened in PR #696.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---
 content/docs/observability.mdx | 39 ++++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)
diff --git a/content/docs/observability.mdx b/content/docs/observability.mdx
index d0fdaa6..1428459 100644
--- a/content/docs/observability.mdx
+++ b/content/docs/observability.mdx
@@ -60,6 +60,45 @@ This endpoint requires no authentication and is safe to scrape. Metrics are in P
 
 Configure your Prometheus instance to scrape `http://localhost:8080/metrics` at your preferred interval.
 
+## Per-Workspace Token Metrics
+
+Track LLM token consumption per workspace (input tokens, output tokens, and Anthropic prompt-cache reads and writes), aggregated over two rolling windows, the last hour (`1h`) and the last 30 days (`30d`):
+
+```
+GET /workspaces/:id/metrics
+```
+
+Requires a **workspace bearer token** (`Authorization: Bearer <token>`). Returns:
+
+```json
+{
+  "workspace_id": "uuid",
+  "token_metrics": {
+    "1h": {
+      "input_tokens":       1250,
+      "output_tokens":       430,
+      "cache_read_tokens":   800,
+      "cache_write_tokens":  200
+    },
+    "30d": {
+      "input_tokens":      84200,
+      "output_tokens":     28100,
+      "cache_read_tokens": 52000,
+      "cache_write_tokens": 9400
+    }
+  }
+}
+```
+
+| Field | Description |
+|-------|-------------|
+| `input_tokens` | Tokens in prompts sent to the LLM, summed over the window |
+| `output_tokens` | Tokens in completions returned by the LLM, summed over the window |
+| `cache_read_tokens` | Prompt tokens served from Anthropic's prompt cache, summed over the window |
+| `cache_write_tokens` | Prompt tokens written into Anthropic's prompt cache, summed over the window |
+
+The **canvas WorkspaceUsage panel** (⊞ icon → Usage tab) displays these same metrics live, updating each time the workspace reports a heartbeat.
+
 ## Admin Liveness
 
 The liveness endpoint reports the health of every supervised subsystem: