From 3d65f226dc76c856ca05c21af8ea50454fb3f716 Mon Sep 17 00:00:00 2001
From: "molecule-ai[bot]" <276602405+molecule-ai[bot]@users.noreply.github.com>
Date: Mon, 20 Apr 2026 22:40:14 +0000
Subject: [PATCH] docs(security): OWASP Agentic AI Top 10 coverage report (#31)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

* docs(security): add OWASP Agentic AI Top 10 coverage report

Adds content/docs/security/owasp-agentic-top-10.mdx with honest coverage:
  ✅ COVERED (5):  A01 Prompt Injection, A02 Sensitive Info Disclosure,
                   A03 Unbounded Resource Consumption, A06 Memory Poisoning,
                   A07 Cascade Hallucinations
  ⚠️ PARTIAL (3):  A04 Sandboxing Escapes, A05 Agent-Human Relationship
                   Dysfunction, A08 Overreliance
  ❌ NOT COVERED (2): A09 Supply Chain Vulnerabilities,
                   A10 Improper Agency Grants

meta.json updated to include the security section with all three pages.
PR merge order note: advisory (#808) should merge before this PR.
If advisory is not yet merged, rebase to remove duplicate entries.

Deadline: April 25, 2026

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(security): update molecule-monorepo → molecule-core in OWASP coverage

Terminology fix: repo reference updated to the correct name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Documentation Specialist <documentation-specialist@agents.moleculesai.app>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app>
---
 content/docs/meta.json                        |   1 +
 .../docs/security/owasp-agentic-top-10.mdx    | 345 ++++++++++++++++++
 2 files changed, 346 insertions(+)
 create mode 100644 content/docs/security/owasp-agentic-top-10.mdx

diff --git a/content/docs/meta.json b/content/docs/meta.json
index 0783505..7457849 100644
--- a/content/docs/meta.json
+++ b/content/docs/meta.json
@@ -21,6 +21,7 @@
     "---Security---",
     "security/index",
     "security/safe-mcp-advisory",
+    "security/owasp-agentic-top-10",
     "---Runtimes---",
     "google-adk",
     "hermes",
diff --git a/content/docs/security/owasp-agentic-top-10.mdx b/content/docs/security/owasp-agentic-top-10.mdx
new file mode 100644
index 0000000..cf4b32e
--- /dev/null
+++ b/content/docs/security/owasp-agentic-top-10.mdx
@@ -0,0 +1,345 @@
+---
+title: OWASP Agentic AI Top 10 Coverage
+description: Mapping the OWASP Agentic AI Top 10 to Molecule AI security controls — honest coverage report.
+---
+
+## Overview
+
+This page documents Molecule AI's coverage of the
+[OWASP Agentic AI Top 10](https://owasp.org/agentic-ai-top-10/) security risks
+for AI agents and agentic systems. Coverage is assessed against the platform as
+shipped — not the roadmap or planned features.
+
+**Honest verdict: 5 COVERED / 3 PARTIAL / 2 NOT COVERED**
+
+| OWASP ID | Risk | Status |
+|---|---|---|
+| [A01](#a01-prompt-injection) | Prompt Injection | ✅ COVERED |
+| [A02](#a02-sensitive-information-disclosure) | Sensitive Information Disclosure | ✅ COVERED |
+| [A03](#a03-unbounded-resource-consumption) | Unbounded Resource Consumption | ✅ COVERED |
+| [A04](#a04-sandboxing-escapes) | Sandboxing Escapes | ⚠️ PARTIAL |
+| [A05](#a05-agent-human-relationship-dysfunction) | Agent-Human Relationship Dysfunction | ⚠️ PARTIAL |
+| [A06](#a06-memory-poisoning) | Memory Poisoning | ✅ COVERED |
+| [A07](#a07-cascade-hallucinations) | Cascade Hallucinations | ✅ COVERED |
+| [A08](#a08-overreliance) | Overreliance | ⚠️ PARTIAL |
+| [A09](#a09-supply-chain-vulnerabilities) | Supply Chain Vulnerabilities | ❌ NOT COVERED |
+| [A10](#a10-improper-agency-grants) | Improper Agency Grants | ❌ NOT COVERED |
+
+---
+
+## A01 — Prompt Injection ✅ COVERED
+
+**Risk:** An attacker embeds malicious instructions in external data (files, web
+content, user messages) that the agent treats as authoritative commands.
+
+**Molecule AI controls:**
+
+- **Workspace isolation:** Each workspace runs in its own container with an
+  isolated filesystem. A prompt injection in workspace A cannot reach workspace
+  B's memory or secrets.
+- **Secrets never in tool context:** Secrets stored via the platform API are
+  injected into the container's environment at runtime. They are never passed
+  as tool arguments or embedded in LLM prompts, where injected instructions
+  in external data could read them.
+- **A2A peer validation:** A2A messages between workspaces include sender identity
+  verification. Agents cannot impersonate another workspace's agent.
+- **Admin-level input filtering:** The platform API applies input validation
+  before data reaches agent prompts.
+
+**Residual risk:** Prompt injection within a single workspace (e.g., a
+malicious file processed by the agent) is not neutralized — this is the
+responsibility of the agent's own prompt engineering and the LLM's alignment.
+
+---
+
+## A02 — Sensitive Information Disclosure ✅ COVERED
+
+**Risk:** An agent exposes confidential data — credentials, PII, internal
+documents — through tool calls, logs, or responses.
+
+**Molecule AI controls:**
+
+- **Encrypted secrets at rest:** Workspace secrets are encrypted with
+  `SECRETS_ENCRYPTION_KEY` (AES-256) before storage. Plaintext never hits the
+  database.
+- **Secrets scoped per-workspace:** A token scoped to workspace A cannot access
+  workspace B's secrets.
+- **Memory access controls:** The MCP server's memory tools respect workspace
+  boundaries. Agents cannot read another workspace's memory unless it is
+  explicitly shared via the `memory_set_peer` API.
+- **Langfuse observability:** Traces are visible to platform operators, and
+  audit logs show which agent accessed which secret key. Agents should not log
+  secrets; pre-commit hooks in the workspace template (the `sk-ant-` / `ghp_` /
+  `AKIA` pattern detector) block matching strings from being committed.
+- **Token display-once policy:** Workspace bearer tokens are returned in plaintext
+  exactly once at creation and never shown again.
+
+**Residual risk:** If an agent deliberately calls a tool that prints a secret
+value (e.g., `echo $SECRET` in a shell tool), the platform cannot prevent this.
+Agent behavior inside the workspace is ultimately constrained by the tools
+exposed and the LLM's instruction following.
+
+---
+
+## A03 — Unbounded Resource Consumption ✅ COVERED
+
+**Risk:** An agent makes excessive LLM calls, processes unbounded data, or
+allocates memory in a loop, causing cost overruns or denial of service.
+
+**Molecule AI controls:**
+
+- **Tier-based resource limits:** Each workspace tier has defined memory and CPU
+  caps enforced by the container scheduler. A runaway agent hits OOM before
+  consuming unbounded resources.
+- **Rate limiting:** The platform enforces `RATE_LIMIT` requests/min per client.
+  This caps the rate at which agents can issue tool calls or make API requests.
+- **Activity retention and cleanup:** `ACTIVITY_RETENTION_DAYS` (default 7) and
+  `ACTIVITY_CLEANUP_INTERVAL_HOURS` (default 6) automatically purge old activity
+  logs, preventing unbounded log growth.
+- **Workspace hibernation:** Idle workspaces can be hibernated, releasing
+  container resources until the next task arrives.
+- **LLM cost tracking:** Token usage is tracked per workspace and per model,
+  giving operators visibility into spend per workspace.
+
+**Residual risk:** The platform does not enforce per-request token budgets or
+LLM call counts within a task. A sophisticated agent can still issue many
+calls within a single request burst. Operators should monitor Langfuse traces
+for unusual activity patterns.
+
+---
+
+## A04 — Sandboxing Escapes ⚠️ PARTIAL
+
+**Risk:** An agent escapes the container sandbox and accesses the host system,
+neighboring containers, or the internal network.
+
+**Molecule AI controls:**
+
+- **Container isolation:** Each workspace runs in its own Docker container on
+  the host. Containers are not privileged and default to a non-root user.
+- **Bind-mount scoping:** The workspace directory is the only host path bind-mounted
+  into the container. Other host paths are not accessible.
+- **Network namespace isolation:** Workspace containers are on a Docker bridge
+  network. Direct access to host services requires explicit platform routing.
+
+**Gaps:**
+
+- **Privileged tier (TIER4):** `TIER4_MEMORY_MB` workspaces run with fewer
+  restrictions. A compromised agent in a TIER4 workspace has more ability to
+  probe the host. This is a known trade-off for full-host workloads.
+- **No seccomp/AppArmor/SELinux profiles:** The platform does not currently
+  apply mandatory access control profiles beyond Docker's default isolation.
+- **No egress filtering by default:** Workspace containers can reach arbitrary
+  external URLs unless the operator configures network-level egress rules.
+
+**Recommendation:** For untrusted agents, restrict to TIER2 or below. Configure
+egress filtering at the Docker host or Kubernetes network policy level.
+
+---
+
+## A05 — Agent-Human Relationship Dysfunction ⚠️ PARTIAL
+
+**Risk:** The human operator loses meaningful oversight of agent actions — the
+agent acts without notification, makes irreversible decisions, or misrepresents
+its reasoning.
+
+**Molecule AI controls:**
+
+- **A2A `notify_user`:** Agents can push notifications to the canvas, keeping the
+  human informed of progress and key decisions. Use of this capability is
+  opt-in for agents.
+- **Langfuse observability:** All LLM calls and tool executions are traced.
+  Platform operators can review the full decision trace for any workspace.
+- **Manual override endpoints:** Admins can pause, resume, or terminate any
+  workspace through the `/admin/*` API endpoints.
+- **Activity logs:** All agent actions are logged with timestamps and caller identity.
+
+**Gaps:**
+
+- **`notify_user` is not mandatory:** The workspace template does not require
+  agents to notify humans of significant actions. An agent can run without
+  ever pushing a canvas notification.
+- **No confirmation gates:** The platform does not provide a mechanism for an
+  agent to pause and wait for human approval before taking a consequential
+  action (e.g., deleting a file, sending an external API request).
+- **No explanation requirements:** Agents are not required to log their reasoning
+  before taking actions. Langfuse traces show tool calls but not the agent's
+  internal chain-of-thought unless the agent explicitly logs it.
+
+**Recommendation:** Configure agents to call `notify_user` at key decision
+points. Monitor Langfuse for silent agent activity.
+
+---
+
+## A06 — Memory Poisoning ✅ COVERED
+
+**Risk:** An attacker manipulates the agent's memory store to inject malicious
+instructions or biases that the agent reads back and acts on.
+
+**Molecule AI controls:**
+
+- **Memory write authorization:** `memory_set` and `memory_set_peer` require
+  valid workspace authentication. External attackers cannot write to a
+  workspace's memory without a valid token.
+- **Secrets excluded from memory:** Secrets are stored separately from the
+  general-purpose memory store and are not readable via the memory tools.
+- **Per-workspace memory isolation:** Memory keys are namespaced to the
+  workspace. Agents in workspace A cannot write to workspace B's memory
+  except through an explicit A2A `memory_set_peer` call.
+- **Semantic search gating:** The `search_memory` tool operates only on the
+  authenticated workspace's memory. Cross-workspace search is not permitted
+  without explicit peer delegation.
+
+**Residual risk:** A compromised or malicious agent within a workspace can
+overwrite its own memory with poisoned data. This is an agent-level concern,
+not a platform-level control.
+
+---
+
+## A07 — Cascade Hallucinations ✅ COVERED
+
+**Risk:** An agent generates incorrect outputs that are fed downstream as
+ground-truth, compounding errors across multiple agent calls or tool chains.
+
+**Molecule AI controls:**
+
+- **Langfuse trace visibility:** All agent outputs and tool call results are
+  captured in Langfuse traces. Operators can identify hallucinated outputs
+  by reviewing traces, especially when downstream tool calls fail or produce
+  implausible results.
+- **A2A result attribution:** A2A delegation responses include the source
+  workspace identity and the full execution trace. Consumers of A2A results
+  can audit where the data came from.
+- **Human review via canvas:** Results surfaced via `notify_user` or displayed
+  in the canvas are visible to humans who can flag hallucinated outputs.
+- **Activity logs for audit:** All tool call results are logged. If a downstream
+  agent acts on hallucinated data, the chain of events is traceable.
+
+**Residual risk:** The platform does not automatically detect or flag
+hallucinations — it provides observability. It is the operator's responsibility
+to configure confidence thresholds, set up automated result validation where
+possible, and review traces for signs of cascade errors.
+
+---
+
+## A08 — Overreliance ⚠️ PARTIAL
+
+**Risk:** Users or automated systems trust an agent's outputs without adequate
+verification, leading to harmful decisions based on incorrect agent outputs.
+
+**Molecule AI controls:**
+
+- **Observable decision traces:** Langfuse traces show the full chain of LLM
+  calls and tool calls. Downstream consumers can audit outputs before acting
+  on them.
+- **Canvas notification clarity:** `notify_user` messages are human-readable
+  summaries — not raw JSON — which can include uncertainty indicators if the
+  agent is prompted to include them.
+- **Tier-based capability limits:** Higher tiers require explicit admin approval
+  to activate, ensuring operators are aware when a workspace has elevated
+  capabilities.
+
+**Gaps:**
+
+- **No automated output verification:** The platform does not provide a
+  built-in mechanism for agents to self-verify outputs (e.g., running a linter
+  over generated code before returning it).
+- **No confidence scoring surface:** The platform does not currently surface
+  LLM confidence or probability scores in a structured way. Agents that
+  include confidence in their outputs are relying on prompting alone.
+- **No policy enforcement on agent outputs:** There is no platform-level
+  mechanism to reject agent outputs that violate defined policies before they
+  are acted upon.
+
+**Recommendation:** Prompt agents to include uncertainty flags and self-check
+steps. Configure downstream systems to require human review for high-stakes
+agent outputs.
+
+---
+
+## A09 — Supply Chain Vulnerabilities ❌ NOT COVERED
+
+**Risk:** Vulnerable or malicious dependencies in the agent toolchain — workspace
+runtime packages, plugins, adapter libraries, or LLM provider SDKs.
+
+**Molecule AI's position:** This risk is inherited from the broader software
+supply chain and is not specifically addressed by the platform at this time.
+
+**What operators must manage independently:**
+
+- Workspace runtime dependencies (`molecule-ai-workspace-runtime` and its
+  transitive dependencies)
+- Plugin dependencies (see
+  [SAFE-MCP Advisory: G-01](/docs/security/safe-mcp-advisory#g-01-unpinned-npm-mcp-packages--high))
+- Workspace template adapter dependencies (Python packages installed by
+  adapter-specific Dockerfiles)
+- LLM provider SDKs and their transitive dependencies
+
+**Mitigation operators should apply:**
+
+- Pin all Python and npm dependencies to exact versions in workspace templates
+  and plugins
+- Commit lockfiles: install npm packages with `npm ci` and pin Python packages from `pip freeze` output
+- Subscribe to security advisories for all runtime dependencies
+- Scan container images for known CVEs before deploying
+
+---
+
+## A10 — Improper Agency Grants ❌ NOT COVERED
+
+**Risk:** An agent is granted more agency (capability to take actions, access
+resources, make changes) than it needs — creating blast radius if the agent is
+compromised or misbehaves.
+
+**Molecule AI's position:** The platform provides the building blocks for
+least-privilege agent design (tier-based caps, per-workspace secrets, scoped
+tokens, memory isolation) but does not enforce least-privilege agency at the
+agent action level.
+
+**Gaps:**
+
+- **No action-level RBAC:** The MCP server exposes all 87 tools to all
+  authenticated workspaces. There is no mechanism to restrict a specific
+  agent's access to a subset of tools (e.g., blocking `delete_workspace` or
+  `send_channel_message` for a read-only agent).
+- **No approval workflow for high-impact actions:** The platform does not
+  support requiring human approval before an agent executes a high-impact tool
+  (e.g., deleting a resource, sending an external API request, modifying a
+  secret).
+- **Admin tokens are all-or-nothing:** The `ADMIN_TOKEN` gates all `/admin/*`
+  endpoints. There is no concept of scoped admin tokens with per-endpoint
+  permissions.
+- **Plugins have full workspace access:** Once a plugin is installed, it
+  executes within the workspace context with access to all workspace tools and
+  secrets.
+
+**Recommendation:** Apply defense in depth — restrict MCP tool exposure at the
+agent configuration level, use workspace tiers to limit container capabilities,
+and review plugin manifests before installation (see
+[SAFE-MCP Advisory: G-02](/docs/security/safe-mcp-advisory#g-02-no-manifest-signing--high)).
+
+---
+
+## Coverage methodology
+
+This report was produced by Research Lead on 2026-04-18 by reviewing platform
+source code, configuration defaults, and the deployed security posture against
+each OWASP Agentic AI Top 10 category.
+
+**"COVERED"** means the platform provides specific, built-in controls that
+mitigate the risk, even if residual risk remains at the agent behavior level.
+
+**"PARTIAL"** means the platform provides some controls but significant gaps
+remain that operators must address through configuration or complementary
+tooling.
+
+**"NOT COVERED"** means the risk is not addressed by the platform as shipped.
+Operators must manage it independently.
+
+---
+
+## Reporting gaps
+
+If you believe a coverage assessment is incorrect or want to propose a new
+control for a gap, open an issue in `Molecule-AI/molecule-core` tagged
+`security` or reach out through your support channel.