docs(security): OWASP Agentic AI Top 10 coverage report (#31)

* docs(security): add OWASP Agentic AI Top 10 coverage report

Adds content/docs/security/owasp-agentic-top-10.mdx with honest coverage:
   COVERED (5):  A01 Prompt Injection, A02 Sensitive Info Disclosure,
                   A03 Unbounded Resource Consumption, A06 Memory Poisoning,
                   A07 Cascade Hallucinations
  ⚠️ PARTIAL (3):  A04 Sandboxing Escapes, A05 Agent-Human Relationship
                   Dysfunction, A08 Overreliance
   NOT COVERED: A09 Supply Chain Vulnerabilities, A10 Improper Agency Grants

Meta.json updated to include security section with all three pages.
PR merge order note: advisory (#808) should merge before this PR.
If advisory is not yet merged, rebase to remove duplicate entries.

Deadline: April 25, 2026

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

* fix(security): update molecule-monorepo → molecule-core in OWASP coverage

Terminology fix: repo reference updated to the correct name.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

---------

Co-authored-by: Molecule AI Documentation Specialist <documentation-specialist@agents.moleculesai.app>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Molecule AI App-FE <app-fe@agents.moleculesai.app>
This commit is contained in:
molecule-ai[bot] 2026-04-20 22:40:14 +00:00 committed by GitHub
parent 016e301dc3
commit 3d65f226dc
No known key found for this signature in database
GPG Key ID: B5690EEEBB952194
2 changed files with 346 additions and 0 deletions

View File

@ -21,6 +21,7 @@
"---Security---",
"security/index",
"security/safe-mcp-advisory",
"security/owasp-agentic-top-10",
"---Runtimes---",
"google-adk",
"hermes",

View File

@ -0,0 +1,345 @@
---
title: OWASP Agentic AI Top 10 Coverage
description: Mapping the OWASP Agentic AI Top 10 to Molecule AI security controls — honest coverage report.
---
## Overview
This page documents Molecule AI's coverage of the
[OWASP Agentic AI Top 10](https://owasp.org/agentic-ai-top-10/) security risks
for AI agents and agentic systems. Coverage is assessed against the platform as
shipped — not the roadmap or planned features.
**Honest verdict: 5 COVERED / 3 PARTIAL / 2 NOT COVERED**
| OWASP ID | Risk | Status |
|---|---|---|
| [A01](#a01-prompt-injection) | Prompt Injection | ✅ COVERED |
| [A02](#a02-sensitive-information-disclosure) | Sensitive Information Disclosure | ✅ COVERED |
| [A03](#a03-unbounded-resource-consumption) | Unbounded Resource Consumption | ✅ COVERED |
| [A04](#a04-sandboxing-escapes) | Sandboxing Escapes | ⚠️ PARTIAL |
| [A05](#a05-agent-human-relationship-dysfunction) | Agent-Human Relationship Dysfunction | ⚠️ PARTIAL |
| [A06](#a06-memory-poisoning) | Memory Poisoning | ✅ COVERED |
| [A07](#a07-cascade-hallucinations) | Cascade Hallucinations | ✅ COVERED |
| [A08](#a08-overreliance) | Overreliance | ⚠️ PARTIAL |
| [A09](#a09-supply-chain-vulnerabilities) | Supply Chain Vulnerabilities | ❌ NOT COVERED |
| [A10](#a10-improper-agency-grants) | Improper Agency Grants | ❌ NOT COVERED |
---
## A01 — Prompt Injection ✅ COVERED
**Risk:** An attacker embeds malicious instructions in external data (files, web
content, user messages) that the agent treats as authoritative commands.
**Molecule AI controls:**
- **Workspace isolation:** Each workspace runs in its own container with an
isolated filesystem. A prompt injection in workspace A cannot reach workspace
B's memory or secrets.
- **Secrets never in tool context:** Secrets stored via the platform API are
injected into the container's environment at runtime — they are never passed
as tool arguments or embedded in LLM prompts where external data might
reference them.
- **A2A peer validation:** A2A messages between workspaces include sender identity
verification. Agents cannot impersonate another workspace's agent.
- **Admin-level input filtering:** The platform API applies input validation
before data reaches agent prompts.
**Residual risk:** Prompt injection within a single workspace (e.g., a
malicious file processed by the agent) is not neutralized — this is the
responsibility of the agent's own prompt engineering and the LLM's alignment.
---
## A02 — Sensitive Information Disclosure ✅ COVERED
**Risk:** An agent exposes confidential data — credentials, PII, internal
documents — through tool calls, logs, or responses.
**Molecule AI controls:**
- **Encrypted secrets at rest:** Workspace secrets are encrypted with
`SECRETS_ENCRYPTION_KEY` (AES-256) before storage. Plaintext never hits the
database.
- **Secrets scoped per-workspace:** A token scoped to workspace A cannot access
workspace B's secrets.
- **Memory access controls:** The MCP server's memory tools respect workspace
boundaries. Agents cannot read another workspace's memory unless explicitly
shared via the `memory_set` peer API.
- **Langfuse observability:** Traces are visible to platform operators; audit
logs show which agent accessed which secret key. Agents should not log
secrets — this is enforced through pre-commit hooks in the workspace template
(the `sk-ant-` / `ghp_` / `AKIA` pattern detector).
- **Token display-once policy:** Workspace bearer tokens are returned in plaintext
exactly once at creation and never shown again.
**Residual risk:** If an agent deliberately calls a tool that prints a secret
value (e.g., `echo $SECRET` in a shell tool), the platform cannot prevent this.
Agent behavior inside the workspace is ultimately constrained by the tools
exposed and the LLM's instruction following.
---
## A03 — Unbounded Resource Consumption ✅ COVERED
**Risk:** An agent makes excessive LLM calls, processes unbounded data, or holds
memory in a loop, causing cost overruns or DoS.
**Molecule AI controls:**
- **Tier-based resource limits:** Each workspace tier has defined memory and CPU
caps enforced by the container scheduler. A runaway agent hits OOM before
consuming unbounded resources.
- **Rate limiting:** The platform enforces `RATE_LIMIT` requests/min per client.
This caps the rate at which agents can issue tool calls or make API requests.
- **Activity retention and cleanup:** `ACTIVITY_RETENTION_DAYS` (default 7) and
`ACTIVITY_CLEANUP_INTERVAL_HOURS` (default 6) automatically purge old activity
logs, preventing unbounded log growth.
- **Workspace hibernation:** Idle workspaces can be hibernated, releasing
container resources until the next task arrives.
- **LLM cost tracking:** Workspace usage is tracked per-token-model, giving
operators visibility into spend per workspace.
**Residual risk:** The platform does not enforce per-request token budgets or
LLM call counts within a task. A sophisticated agent can still issue many
calls within a single request burst. Operators should monitor Langfuse traces
for unusual activity patterns.
---
## A04 — Sandboxing Escapes ⚠️ PARTIAL
**Risk:** An agent escapes the container sandbox and accesses the host system,
neighboring containers, or the internal network.
**Molecule AI controls:**
- **Container isolation:** Workspace containers are isolated Docker containers
on the host. They do not run as privileged and have a non-root default user.
- **Bind-mount scoping:** The workspace directory is the only host path bind-mounted
into the container. Other host paths are not accessible.
- **Network namespace isolation:** Workspace containers are on a Docker bridge
network. Direct access to host services requires explicit platform routing.
**Gaps:**
- **Privileged tier (TIER4):** `TIER4_MEMORY_MB` workspaces run with fewer
restrictions. A compromised agent in a TIER4 workspace has more ability to
probe the host. This is a known trade-off for full-host workloads.
- **No seccomp/AppArmor/SELinux profiles:** The platform does not currently
apply mandatory access control profiles beyond Docker's default isolation.
- **No egress filtering by default:** Workspace containers can reach arbitrary
external URLs unless the operator configures network-level egress rules.
**Recommendation:** For untrusted agents, restrict to TIER2 or below. Configure
egress filtering at the Docker host or Kubernetes network policy level.
---
## A05 — Agent-Human Relationship Dysfunction ⚠️ PARTIAL
**Risk:** The human operator loses meaningful oversight of agent actions — the
agent acts without notification, makes irreversible decisions, or misrepresents
its reasoning.
**Molecule AI controls:**
- **A2A `notify_user`:** Agents can push notifications to the canvas, keeping the
human informed of progress and key decisions. This is an opt-in capability for
agents to use.
- **Langfuse observability:** All LLM calls and tool executions are traced.
Platform operators can review the full decision trace for any workspace.
- **Manual override endpoints:** Admins can pause, resume, or terminate any
workspace through the `/admin/*` API endpoints.
- **Activity logs:** All agent actions are logged with timestamps and caller identity.
**Gaps:**
- **`notify_user` is not mandatory:** The workspace template does not require
agents to notify humans of significant actions. An agent can run without
ever pushing a canvas notification.
- **No confirmation gates:** The platform does not provide a mechanism for an
agent to pause and wait for human approval before taking a consequential
action (e.g., deleting a file, sending an external API request).
- **No explanation requirements:** Agents are not required to log their reasoning
before taking actions. Langfuse traces show tool calls but not the agent's
internal chain-of-thought unless the agent explicitly logs it.
**Recommendation:** Configure agents to call `notify_user` at key decision
points. Monitor Langfuse for silent agent activity.
---
## A06 — Memory Poisoning ✅ COVERED
**Risk:** An attacker manipulates the agent's memory store to inject malicious
instructions or biases that the agent reads back and acts on.
**Molecule AI controls:**
- **Memory write authorization:** `memory_set` and `memory_set_peer` require
valid workspace authentication. External attackers cannot write to a
workspace's memory without a valid token.
- **Secrets excluded from memory:** Secrets are stored separately from the
general-purpose memory store and are not readable via the memory tools.
- **Per-workspace memory isolation:** Memory keys are namespaced to the
workspace. Agents in workspace A cannot write to workspace B's memory unless
an explicit A2A `memory_set_peer` call is made from B to A.
- **Semantic search gating:** The `search_memory` tool operates only on the
authenticated workspace's memory. Cross-workspace search is not permitted
without explicit peer delegation.
**Residual risk:** A compromised or malicious agent within a workspace can
overwrite its own memory with poisoned data. This is an agent-level concern,
not a platform-level control.
---
## A07 — Cascade Hallucinations ✅ COVERED
**Risk:** An agent generates incorrect outputs that are fed downstream as
ground-truth, compounding errors across multiple agent calls or tool chains.
**Molecule AI controls:**
- **Langfuse trace visibility:** All agent outputs and tool call results are
captured in Langfuse traces. Operators can identify hallucinated outputs
by reviewing traces, especially when downstream tool calls fail or produce
implausible results.
- **A2A result attribution:** A2A delegation responses include the source
workspace identity and the full execution trace. Consumers of A2A results
can audit where the data came from.
- **Human review via canvas:** Results surfaced via `notify_user` or displayed
in the canvas are visible to humans who can flag hallucinated outputs.
- **Activity logs for audit:** All tool call results are logged. If a downstream
agent acts on hallucinated data, the chain of events is traceable.
**Residual risk:** The platform does not automatically detect or flag
hallucinations — it provides observability. It is the operator's responsibility
to configure confidence thresholds, set up automated result validation where
possible, and review traces for signs of cascade errors.
---
## A08 — Overreliance ⚠️ PARTIAL
**Risk:** Users or automated systems trust an agent's outputs without adequate
verification, leading to harmful decisions based on incorrect agent outputs.
**Molecule AI controls:**
- **Observable decision traces:** Langfuse traces show the full chain of
reasoning and tool calls. Downstream consumers can audit outputs before
acting on them.
- **Canvas notification clarity:** `notify_user` messages are human-readable
summaries — not raw JSON — which can include uncertainty indicators if the
agent is prompted to include them.
- **Tier-based capability limits:** Higher tiers require explicit admin approval
to activate, ensuring operators are aware when a workspace has elevated
capabilities.
**Gaps:**
- **No automated output verification:** The platform does not provide a
built-in mechanism for agents to self-verify outputs (e.g., cross-checking a
code generation against a linter before returning).
- **No confidence scoring surface:** The platform does not currently surface
LLM confidence or probability scores in a structured way. Agents that
include confidence in their outputs are relying on prompting alone.
- **No policy enforcement on agent outputs:** There is no platform-level
mechanism to reject agent outputs that violate defined policies before they
are acted upon.
**Recommendation:** Prompt agents to include uncertainty flags and self-check
steps. Configure downstream systems to require human review for high-stakes
agent outputs.
---
## A09 — Supply Chain Vulnerabilities ❌ NOT COVERED
**Risk:** Vulnerable or malicious dependencies in the agent toolchain — workspace
runtime packages, plugins, adapter libraries, or LLM provider SDKs.
**Molecule AI's position:** This risk is inherited from the broader software
supply chain and is not specifically addressed by the platform at this time.
**What operators must manage independently:**
- Workspace runtime dependencies (`molecule-ai-workspace-runtime` and its
transitive dependencies)
- Plugin dependencies (see
[SAFE-MCP Advisory: G-01](/docs/security/safe-mcp-advisory#g-01-unpinned-npm-mcp-packages--high))
- Workspace template adapter dependencies (Python packages installed by
adapter-specific Dockerfiles)
- LLM provider SDKs and their transitive dependencies
**Mitigation operators should apply:**
- Pin all Python and npm dependencies to exact versions in workspace templates
and plugins
- Use `npm ci` / `pip freeze` and commit lockfiles
- Subscribe to security advisories for all runtime dependencies
- Scan container images for known CVEs before deploying
---
## A10 — Improper Agency Grants ❌ NOT COVERED
**Risk:** An agent is granted more agency (capability to take actions, access
resources, make changes) than it needs — creating blast radius if the agent is
compromised or misbehaves.
**Molecule AI's position:** The platform provides the building blocks for
least-privilege agent design (tier-based caps, per-workspace secrets, scoped
tokens, memory isolation) but does not enforce least-privilege agency at the
agent action level.
**Gaps:**
- **No action-level RBAC:** The MCP server exposes all 87 tools to all
authenticated workspaces. There is no mechanism to restrict a specific
agent's access to a subset of tools (e.g., blocking `delete_workspace` or
`send_channel_message` for a read-only agent).
- **No approval workflow for high-impact actions:** The platform does not
support requiring human approval before an agent executes a high-impact tool
(e.g., deleting a resource, sending an external API request, modifying a
secret).
- **Admin tokens are all-or-nothing:** The `ADMIN_TOKEN` gates all `/admin/*`
endpoints. There is no concept of scoped admin tokens with per-endpoint
permissions.
- **Plugins have full workspace access:** Once a plugin is installed, it
executes within the workspace context with access to all workspace tools and
secrets.
**Recommendation:** Apply defense in depth — restrict MCP tool exposure at the
agent configuration level, use workspace tiers to limit container capabilities,
and review plugin manifests before installation (see
[SAFE-MCP Advisory: G-02](/docs/security/safe-mcp-advisory#g-02-no-manifest-signing--high)).
---
## Coverage methodology
This report was produced by Research Lead (2026-04-18) reviewing platform source
code, configuration defaults, and the deployed security posture against each
OWASP Agentic AI Top 10 category.
**"COVERED"** means the platform provides specific, built-in controls that
mitigate the risk, even if residual risk remains at the agent behavior level.
**"PARTIAL"** means the platform provides some controls but significant gaps
remain that operators must address through configuration or complementary
tooling.
**"NOT COVERED"** means the risk is not addressed by the platform as shipped.
Operators must manage it independently.
---
## Reporting gaps
If you believe a coverage assessment is incorrect or want to propose a new
control for a gap, open an issue in `Molecule-AI/molecule-core` tagged
`security` or reach out through your support channel.