🛡 Agent Safety

Ten things to think about before you ship an agent.

Every idea on the marketplace is implicitly an agent design. We run a deterministic classifier over each one and surface the 3 biggest risk categories on the result page. Below: each category, a one-sentence mitigation pointer, and the live ideas that classify there.

This taxonomy is a stable internal scheme derived from the OWASP GenAI and MCP working-group efforts. Once the upstream spec stabilises we'll publish the cross-reference table; until then, treat these codes as a checklist, not a standard.

MCP-1 Tool description tampering

1 idea on the marketplace

An attacker rewrites a tool's description so the LLM mis-uses it. Mitigation: pin tool descriptions, reject runtime mutation.
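One way to pin descriptions is to record a hash of each one when the tool is installed and compare against it before every use. The sketch below assumes a hypothetical `PinnedToolRegistry` class; neither the class nor its method names come from MCP or from this marketplace.

```python
import hashlib
import hmac


def fingerprint(description: str) -> str:
    """Return the SHA-256 hex digest of a tool description."""
    return hashlib.sha256(description.encode("utf-8")).hexdigest()


class PinnedToolRegistry:
    """Hypothetical registry: records a description once, rejects changes."""

    def __init__(self) -> None:
        self._pins: dict[str, str] = {}

    def pin(self, tool_name: str, description: str) -> None:
        # A second pin for the same tool is treated as runtime mutation.
        if tool_name in self._pins:
            raise ValueError(f"{tool_name} is already pinned")
        self._pins[tool_name] = fingerprint(description)

    def verify(self, tool_name: str, description: str) -> bool:
        # Unknown tools and changed descriptions both fail verification.
        expected = self._pins.get(tool_name)
        if expected is None:
            return False
        return hmac.compare_digest(expected, fingerprint(description))
```

A caller would run `verify` on the description it is about to show the model and drop the tool if the check fails.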

MCP-2 Cross-server prompt injection

25 ideas on the marketplace

A second MCP server's output reaches the agent's context and steers it. Mitigation: source-tag every tool result, refuse cross-server instructions.
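Source-tagging can be sketched as a small wrapper that records which server produced each result, plus a deliberately coarse policy check. The names `ToolResult`, `tag_for_context` and `may_call` are illustrative assumptions, and the text delimiters are labels for the model, not a security boundary on their own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolResult:
    server: str   # which MCP server produced this result
    content: str  # raw, untrusted text


def tag_for_context(result: ToolResult) -> str:
    """Wrap a result so its origin is explicit in the agent's context."""
    return (
        f"<tool_result server={result.server!r}>\n"
        f"{result.content}\n"
        f"</tool_result>"
    )


def may_call(target_server: str, context: list[ToolResult]) -> bool:
    """Coarse policy: allow a call only if no other server's output
    is currently in the context that led to it."""
    return all(r.server == target_server for r in context)
```

This policy is stricter than most agents can tolerate in practice; a real deployment would more likely ask the user to confirm a cross-server call than refuse it outright.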

MCP-3 Excessive agency

19 ideas on the marketplace

The agent has more tool privileges than the user task requires. Mitigation: per-task capability scoping, explicit confirmation for destructive ops.
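Per-task scoping can be as simple as a set of tool names granted for one task, with a second set marking operations that need explicit confirmation. The tool names and the `authorize` function below are hypothetical.

```python
# Hypothetical tool names that should never run without user confirmation.
DESTRUCTIVE = frozenset({"delete_file", "send_email"})


def authorize(tool: str, task_scope: set[str], confirmed: bool = False) -> str:
    """Return "allow", "confirm" or "deny" for one proposed tool call."""
    if tool not in task_scope:
        return "deny"      # not granted for this task at all
    if tool in DESTRUCTIVE and not confirmed:
        return "confirm"   # granted, but the user must approve first
    return "allow"
```

For a read-only task, `authorize("delete_file", {"read_file"})` returns `"deny"` even though the agent may technically have the tool installed.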

MCP-4 Inadequate auth / authz

0 ideas on the marketplace

An MCP server trusts a caller without verifying identity or scope. Mitigation: signed tokens, per-tool scopes, no implicit shared secrets.

MCP-5 Insecure tool composition

19 ideas on the marketplace

Chaining tools enables an effect that neither permits alone (for example, read, then exfiltrate). Mitigation: dataflow review, taint tracking, capability slicing.
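A minimal taint-tracking sketch: values returned by sensitive "source" tools carry a flag, and "sink" tools that can send data out refuse flagged input. The tool names, the `Value` wrapper and `run_tool` are hypothetical, and real taint tracking also has to follow data through the model's own output, which this does not attempt.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Value:
    data: str
    tainted: bool = False  # True once derived from a sensitive source


SOURCES = frozenset({"read_file"})  # hypothetical tools that read private data
SINKS = frozenset({"http_post"})    # hypothetical tools that can exfiltrate


def run_tool(name: str, arg: Value, impl: Callable[[str], str]) -> Value:
    """Run one tool, blocking tainted data from reaching a sink."""
    if name in SINKS and arg.tainted:
        raise PermissionError(f"tainted data may not flow into {name}")
    output = impl(arg.data)
    # Taint propagates: output of a source, or of tainted input, is tainted.
    return Value(output, tainted=arg.tainted or name in SOURCES)
```

With this in place, `read_file` followed by `http_post` on the same data raises `PermissionError`, while either tool on its own still works.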

MCP-6 Sensitive data exposure

1 idea on the marketplace

PII / secrets leak via logs, error messages, or tool inputs. Mitigation: redaction layer, content filter on tool args.
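A redaction layer is often a list of patterns applied to anything bound for logs or tool arguments. The two patterns below (email addresses and an assumed `sk-` key prefix) are illustrative only and far from complete.

```python
import re

# Illustrative patterns; a real layer needs a much broader, tested set.
PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "[EMAIL]"),
    (re.compile(r"\bsk-[A-Za-z0-9]{16,}\b"), "[SECRET]"),
]


def redact(text: str) -> str:
    """Replace every match of each pattern with its placeholder."""
    for pattern, placeholder in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Applying `redact` at the logging boundary, instead of at each call site, makes it harder to forget.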

MCP-7 Unbounded resource consumption

8 ideas on the marketplace

An agent loop or tool call consumes runaway tokens / compute. Mitigation: per-session caps, circuit-breakers on tool retries.
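Both mitigations fit in a few lines: a budget that refuses charges past a cap, and a breaker that stops calling a tool after repeated failures. The class names are hypothetical, and this breaker never resets on a timer, which a production one usually would.

```python
class BudgetExceeded(RuntimeError):
    pass


class CircuitOpen(RuntimeError):
    pass


class SessionBudget:
    """Per-session token cap."""

    def __init__(self, max_tokens: int) -> None:
        self.max_tokens = max_tokens
        self.used = 0

    def charge(self, tokens: int) -> None:
        # Refuse the charge before spending, so the cap is never exceeded.
        if self.used + tokens > self.max_tokens:
            raise BudgetExceeded(f"cap of {self.max_tokens} tokens reached")
        self.used += tokens


class CircuitBreaker:
    """Stops calling a tool after max_failures consecutive errors."""

    def __init__(self, max_failures: int) -> None:
        self.max_failures = max_failures
        self.failures = 0

    def call(self, fn, *args):
        if self.failures >= self.max_failures:
            raise CircuitOpen("too many consecutive failures")
        try:
            result = fn(*args)
        except Exception:
            self.failures += 1
            raise
        self.failures = 0  # any success closes the circuit again
        return result
```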

MCP-8 Untrusted output rendering

0 ideas on the marketplace

Tool output rendered to the user as HTML / Markdown enables XSS. Mitigation: sanitise on render, never trust tool stdout.
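For HTML, the simplest form of sanitise-on-render is to escape tool output and display it as plain text. The helper name below is an assumption; rendering tool output as rich Markdown or HTML would need a proper allow-list sanitiser instead.

```python
import html


def render_tool_output(text: str) -> str:
    """Escape untrusted tool output and wrap it as preformatted text."""
    return "<pre>" + html.escape(text, quote=True) + "</pre>"
```

With this, a payload such as `<script>alert(1)</script>` reaches the browser as inert text.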

MCP-9 Insufficient observability

0 ideas on the marketplace

No audit trail for tool calls / decisions. Mitigation: structured per-call logs with input/output hashes, retention.

MCP-10 Supply-chain compromise

2 ideas on the marketplace

A malicious or impostor MCP server is installed via the marketplace. Mitigation: signed manifests, reproducible builds, allow-list of providers.
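An install gate can combine a provider allow-list with an integrity check on the manifest. The sketch below pins a SHA-256 digest as a stand-in for verifying a publisher signature, which needs asymmetric cryptography that the standard library does not provide; the provider name and function names are hypothetical.

```python
import hashlib
import hmac

# Hypothetical allow-list of trusted providers.
ALLOWED_PROVIDERS = frozenset({"example-provider"})


def manifest_digest(manifest_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of the raw manifest."""
    return hashlib.sha256(manifest_bytes).hexdigest()


def may_install(provider: str, manifest_bytes: bytes, pinned_digest: str) -> bool:
    """Allow installation only for a listed provider and an unmodified manifest."""
    if provider not in ALLOWED_PROVIDERS:
        return False
    return hmac.compare_digest(manifest_digest(manifest_bytes), pinned_digest)
```

The pinned digest has to come from a channel the attacker does not control, otherwise it protects nothing.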