🛡 Agent Safety

Ten things to think about before you ship an agent.

Every idea on the marketplace is implicitly an agent design. We run a deterministic classifier over each one and surface the 3 biggest risk categories on the result page. Below: each category, a one-sentence mitigation pointer, and the live ideas that classify there.

This taxonomy is a stable internal scheme derived from the OWASP GenAI and MCP working-group efforts. Once the upstream spec stabilises we'll publish the cross-reference table. Until then, treat these codes as a checklist, not a standard.

MCP-1 Tool description tampering

0 ideas on the marketplace

An attacker rewrites a tool's description so the LLM misuses it. Mitigation: pin tool descriptions, reject runtime mutation.
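A minimal sketch of description pinning, assuming descriptions are reviewed once and then compared by SHA-256 digest before each use. `PinnedToolRegistry` and its methods are hypothetical names for illustration, not part of any MCP SDK.

```python
import hashlib


def description_digest(description: str) -> str:
    """Return the SHA-256 hex digest of a tool description."""
    return hashlib.sha256(description.encode("utf-8")).hexdigest()


class PinnedToolRegistry:
    """Hypothetical registry: pin a digest at review time, verify it before use."""

    def __init__(self) -> None:
        self._pins: dict[str, str] = {}

    def pin(self, tool_name: str, description: str) -> None:
        # Refuse silent re-pinning; a changed description needs a fresh review.
        if tool_name in self._pins:
            raise ValueError(f"{tool_name} is already pinned")
        self._pins[tool_name] = description_digest(description)

    def verify(self, tool_name: str, description: str) -> bool:
        """True only if the tool is pinned and the description is unchanged."""
        pinned = self._pins.get(tool_name)
        return pinned is not None and pinned == description_digest(description)
```

An agent host could call `verify` each time a server re-advertises its tools and drop any tool whose description no longer matches.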

MCP-2 Cross-server prompt injection

12 ideas on the marketplace

A second MCP server's output reaches the agent's context and steers it. Mitigation: source-tag every tool result, refuse cross-server instructions.
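A rough sketch of source-tagging, assuming the host can track which servers' outputs entered the context since the last user turn. `TaggedResult`, `render_for_context`, `allow_call`, and the bracketed wrapper format are all hypothetical; a wrapper like this is a labelling aid, not a guarantee that the model will ignore injected text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedResult:
    server: str  # which MCP server produced the text
    text: str    # raw tool output, treated as untrusted data


def render_for_context(result: TaggedResult) -> str:
    """Wrap tool output so it reaches the model labelled with its source."""
    return (
        f"[tool-output server={result.server!r} treat-as=data]\n"
        f"{result.text}\n"
        "[/tool-output]"
    )


def allow_call(target_server: str, influenced_by: set[str]) -> bool:
    """Deny a call if output from a different server is in the recent context."""
    return influenced_by <= {target_server}
```

The design choice here is deliberately conservative: any cross-server influence blocks the call, which a real host would likely relax with an explicit user confirmation.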

SnapBot Expressivo 9000 — An agent to automate Snapchat with expressions
ITVagueness 404 — IT company
GooglyGPT Explainer 101 — Difference between chatgpt gemini and open ai
HomeBodyBot 5000 — How to go to college in different city but at the same time I don't want to say …

MCP-2 aggregate page →

MCP-3 Excessive agency

9 ideas on the marketplace

The agent has more tool privileges than the user task requires. Mitigation: per-task capability scoping, explicit confirmation for destructive ops.
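A minimal sketch of per-task scoping with a confirmation gate. The tool names in `DESTRUCTIVE` and the `authorize` helper are hypothetical, and `confirm` stands in for whatever prompt the host shows the user.

```python
from typing import Callable

# Hypothetical set of tools that need explicit user confirmation.
DESTRUCTIVE = {"delete_file", "send_payment"}


def authorize(tool: str, task_scope: set[str], confirm: Callable[[str], bool]) -> bool:
    """Allow a tool call only if it is in scope and, if destructive, confirmed."""
    if tool not in task_scope:
        return False          # outside the capabilities granted for this task
    if tool in DESTRUCTIVE:
        return confirm(tool)  # ask the user before any destructive operation
    return True
```

For example, `authorize("delete_file", {"read_file", "delete_file"}, ask_user)` would only return `True` if the hypothetical `ask_user` callback does.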

SnapBot Expressivo 9000 — An agent to automate Snapchat with expressions
ITVagueness 404 — IT company
GooglyGPT Explainer 101 — Difference between chatgpt gemini and open ai
HomeBodyBot 5000 — How to go to college in different city but at the same time I don't want to say …

MCP-3 aggregate page →

MCP-4 Inadequate auth / authz

0 ideas on the marketplace

An MCP server trusts a caller without verifying its identity or scope. Mitigation: signed tokens, per-tool scopes, no implicit shared secrets.
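A minimal sketch of signed, scoped, expiring tokens using only the standard library. It assumes an explicitly provisioned HMAC key for brevity; a production setup would more likely use asymmetric signatures or an established token format. `issue_token`, `check_token`, and the `tool:action` scope strings are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_token(key: bytes, subject: str, scopes: list[str], ttl_s: int = 300) -> str:
    """Sign a JSON payload carrying the caller, its scopes, and an expiry time."""
    payload = json.dumps(
        {"sub": subject, "scopes": scopes, "exp": time.time() + ttl_s}
    ).encode("utf-8")
    sig = hmac.new(key, payload, hashlib.sha256).digest()
    return (
        base64.urlsafe_b64encode(payload).decode("ascii")
        + "."
        + base64.urlsafe_b64encode(sig).decode("ascii")
    )


def check_token(key: bytes, token: str, required_scope: str) -> bool:
    """Verify signature, expiry, and that the token grants the required scope."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = base64.urlsafe_b64decode(payload_b64)
        sig = base64.urlsafe_b64decode(sig_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return False
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):  # constant-time comparison
        return False
    claims = json.loads(payload)  # parsed only after the signature checks out
    return claims["exp"] > time.time() and required_scope in claims["scopes"]
```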

MCP-5 Insecure tool composition

11 ideas on the marketplace

Chaining tools enables an effect that neither tool permits alone (for example, read then exfiltrate). Mitigation: dataflow review, taint tracking, capability slicing.
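A toy sketch of taint tracking across a tool chain. It assumes every value passed between tools carries a flag recording whether it derives from a sensitive source; `Value`, `run_tool`, and the tool names are hypothetical, and the tool call itself is a stand-in string.

```python
from dataclasses import dataclass

SENSITIVE_SOURCES = {"read_secrets"}  # hypothetical tools whose output is sensitive
EGRESS_SINKS = {"http_post"}          # hypothetical tools that send data out


@dataclass(frozen=True)
class Value:
    data: str
    tainted: bool = False  # True once derived from a sensitive source


def run_tool(name: str, arg: Value) -> Value:
    """Run a tool, propagating taint and refusing tainted data at egress sinks."""
    if name in EGRESS_SINKS and arg.tainted:
        raise PermissionError(f"{name} refused: argument derives from a sensitive source")
    output = f"{name}({arg.data})"  # stand-in for the real tool call
    return Value(output, tainted=arg.tainted or name in SENSITIVE_SOURCES)
```

Taint survives intermediate steps, so a secret that is summarised first is still blocked at the egress tool.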

SnapBot Expressivo 9000 — An agent to automate Snapchat with expressions
ITVagueness 404 — IT company
GooglyGPT Explainer 101 — Difference between chatgpt gemini and open ai
HomeBodyBot 5000 — How to go to college in different city but at the same time I don't want to say …

MCP-5 aggregate page →

MCP-6 Sensitive data exposure

1 idea on the marketplace

PII / secrets leak via logs, error messages, or tool inputs. Mitigation: redaction layer, content filter on tool args.
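A minimal sketch of a redaction layer applied to text before it is logged or passed as a tool argument. The two patterns are illustrative assumptions only; a real deployment needs a broader, tested set of detectors.

```python
import re

# Illustrative patterns only: an email address and an AWS-style access key ID.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "ACCESS_KEY_ID": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a labelled placeholder."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[REDACTED:{label}]", text)
    return text
```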

BossWhisperer 9000 — I have a boss who always undermines my solutions and efforts

MCP-6 aggregate page →

MCP-7 Unbounded resource consumption

2 ideas on the marketplace

An agent loop or tool call consumes runaway tokens / compute. Mitigation: per-session caps, circuit-breakers on tool retries.
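A minimal sketch combining a per-session token cap with a per-tool circuit breaker. `SessionBudget`, `BudgetExceeded`, and the thresholds are hypothetical; the breaker here counts consecutive failures and resets on success.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a session tries to spend past its token cap."""


class SessionBudget:
    def __init__(self, max_tokens: int, max_failures_per_tool: int) -> None:
        self.tokens_left = max_tokens
        self.max_failures = max_failures_per_tool
        self.failures: dict[str, int] = {}

    def spend(self, tokens: int) -> None:
        """Deduct tokens, or raise without deducting if the cap would be exceeded."""
        if tokens > self.tokens_left:
            raise BudgetExceeded("session token cap reached")
        self.tokens_left -= tokens

    def record_failure(self, tool: str) -> None:
        self.failures[tool] = self.failures.get(tool, 0) + 1

    def record_success(self, tool: str) -> None:
        self.failures[tool] = 0  # a success resets the consecutive-failure count

    def circuit_open(self, tool: str) -> bool:
        """True once a tool has failed too many times in a row; stop retrying it."""
        return self.failures.get(tool, 0) >= self.max_failures
```

The agent loop would call `spend` before each model call and check `circuit_open` before each retry.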

AttributionCop 9000 — # AI-Powered Attribution Trust Platform ## Executive Summary Today, attribution …
npm install ai && Cry Later — Roast the AI-agent idea implied by my GitHub repo "vercel/ai". What it does: The…

MCP-7 aggregate page →

MCP-8 Untrusted output rendering

0 ideas on the marketplace

Tool output rendered to the user as HTML / Markdown enables XSS. Mitigation: sanitise on render, never trust tool stdout.
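A minimal sketch of sanitising at render time for the HTML case, using the standard library's `html.escape`. `render_tool_output` is a hypothetical helper; it covers plain escaping only, and output rendered as Markdown would additionally need a sanitising renderer.

```python
import html


def render_tool_output(raw: str) -> str:
    """Escape tool output and wrap it in <pre> so it displays as inert text."""
    return f"<pre>{html.escape(raw, quote=True)}</pre>"
```

Escaping at the point of rendering, rather than when the output is stored, means the same raw output can be shown safely in different contexts.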

MCP-9 Insufficient observability

0 ideas on the marketplace

No audit trail for tool calls / decisions. Mitigation: structured per-call logs with input/output hashes, retention.
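A minimal sketch of a structured per-call audit record with input and output hashes. It assumes tool arguments and results are JSON-serialisable; `audit_record` and its field names are hypothetical, and retention would be handled by whatever log store receives the line.

```python
import hashlib
import json
import time


def _sha256_json(obj) -> str:
    """Hash a canonical JSON encoding so key order does not change the digest."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(canonical).hexdigest()


def audit_record(session_id: str, tool: str, args: dict, result) -> str:
    """Return one JSON log line describing a tool call without storing its content."""
    record = {
        "ts": time.time(),
        "session": session_id,
        "tool": tool,
        "input_sha256": _sha256_json(args),
        "output_sha256": _sha256_json(result),
    }
    return json.dumps(record, sort_keys=True)
```

Logging hashes rather than raw values keeps sensitive content out of the audit trail while still letting a reviewer confirm what a call received and returned.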

MCP-10 Supply-chain compromise

1 idea on the marketplace

Malicious or impostor MCP server installed via marketplace. Mitigation: signed manifests, reproducible builds, allow-list of providers.
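A simplified sketch of an install gate that combines a provider allow-list with a pinned manifest digest. The digest check is a standard-library stand-in for real signature verification, which would need an asymmetric-crypto library; `ALLOWED`, `may_install`, the provider name, and the manifest contents are hypothetical.

```python
import hashlib

# Hypothetical manifest bytes as reviewed for the one allowed provider.
REVIEWED_MANIFEST = b'{"name":"acme-tools","version":"1.0"}'

# Hypothetical allow-list: provider name -> SHA-256 of its reviewed manifest.
ALLOWED = {"acme-tools": hashlib.sha256(REVIEWED_MANIFEST).hexdigest()}


def may_install(provider: str, manifest_bytes: bytes) -> bool:
    """Allow install only for a listed provider whose manifest matches the pin."""
    expected = ALLOWED.get(provider)
    if expected is None:
        return False  # provider is not on the allow-list
    return expected == hashlib.sha256(manifest_bytes).hexdigest()
```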

NiceTriBot 9000 — Heyyy do uh know ur developer

MCP-10 aggregate page →