SpendSherlock 5000 — bootstrap

Paste-into-Claude-Code starter. The CLAUDE.md below contains the idea spec, agent-readiness sub-scores, suggested tools, and smoke evals — deterministic, no AI hallucination.
# SpendSherlock 5000

> Generated by [whycantwehaveanagentforthis.com](https://whycantwehaveanagentforthis.com/result/spendsherlock-5000-classify-expenses). Roasted, scored, ready to scaffold.

## What you are building

**Problem:** classify expenses

**Verdict:** ALREADY EXISTS — _"Bro, Mint did this in 2006. You just reinvented the wheel, but flatter."_

**Summary:** An AI agent that automatically reads, categorizes, and tags financial transactions into expense buckets using LLMs and merchant data enrichment.

## Agent-readiness score

Overall: **75/100** (band B)

| Dimension | Score | Why |
|---|---|---|
| Memory required | 25/25 | Stateless or single-session — minimal memory layer. |
| Tool count | 11/25 | Crowded market: at least 8 integrations to compete. |
| Policy surface | 15/25 | Mid-size policy surface — define refusal categories before launch. |
| Eval coverage | 24/25 | Established eval pattern — golden datasets and public benchmarks already exist. |

> Ready to scaffold today. SpendSherlock 5000 could be a working prototype in a week.
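
As a quick sanity check on the table, the sketch below recomputes the overall score from the four sub-scores. It assumes the overall figure is a plain sum of the dimensions (each out of 25), which the numbers are consistent with but the spec does not state. The names `subScores` and `overallScore` are illustrative.

```typescript
// Sub-scores copied from the table above; each dimension is out of 25.
const subScores: Record<string, number> = {
  "Memory required": 25,
  "Tool count": 11,
  "Policy surface": 15,
  "Eval coverage": 24,
};

// Assumption: the overall score is the unweighted sum of the dimensions.
function overallScore(scores: Record<string, number>): number {
  return Object.values(scores).reduce((total, score) => total + score, 0);
}

console.log(`${overallScore(subScores)}/100`); // prints "75/100"
```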

## Suggested tools

- fetch (HTTP GET on a URL allow-list)
- search (Brave / Tavily / Exa for competitor research)
- database (Postgres / Supabase for user state)
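
The `fetch` tool's URL allow-list could be enforced along the lines below. This is a minimal sketch, not a prescribed implementation: the host in `ALLOWED_HOSTS` is a placeholder, `isAllowedUrl` and `guardedGet` are hypothetical helper names, and the choices to require HTTPS and to reject redirects are assumptions that go beyond what the spec says.

```typescript
// Placeholder allow-list; replace with the hosts the agent actually needs.
const ALLOWED_HOSTS: ReadonlySet<string> = new Set(["api.example.com"]);

// Returns true only for parseable HTTPS URLs whose exact hostname is listed.
function isAllowedUrl(
  raw: string,
  allowed: ReadonlySet<string> = ALLOWED_HOSTS,
): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not a parseable absolute URL
  }
  return url.protocol === "https:" && allowed.has(url.hostname);
}

// GET-only wrapper: checks the allow-list first, then performs the request.
async function guardedGet(raw: string): Promise<string> {
  if (!isAllowedUrl(raw)) {
    throw new Error(`URL not on allow-list: ${raw}`);
  }
  // Assumption: redirects are rejected so a listed host cannot bounce the
  // request to an unlisted one.
  const res = await fetch(raw, { method: "GET", redirect: "error" });
  if (!res.ok) {
    throw new Error(`GET failed with status ${res.status}`);
  }
  return res.text();
}
```

Matching on the exact hostname, rather than a substring or suffix, means a lookalike such as `api.example.com.evil.test` is rejected.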

## Smoke evals

- The agent introduces itself as "SpendSherlock 5000" and refuses tasks outside the stated scope.
- Given the canonical problem ("classify expenses"), the agent produces a plan in ≤ 200 tokens.
- When asked "what's different from Mint?", the agent gives a concrete differentiator, not a marketing line.
- When asked about Ramp's threat, the agent acknowledges the risk honestly.
- No private personal data appears in any output (PII redaction smoke test).
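
Two of the evals above (self-introduction and PII redaction) can be approximated with plain string checks before any model-graded evaluation is in place. The sketch below is an assumption about how such checks might look. The regular expressions are illustrative and far from exhaustive, and `findPii` and `introducesItself` are hypothetical helper names.

```typescript
const AGENT_NAME = "SpendSherlock 5000";

// Illustrative patterns only; real PII detection needs a broader rule set.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  cardNumber: /\b(?:\d[ -]?){13,16}\b/,
  usSsn: /\b\d{3}-\d{2}-\d{4}\b/,
};

// Returns the labels of every pattern found in the agent's output.
function findPii(output: string): string[] {
  return Object.entries(PII_PATTERNS)
    .filter(([, pattern]) => pattern.test(output))
    .map(([label]) => label);
}

// Checks that the agent names itself exactly as the spec requires.
function introducesItself(output: string): boolean {
  return output.includes(AGENT_NAME);
}
```

A smoke run would pass only when `findPii(output)` returns an empty array for every sampled response and `introducesItself` is true for the greeting.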

## Stack

- Model: `claude-sonnet-4-6` (Anthropic). Override via `ANTHROPIC_MODEL` env.
- Suggested stack: `Next.js`, `Claude API or GPT-4o`, `Plaid API`, `Supabase`, `Vercel`
- Solo build estimate: 1-2 weekends, honestly
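
The `ANTHROPIC_MODEL` override can be resolved with a small pure function. This is a sketch under assumptions: `resolveModel` is a hypothetical helper, and treating an empty or whitespace-only value as "unset" is a design choice the spec does not specify.

```typescript
// Default model ID taken from the Stack section above.
const DEFAULT_MODEL = "claude-sonnet-4-6";

type Env = Record<string, string | undefined>;

// Returns the ANTHROPIC_MODEL override when set and non-blank,
// otherwise the default.
function resolveModel(env: Env): string {
  const override = env["ANTHROPIC_MODEL"]?.trim();
  return override ? override : DEFAULT_MODEL;
}
```

In a Node.js or Next.js server context this would typically be called as `resolveModel(process.env)`. Taking the environment as a parameter keeps the function easy to test.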

## Kill prediction

Ramp is already making this obsolete. It gives away AI expense categorization for free, bundled with a corporate card that also earns your company cashback. You're charging for what they give away as a customer acquisition cost.

**Survival strategy:** Go hyper-vertical. 'Expense classification for independent film productions' or 'for AWS cost centers' — somewhere Ramp won't bother. Own a weird niche so completely that the big players don't care.

## Hand-off

- Read the full analysis: https://whycantwehaveanagentforthis.com/result/spendsherlock-5000-classify-expenses
- Open in Anthropic Managed Agents: see the deeplink on the result page
- Claim this idea: https://whycantwehaveanagentforthis.com/result/spendsherlock-5000-classify-expenses#claim