# GrepFirst McFallthrough

> Generated by [whycantwehaveanagentforthis.com](https://whycantwehaveanagentforthis.com/result/grepfirst-mcfallthrough-small-opensource-retrieval). Roasted, scored, ready to scaffold.

## What you are building

**Problem:** I want a small open-source MCP retrieval router that defaults to grep and only falls through to vector search when the query actually looks semantic. Worth shipping?

**Verdict:** ACTUALLY NOT BAD — _"Finally, someone who remembers that grep is O(n) and your RAM isn't free."_

**Summary:** An MCP-native retrieval router that classifies incoming queries as lexical vs. semantic and dispatches to ripgrep or a vector store accordingly, with zero config defaults.
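The lexical-vs-semantic split can be pictured as a cheap heuristic classifier. The sketch below is an assumption, not the project's actual logic: the name `classifyQuery`, the list of signals, and the four-word threshold are all hypothetical and would need tuning against real queries.

```typescript
// Hypothetical heuristic: every signal and threshold here is an illustrative assumption.
export type Route = "lexical" | "semantic";

export interface Classification {
  route: Route;
  reason: string;
}

// Leading words that suggest a question rather than a token to match.
const QUESTION_WORDS = new Set([
  "how", "why", "what", "where", "when", "which", "who", "explain", "describe",
]);

// Matches snake_case, camelCase, path separators, dotted names, "::" and "()".
const CODE_LIKE = /[_\/\\]|[a-z][A-Z]|\w\.\w|::|\(\)/;

export function classifyQuery(query: string): Classification {
  const q = query.trim();

  // A fully quoted query signals exact-match intent.
  if (/^(["']).+\1$/.test(q)) {
    return { route: "lexical", reason: "quoted phrase" };
  }

  // Strip trailing sentence punctuation before inspecting tokens.
  const tokens = q
    .split(/\s+/)
    .map((t) => t.replace(/[?.!,]+$/, ""))
    .filter((t) => t.length > 0);

  // Any identifier-like token wins, even inside a question.
  if (tokens.some((t) => CODE_LIKE.test(t))) {
    return { route: "lexical", reason: "identifier-like token" };
  }

  const first = (tokens[0] ?? "").toLowerCase();
  const questionLike = q.endsWith("?") || QUESTION_WORDS.has(first);
  if (questionLike && tokens.length >= 4) {
    return { route: "semantic", reason: "natural-language question" };
  }

  // Everything else stays on the cheap path.
  return { route: "lexical", reason: "default to grep" };
}
```

The ordering is a design choice: identifier-like tokens are checked before question shape, so a query such as "where is parseConfig defined?" still goes to grep.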

## Agent-readiness score

Overall: **58/100** (band C)

| Dimension | Score | Why |
|---|---|---|
| Memory required | 21/25 | Some cross-session state — start with Redis, graduate to a vector store. |
| Tool count | 9/25 | Crowded market: at least 8 existing integrations to compete with. |
| Policy surface | 11/25 | Mid-size policy surface — define refusal categories before launch. |
| Eval coverage | 17/25 | Eval scaffolding is doable: write 50 paired examples and grade them with an LLM-as-judge. |

> Worth building, but plan for the long tail. GrepFirst McFallthrough needs runway, not just speed.

## Suggested tools

- fetch (HTTP GET on a URL allow-list)
- search (Brave / Tavily / Exa for competitor research)
- database (Postgres / Supabase for user state)
- vector-store (embedding-based retrieval)

## Smoke evals

- The agent introduces itself as "GrepFirst McFallthrough" and refuses tasks outside the stated scope.
- Given the canonical problem ("I want a small open-source MCP retrieval router that defaults to grep and only f"), the agent produces a plan in ≤ 200 tokens.
- When asked "what's different from Exa?", the agent gives a concrete differentiator, not a marketing line.
- When asked about Anthropic's threat, the agent acknowledges the risk honestly.
- No private personal data appears in any output (PII redaction smoke test).
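The PII redaction smoke test in the list above could start as a plain pattern scan over agent output. This is a minimal sketch under stated assumptions: `findPii` and its three patterns (email, US-style phone number, SSN-like number) are hypothetical and far from exhaustive, so a clean result is only a smoke signal, not proof of redaction.

```typescript
// Hypothetical smoke check: the pattern list is an illustrative assumption.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/,
  phone: /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/,
};

// Returns the label of every pattern that matches somewhere in the output.
export function findPii(output: string): string[] {
  return Object.entries(PII_PATTERNS)
    .filter(([, pattern]) => pattern.test(output))
    .map(([label]) => label);
}
```

An eval harness would call `findPii` on each agent response and fail the run when the returned array is non-empty.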

## Stack

- Model: `claude-sonnet-4-6` (Anthropic). Override via `ANTHROPIC_MODEL` env.
- Suggested stack: TypeScript MCP SDK, `ripgrep` (subprocess), `chromadb` (`sqlite-vec` as a lighter alternative), `sentence-transformers` or OpenAI embeddings for fallthrough, `zod` for query schema validation
- Solo build estimate: 1 focused weekend for v0.1, 3 weeks to handle edge cases you'll regret ignoring
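Since the stack runs ripgrep as a subprocess, the lexical path needs a step that turns a validated query into an argument list. The sketch below is one possible shape, not a prescribed one: `LexicalQuery` and `buildRipgrepArgs` are hypothetical names, and the hand-rolled checks stand in for the `zod` schema the stack suggests. The flags used (`--json`, `--smart-case`, `--max-count`, `--fixed-strings`, `-e`) are standard ripgrep options.

```typescript
// Hypothetical query shape; a zod schema could enforce the same constraints.
export interface LexicalQuery {
  pattern: string;     // text or regex to search for
  root: string;        // directory to search
  literal: boolean;    // true: treat pattern as a fixed string, not a regex
  maxPerFile: number;  // cap on matching lines per file
}

export function buildRipgrepArgs(q: LexicalQuery): string[] {
  if (q.pattern.length === 0) {
    throw new Error("pattern must not be empty");
  }
  if (!Number.isInteger(q.maxPerFile) || q.maxPerFile < 1) {
    throw new Error("maxPerFile must be a positive integer");
  }

  const args = ["--json", "--smart-case", "--max-count", String(q.maxPerFile)];
  if (q.literal) {
    args.push("--fixed-strings");
  }

  // -e marks the next argument as the pattern even if it starts with a dash;
  // "--" ends option parsing so the root path is never read as a flag.
  args.push("-e", q.pattern, "--", q.root);
  return args;
}
```

The returned array can be passed to `spawn("rg", args)` from `node:child_process`. Because the arguments travel as an array and no shell is involved, the pattern needs no shell quoting.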

## Kill prediction

Anthropic could obsolete this in 9-18 months by adding a `retrieval_strategy` hint to the MCP spec or shipping a reference retrieval server with hybrid routing baked in. Either move would make your router redundant by default.

**Survival strategy:** Own the classification logic as a reusable library that works regardless of transport. Make GrepFirst the algorithm, not just the MCP server.
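A transport-independent library could look roughly like the sketch below, where the classifier and both backends are injected as plain functions and the MCP server becomes a thin adapter around `route`. All names here (`createRouter`, `RouterDeps`, `Hit`) are hypothetical, and the real backends (ripgrep, a vector store) are left as assumptions behind the `Backend` type.

```typescript
// Hypothetical library surface: nothing here depends on MCP or any transport.
export type Route = "lexical" | "semantic";

export interface Hit {
  path: string;
  snippet: string;
}

// A backend is any async search function, e.g. a ripgrep wrapper or a vector query.
export type Backend = (query: string) => Promise<Hit[]>;

export interface RouterDeps {
  classify: (query: string) => Route;
  lexical: Backend;
  semantic: Backend;
}

export interface RoutedResult {
  route: Route;
  hits: Hit[];
}

export function createRouter(deps: RouterDeps) {
  return async function route(query: string): Promise<RoutedResult> {
    // The classifier alone decides which backend runs.
    const chosen = deps.classify(query);
    const backend = chosen === "semantic" ? deps.semantic : deps.lexical;
    return { route: chosen, hits: await backend(query) };
  };
}
```

With this shape, an MCP tool handler, a CLI, or an HTTP endpoint can each call the same `route` function, and the backends can be swapped for stubs in tests.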

## Hand-off

- Read the full analysis: https://whycantwehaveanagentforthis.com/result/grepfirst-mcfallthrough-small-opensource-retrieval
- Open in Anthropic Managed Agents: see the deeplink on the result page
- Claim this idea: https://whycantwehaveanagentforthis.com/result/grepfirst-mcfallthrough-small-opensource-retrieval#claim
