RepoVoice - MCP Server That Gives Claude Live Memory of Your Entire Codebase
Cursor's tab context is 30 files. Your monorepo has 3,000. RepoVoice is an MCP server that indexes your entire codebase into a vector store and exposes it as live RAG tools to Claude and Cursor — so your AI coding assistant finally knows what that function three layers deep actually does.
Difficulty
intermediate
Category
MCP & Integrations
Market Demand
Very High
Revenue Score
7/10
Platform
MCP Server
Vibe Code Friendly
No
Hackathon Score
🏆 8/10
What is it?
The single biggest complaint from Cursor and Claude Code users in April 2026 is that context window limits make AI assistants blind to code outside the current tab, causing hallucinated imports, duplicate utilities, and wrong architecture suggestions. RepoVoice solves this by running a local MCP server that watches your repo, chunks and embeds every file using OpenAI Embeddings or a local model, and exposes two MCP tools to Claude: search_codebase and get_file_context. Claude calls these tools mid-conversation to retrieve the exact file sections relevant to your question. No cloud upload, fully local, works with any repo size. Built with the MCP SDK, LangChain, and a local Chroma vector store — shippable as a CLI install in 2 weeks.
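The two tools described above would be declared to the MCP host with JSON Schemas. A minimal sketch of what those declarations might look like — the tool names come from the text, but the individual fields (query, top_k, path, start_line, end_line) are illustrative assumptions, not a fixed spec:

```python
# Hypothetical tool declarations for the MCP host. Only the tool names
# (search_codebase, get_file_context) are from the product spec; the
# schema fields are illustrative assumptions.
SEARCH_CODEBASE_SCHEMA = {
    "name": "search_codebase",
    "description": "Semantic search over the indexed repo; returns ranked chunks.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string", "description": "Natural-language question"},
            "top_k": {"type": "integer", "default": 5},
        },
        "required": ["query"],
    },
}

GET_FILE_CONTEXT_SCHEMA = {
    "name": "get_file_context",
    "description": "Return the source text for a file path and line range.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "path": {"type": "string"},
            "start_line": {"type": "integer", "minimum": 1},
            "end_line": {"type": "integer", "minimum": 1},
        },
        "required": ["path"],
    },
}
```

Claude calls whichever tool matches the question: semantic search first, then a targeted file read once it knows the path and line range.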
Why now?
The MCP protocol hit broad adoption in early 2026 with Claude Desktop and Claude Code both shipping native MCP support, making local MCP servers a first-class citizen of the AI coding workflow for the first time.
- ▸MCP server exposing search_codebase and get_file_context tools to Claude and Cursor.
- ▸Local Chroma vector store with incremental re-indexing via watchdog file watcher.
- ▸Semantic search over full codebase returning ranked file chunks with line numbers.
- ▸CLI setup in two commands: pip install repovoice, then repovoice start in any project root.
Target Audience
Cursor and Claude Code users working on codebases over 50 files — estimated 300,000+ active Cursor subscribers as of April 2026.
Example Use Case
A backend dev on a 2,000-file Django monorepo asks Claude 'where is user authentication handled?' and RepoVoice returns the exact 3 files with line references in under 2 seconds — no more grepping for 20 minutes.
User Stories
- ▸As a Cursor user on a large monorepo, I want Claude to search my entire codebase for relevant functions, so that I stop getting hallucinated imports from files Claude has never seen.
- ▸As a Claude Desktop user, I want a local MCP server that indexes my project without uploading code, so that I can use AI assistance on proprietary code safely.
- ▸As a developer switching between projects, I want RepoVoice to auto-index on startup per project root, so that I never manage index state manually.
Acceptance Criteria
- ▸Indexing: done when a 500-file repo indexes fully in under 3 minutes on first run.
- ▸Search Tool: done when search_codebase returns the top 5 relevant chunks with file path and line range.
- ▸File Watcher: done when a saved file re-indexes within 10 seconds without a full re-index.
- ▸Claude Integration: done when Claude Desktop lists RepoVoice tools and calls search_codebase in a live session.
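The File Watcher criterion implies change detection cheap enough to run on every save. A minimal sketch of picking out only the files whose content changed since the last index, using content hashes — function and variable names here are illustrative, not the product's actual module API:

```python
import hashlib
from pathlib import Path

def content_hash(path: Path) -> str:
    """Stable fingerprint of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def changed_files(root: Path, last_index: dict) -> list:
    """Return only the source files whose content differs from the stored
    hash, so a single save triggers a single-file re-index rather than a
    full rebuild of the vector store."""
    changed = []
    for path in sorted(root.rglob("*.py")):
        digest = content_hash(path)
        if last_index.get(str(path)) != digest:
            changed.append(path)
            last_index[str(path)] = digest
    return changed
```

In the real server, watchdog events would call this per-event rather than re-scanning the tree, but the hash-diff idea is the same.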
Is it worth building?
$49 one-time x 500 sales = $24,500 in month 2. $9/month x 300 subscribers = $2,700 MRR ongoing by month 4.
Unit Economics
CAC: $5 via organic X demo video. LTV: $49 one-time or $108 (12 months at $9/month). Payback: immediate. Gross margin: 90%.
Business Model
One-time CLI license at $49, or $9/month for cloud-synced multi-machine support.
Monetization Path
Free 7-day trial via pip install converts at 22% to one-time purchase when users experience first cross-file RAG answer.
Revenue Timeline
First dollar: week 2 via early access license. $1k revenue: month 1. $3k MRR: month 3. $8k MRR: month 6.
Estimated Monthly Cost
OpenAI Embeddings API: $15 (for dev and support use), Vercel (license server): $20, Stripe fees: ~$10. Total: ~$45/month at launch.
Profit Potential
Solid indie product at $3k–$8k MRR with low churn given workflow lock-in.
Scalability
High — can expand to GitHub-hosted remote repos, team shared indexes, and IDE plugins.
Success Metrics
Week 2: 200 pip installs. Month 1: 50 paid licenses. Month 3: 150 monthly subscribers.
Launch & Validation Plan
Post a demo video on X showing Claude answering a cross-file architecture question using RepoVoice, collect 100 likes before writing line one.
Customer Acquisition Strategy
First customer: post a 60-second demo video on X tagging Cursor and Anthropic accounts showing RepoVoice finding a function across 500 files, offer free license to first 20 DMs. Ongoing: r/cursor, r/ClaudeAI, Hacker News Show HN, pip install organic discovery.
What's the competition?
Competition Level
Low
Similar Products
Cursor's built-in codebase indexing covers only the IDE context. Codeium has repo search but no MCP tool exposure. Greptile offers cloud repo search but requires uploading code — RepoVoice is local-first and MCP-native.
Competitive Advantage
Fully local — no cloud upload, no privacy risk. Works with any LLM host that supports MCP. Cursor's built-in indexing only works inside Cursor; RepoVoice works with Claude.ai desktop, Claude Code, and any MCP-compatible host.
Regulatory Risks
Low regulatory risk — all processing is local by default, no code uploaded to third-party servers unless user opts into OpenAI Embeddings API.
What's the roadmap?
Feature Roadmap
V1 (launch): local Chroma index, MCP tools, CLI, OpenAI embeddings, Stripe license. V2 (month 2-3): local embedding model option (no OpenAI), multi-project profiles, index export. V3 (month 4+): remote GitHub repo indexing, team shared server, VS Code extension.
Milestone Plan
Phase 1 (Week 1-2): MCP server with two tools, Chroma indexing, file watcher, CLI live. Phase 2 (Week 3): Stripe license gate, 7-day trial, PyPI published, demo video posted. Phase 3 (Month 2): 50 paid licenses, local embedding model option, HN Show HN launch.
How do you build it?
Tech Stack
MCP SDK (Python), LangChain, Chroma DB (local vector store), OpenAI Embeddings API, watchdog for file watching, Click for CLI — build with Cursor for MCP server logic, no UI needed.
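The CLI surface is small: two subcommands. A sketch of that surface — the spec names Click, but stdlib argparse is used here only to keep the example dependency-free; the subcommand names mirror the planned repovoice start and repovoice index:

```python
# CLI surface sketch. The product spec uses Click; argparse stands in
# here so the example is self-contained. Flags are illustrative.
import argparse

def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="repovoice")
    sub = parser.add_subparsers(dest="command", required=True)

    start = sub.add_parser("start", help="Index the repo and serve MCP tools")
    start.add_argument("--root", default=".", help="Project root to index")

    index = sub.add_parser("index", help="(Re)build the local vector index")
    index.add_argument("--root", default=".", help="Project root to index")
    return parser
```

With Click the same shape is a group with two commands; either way the whole tool stays a single entry point.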
Suggested Frameworks
MCP SDK Python, LangChain, Chroma DB
Time to Ship
2 weeks
Required Skills
MCP SDK, LangChain RAG pipeline, Python CLI, Chroma vector store.
Resources
Anthropic MCP SDK docs, LangChain Chroma integration guide, OpenAI Embeddings API docs, Click CLI docs.
MVP Scope
server.py (MCP server main), tools/search_codebase.py, tools/get_file_context.py, indexer/chroma_indexer.py, indexer/file_watcher.py, embeddings/openai_embedder.py, cli/main.py (Click CLI), config/settings.py, tests/test_tools.py, README.md.
Core User Journey
pip install repovoice -> repovoice start in project root -> open Claude Desktop -> ask cross-file question -> get answer with file references in under 3 seconds.
Architecture Pattern
CLI start command -> watchdog indexes repo files -> LangChain chunks files -> OpenAI Embeddings API embeds chunks -> Chroma stores vectors locally -> MCP server exposes tools -> Claude calls search_codebase tool -> Chroma returns top-k chunks -> Claude uses context in response.
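The chunk → embed → store → top-k shape of that pipeline can be sketched end to end with the stdlib. This is a toy: a bag-of-words counter stands in for the OpenAI Embeddings API and a plain list stands in for Chroma, so only the data flow is realistic, not the retrieval quality:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Toy bag-of-words 'embedding' standing in for the OpenAI
    Embeddings API; only the pipeline shape matters here."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def search(query: str, store: list, top_k: int = 5) -> list:
    """Rank stored chunks by similarity to the query, as search_codebase
    would against Chroma; each chunk carries path and line range so the
    answer can cite exact locations."""
    q = embed(query)
    ranked = sorted(store, key=lambda c: cosine(q, c["vector"]), reverse=True)
    return ranked[:top_k]

store = [
    {"path": "auth/views.py", "lines": (10, 42),
     "vector": embed("def login user authentication session check password")},
    {"path": "billing/stripe.py", "lines": (1, 30),
     "vector": embed("charge invoice stripe payment webhook")},
]
```

Swapping the toy pieces for real embeddings and Chroma's query API changes the quality, not the shape: the MCP tool handler still takes a query string and returns ranked (path, line-range, text) tuples.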
Data Model
Project has one ChromaCollection. ChromaCollection has many EmbeddedChunks. EmbeddedChunk has one FilePath, one LineRange, and one EmbeddingVector.
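That one-collection-per-project, many-chunks-per-collection model maps directly onto two small record types. A sketch with dataclasses — field names are illustrative; Chroma itself stores these as collection metadata rather than Python objects:

```python
from dataclasses import dataclass, field

@dataclass
class EmbeddedChunk:
    # One indexed slice of a source file: path, inclusive 1-based line
    # range, and its embedding vector (field names are illustrative).
    file_path: str
    line_start: int
    line_end: int
    vector: list = field(default_factory=list)

@dataclass
class ChromaCollection:
    # A project maps to exactly one collection holding many chunks.
    project: str
    chunks: list = field(default_factory=list)
```

Keeping the line range on the chunk is what lets search results cite exact locations instead of whole files.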
Integration Points
MCP SDK for Claude and Cursor tool exposure, LangChain for chunking and retrieval, Chroma DB for local vector storage, OpenAI Embeddings API for embedding, Stripe for license payments, watchdog for file change detection.
V1 Scope Boundaries
V1 excludes: remote GitHub repo indexing, a team shared index server, IDE plugins, special handling for non-Python repos, and multi-project switching.
Success Definition
A developer installs RepoVoice via pip, runs it against their repo, asks Claude a cross-file question, gets a correct answer with line references, and pays for a license within 7 days without any founder contact.
Challenges
OpenAI Embeddings API cost scales with repo size — 1M token repo costs ~$0.10 to index but ongoing re-index on every save adds up. The real distribution challenge is convincing Cursor users to install a local server when Cursor's own codebase indexing feature exists and is improving fast.
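The "adds up" claim is easy to quantify. A back-of-envelope helper using the ~$0.10-per-1M-token rate stated above (that rate comes from this paragraph, not from current OpenAI pricing; the average-file-size default is an assumption):

```python
def monthly_reindex_cost(repo_tokens: int, saves_per_day: int,
                         avg_file_tokens: int = 2_000,
                         usd_per_million_tokens: float = 0.10) -> float:
    """Back-of-envelope embedding spend: one full index plus one
    single-file re-embed per save over a 30-day month. The rate default
    mirrors the ~$0.10/1M-token estimate in the text."""
    initial = repo_tokens / 1_000_000 * usd_per_million_tokens
    incremental = (saves_per_day * 30 * avg_file_tokens
                   / 1_000_000 * usd_per_million_tokens)
    return round(initial + incremental, 4)
```

Even at 200 saves a day the incremental cost stays around a dollar a month per user, which is why the bigger risk is distribution, not unit cost.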
Avoid These Pitfalls
Do not try to stream large repos synchronously — implement async incremental indexing or users will wait 10 minutes on first run and uninstall. Do not compete with Cursor's built-in feature on their own turf — market to Claude Desktop and Claude Code users instead. First 50 installs come from the demo video, not organic pip discovery.
Security Requirements
No code content transmitted except to OpenAI Embeddings API if user opts in. Stripe license key validated locally with hash check. No auth server needed for local-only mode. GDPR: all data stays on device by default.
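"Validated locally with hash check" could be implemented with an HMAC signature over the purchaser's email, verified against a secret embedded at build time. The exact scheme below is an assumption — the source only states that validation is local:

```python
import hashlib
import hmac

# Illustrative scheme: keys have the form "email:signature" where
# signature = HMAC-SHA256(email, build-time secret). The scheme and the
# secret below are assumptions, not the shipped design.
_EMBEDDED_SECRET = b"replace-with-build-time-secret"

def sign(email: str) -> str:
    return hmac.new(_EMBEDDED_SECRET, email.encode(), hashlib.sha256).hexdigest()

def license_valid(key: str) -> bool:
    """Verify a license key entirely offline; compare_digest avoids
    timing side channels on the signature check."""
    try:
        email, signature = key.rsplit(":", 1)
    except ValueError:
        return False
    return hmac.compare_digest(sign(email), signature)
```

Anything embedded in a shipped binary can eventually be extracted, so this deters casual sharing rather than determined cracking — an acceptable trade-off for a $49 indie tool.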
Infrastructure Plan
PyPI for distribution. Stripe for license validation via lightweight Vercel edge function. No database needed for local-only V1. Sentry SDK embedded in CLI for opt-in error reporting. Total: ~$45/month.
Performance Targets
500-file repo indexed in under 3 minutes. Search tool response under 500ms. File watcher re-index under 10 seconds per changed file. Supports repos up to 10,000 files.
Go-Live Checklist
- ☐Local-only privacy audit complete
- ☐Stripe license flow tested
- ☐PyPI package published and installable
- ☐README with one-command setup published
- ☐Privacy policy for opt-in telemetry published
- ☐5 beta users tested on real repos
- ☐Rollback: prior PyPI version available
- ☐Demo video posted on X
- ☐HN Show HN post drafted
How to build it, step by step
1. Install the MCP SDK with pip install mcp and scaffold server.py with two tool definitions.
2. Build chroma_indexer.py using LangChain's RecursiveCharacterTextSplitter and Chroma.
3. Add openai_embedder.py calling the OpenAI Embeddings API for each chunk.
4. Implement file_watcher.py with watchdog to trigger an incremental re-index on file save.
5. Build the search_codebase tool that runs semantic similarity search against Chroma and returns the top 5 chunks.
6. Build the get_file_context tool that retrieves a full file section by path and line range.
7. Wire the tools into the MCP server with proper input schemas, using Cursor.
8. Build the CLI with Click, exposing repovoice start and repovoice index commands.
9. Add a Stripe license check on startup with a 7-day trial grace period.
10. Package and publish to PyPI, then post the demo video on X.
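Step 6 above is the simplest tool of the two and can be sketched in full with the stdlib. A minimal get_file_context that returns a 1-based, inclusive line slice with line numbers prefixed — the output format is an assumption about what Claude would find most useful:

```python
from pathlib import Path

def get_file_context(path: str, start_line: int = 1, end_line=None) -> str:
    """Sketch of the get_file_context tool body: return the requested
    slice of a source file with 1-based inclusive line numbers prefixed,
    so the model can quote exact locations back to the user."""
    lines = Path(path).read_text().splitlines()
    end = end_line if end_line is not None else len(lines)
    window = lines[start_line - 1:end]
    return "\n".join(f"{start_line + i}| {line}"
                     for i, line in enumerate(window))
```

Registered behind the MCP input schema, this is the whole tool: the server adds only argument validation and a guard against paths outside the project root.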
Generated
April 7, 2026
Model
claude-sonnet-4-6