CodingIdeas.ai

RepoVoice - MCP Server That Gives Claude Live Memory of Your Entire Codebase

Cursor's tab context is 30 files. Your monorepo has 3,000. RepoVoice is an MCP server that indexes your entire codebase into a vector store and exposes it as live RAG tools to Claude and Cursor — so your AI coding assistant finally knows what that function three layers deep actually does.

Difficulty

intermediate

Category

MCP & Integrations

Market Demand

Very High

Revenue Score

7/10

Platform

MCP Server

Vibe Code Friendly

No

Hackathon Score

🏆 8/10

What is it?

The single biggest complaint from Cursor and Claude Code users in April 2026 is that context window limits make AI assistants blind to code outside the current tab, causing hallucinated imports, duplicate utilities, and wrong architecture suggestions. RepoVoice solves this by running a local MCP server that watches your repo, chunks and embeds every file using OpenAI Embeddings or a local model, and exposes two MCP tools to Claude: search_codebase and get_file_context. Claude calls these tools mid-conversation to retrieve the exact file sections relevant to your question. No cloud upload, fully local, works with any repo size. Built with the MCP SDK, LangChain, and a local Chroma vector store — shippable as a CLI install in 2 weeks.
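The get_file_context tool described above is mostly file I/O. A minimal sketch of its core logic, with the MCP SDK registration omitted (the returned dict shape is an assumption, not the actual tool schema):

```python
from pathlib import Path

def get_file_context(path: str, start_line: int, end_line: int) -> dict:
    """Return the requested line range of a file, 1-indexed and inclusive.

    In the real server this function would be registered as an MCP tool;
    it is shown standalone here to illustrate the contract.
    """
    lines = Path(path).read_text(encoding="utf-8").splitlines()
    # Clamp the range so out-of-bounds requests degrade gracefully.
    start = max(1, start_line)
    end = min(len(lines), end_line)
    return {
        "path": path,
        "start_line": start,
        "end_line": end,
        "content": "\n".join(lines[start - 1:end]),
    }
```

The search_codebase tool would return a list of these dicts ranked by similarity, which is what lets Claude quote exact line references back to the user.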

Why now?

The MCP protocol hit broad adoption in early 2026 with Claude Desktop and Claude Code both shipping native MCP support, making local MCP servers a first-class citizen of the AI coding workflow for the first time.

  • MCP server exposing search_codebase and get_file_context tools to Claude and Cursor.
  • Local Chroma vector store with incremental re-indexing via watchdog file watcher.
  • Semantic search over full codebase returning ranked file chunks with line numbers.
  • CLI setup in one command: pip install repovoice then repovoice start in any project root.
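The chunking behind the semantic-search bullet would come from LangChain's RecursiveCharacterTextSplitter in practice. A simplified stdlib-only sketch of the idea: fixed-size line windows with overlap, each carrying the line range that the search results report back:

```python
def chunk_file(text: str, window: int = 40, overlap: int = 10) -> list[dict]:
    """Split a file into overlapping line windows, keeping line ranges.

    A simplified stand-in for LangChain's splitter: real chunking would
    respect token budgets and code structure, not raw line counts.
    """
    lines = text.splitlines()
    chunks = []
    step = window - overlap
    for start in range(0, max(len(lines), 1), step):
        end = min(start + window, len(lines))
        chunks.append({
            "start_line": start + 1,   # 1-indexed for display
            "end_line": end,
            "text": "\n".join(lines[start:end]),
        })
        if end == len(lines):
            break
    return chunks
```

The overlap matters: a function whose body straddles a chunk boundary still appears whole in at least one chunk, so the retrieval step does not silently drop it.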

Target Audience

Cursor and Claude Code users working on codebases over 50 files — estimated 300,000+ active Cursor subscribers as of April 2026.

Example Use Case

A backend dev on a 2,000-file Django monorepo asks Claude 'where is user authentication handled?' and RepoVoice returns the exact 3 files with line references in under 2 seconds — no more grepping for 20 minutes.

User Stories

  • As a Cursor user on a large monorepo, I want Claude to search my entire codebase for relevant functions, so that I stop getting hallucinated imports from files Claude has never seen.
  • As a Claude Desktop user, I want a local MCP server that indexes my project without uploading code, so that I can use AI assistance on proprietary code safely.
  • As a developer switching between projects, I want RepoVoice to auto-index on startup per project root, so that I never manage index state manually.

Acceptance Criteria

  • Indexing: done when a 500-file repo indexes fully in under 3 minutes on first run.
  • Search Tool: done when search_codebase returns top 5 relevant chunks with file path and line range.
  • File Watcher: done when a saved file re-indexes within 10 seconds without a full re-index.
  • Claude Integration: done when Claude Desktop lists RepoVoice tools and calls search_codebase in a live session.

Is it worth building?

$49 one-time x 500 sales = $24,500 in month 2. $9/month x 300 subscribers = $2,700 MRR ongoing by month 4.

Unit Economics

CAC: $5 via organic X demo video. LTV: $49 one-time or $108 (12 months at $9/month). Payback: immediate. Gross margin: 90%.

Business Model

One-time CLI license at $49, or $9/month for cloud-synced multi-machine support.

Monetization Path

Free 7-day trial via pip install, projected to convert at 22% to a one-time purchase once users see their first cross-file RAG answer.

Revenue Timeline

First dollar: week 2 via early access license. $1k revenue: month 1. $3k MRR: month 3. $8k MRR: month 6.

Estimated Monthly Cost

OpenAI Embeddings API: $15 (for dev and support use), Vercel (license server): $20, Stripe fees: ~$10. Total: ~$45/month at launch.

Profit Potential

Solid indie product at $3k–$8k MRR with low churn given workflow lock-in.

Scalability

High — can expand to GitHub-hosted remote repos, team shared indexes, and IDE plugins.

Success Metrics

Week 2: 200 pip installs. Month 1: 50 paid licenses. Month 3: 150 monthly subscribers.

Launch & Validation Plan

Post a demo video on X showing Claude answering a cross-file architecture question using RepoVoice, collect 100 likes before writing line one.

Customer Acquisition Strategy

First customer: post a 60-second demo video on X tagging Cursor and Anthropic accounts showing RepoVoice finding a function across 500 files, offer free license to first 20 DMs. Ongoing: r/cursor, r/ClaudeAI, Hacker News Show HN, pip install organic discovery.

What's the competition?

Competition Level

Low

Similar Products

Cursor's built-in codebase indexing covers only the IDE context. Codeium has repo search but no MCP tool exposure. Greptile offers cloud repo search but requires uploading code — RepoVoice is local-first and MCP-native.

Competitive Advantage

Fully local — no cloud upload, no privacy risk. Works with any LLM host that supports MCP. Cursor's built-in indexing only works inside Cursor; RepoVoice works with Claude Desktop, Claude Code, and any MCP-compatible host.

Regulatory Risks

Low regulatory risk — all processing is local by default, no code uploaded to third-party servers unless user opts into OpenAI Embeddings API.

What's the roadmap?

Feature Roadmap

  • V1 (launch): local Chroma index, MCP tools, CLI, OpenAI embeddings, Stripe license.
  • V2 (month 2-3): local embedding model option (no OpenAI), multi-project profiles, index export.
  • V3 (month 4+): remote GitHub repo indexing, team shared server, VS Code extension.

Milestone Plan

  • Phase 1 (Week 1-2): MCP server with two tools, Chroma indexing, file watcher, CLI live.
  • Phase 2 (Week 3): Stripe license gate, 7-day trial, PyPI published, demo video posted.
  • Phase 3 (Month 2): 50 paid licenses, local embedding model option, HN Show HN launch.

How do you build it?

Tech Stack

MCP SDK (Python), LangChain, Chroma DB (local vector store), OpenAI Embeddings API, watchdog for file watching, Click for CLI — build with Cursor for MCP server logic, no UI needed.

Suggested Frameworks

MCP SDK Python, LangChain, Chroma DB

Time to Ship

2 weeks

Required Skills

MCP SDK, LangChain RAG pipeline, Python CLI, Chroma vector store.

Resources

Anthropic MCP SDK docs, LangChain Chroma integration guide, OpenAI Embeddings API docs, Click CLI docs.

MVP Scope

server.py (MCP server main), tools/search_codebase.py, tools/get_file_context.py, indexer/chroma_indexer.py, indexer/file_watcher.py, embeddings/openai_embedder.py, cli/main.py (Click CLI), config/settings.py, tests/test_tools.py, README.md.

Core User Journey

pip install repovoice -> repovoice start in project root -> open Claude Desktop -> ask cross-file question -> get answer with file references in under 3 seconds.

Architecture Pattern

CLI start command -> watchdog indexes repo files -> LangChain chunks files -> OpenAI Embeddings API embeds chunks -> Chroma stores vectors locally -> MCP server exposes tools -> Claude calls search_codebase tool -> Chroma returns top-k chunks -> Claude uses context in response.
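Within that pipeline, the "Chroma returns top-k chunks" step is a cosine-similarity ranking. Chroma performs it with an ANN index; a plain-Python sketch makes the retrieval step concrete (the vectors here are toy values, not real embeddings):

```python
import math

def top_k(query_vec: list[float], chunks: list[dict], k: int = 5) -> list[dict]:
    """Rank stored chunks by cosine similarity to the query embedding.

    Chroma does this with an approximate-nearest-neighbor index in the
    real server; shown in plain Python to illustrate the retrieval step.
    """
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb) if na and nb else 0.0

    return sorted(chunks, key=lambda c: cosine(query_vec, c["vector"]),
                  reverse=True)[:k]
```

The brute-force version is O(n) per query, which is why a vector store with an index matters once the repo crosses a few thousand chunks.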

Data Model

Project has one ChromaCollection. ChromaCollection has many EmbeddedChunks. EmbeddedChunk has one FilePath, one LineRange, and one EmbeddingVector.
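A direct rendering of that model in Python dataclasses (field names are illustrative; Chroma itself stores chunk metadata as flat dicts alongside the vectors):

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class EmbeddedChunk:
    file_path: str          # FilePath
    start_line: int         # LineRange start (1-indexed)
    end_line: int           # LineRange end (inclusive)
    vector: list[float]     # EmbeddingVector

@dataclass
class ChromaCollection:
    name: str
    chunks: list[EmbeddedChunk] = field(default_factory=list)

@dataclass
class Project:
    root: str
    collection: Optional[ChromaCollection] = None  # one collection per project
```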

Integration Points

MCP SDK for Claude and Cursor tool exposure, LangChain for chunking and retrieval, Chroma DB for local vector storage, OpenAI Embeddings API for embedding, Stripe for license payments, watchdog for file change detection.

V1 Scope Boundaries

V1 excludes: remote GitHub repo indexing, team shared index server, IDE plugin, non-Python repos special handling, multi-project switching.

Success Definition

A developer installs RepoVoice via pip, runs it against their repo, asks Claude a cross-file question, gets a correct answer with line references, and pays for a license within 7 days without any founder contact.

Challenges

OpenAI Embeddings API cost scales with repo size — 1M token repo costs ~$0.10 to index but ongoing re-index on every save adds up. The real distribution challenge is convincing Cursor users to install a local server when Cursor's own codebase indexing feature exists and is improving fast.

Avoid These Pitfalls

Do not index large repos synchronously — implement async incremental indexing, or users will wait 10 minutes on first run and uninstall. Do not compete with Cursor's built-in feature on their own turf — market to Claude Desktop and Claude Code users instead. The first 50 installs come from the demo video, not organic pip discovery.
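One way to implement the incremental path is a debounced pending queue, so a burst of saves triggers a single re-index per file. A sketch under that assumption (the injected reindex callable is hypothetical; the real watcher would feed watchdog events into on_modified):

```python
import time

class IncrementalIndexer:
    """Debounce file-save events so rapid saves trigger one re-index.

    Sketch of the incremental path only: `reindex` would re-chunk and
    re-embed a single file in the real server.
    """
    def __init__(self, reindex, debounce_s: float = 2.0):
        self.reindex = reindex
        self.debounce_s = debounce_s
        self.pending = {}  # path -> time of last save event

    def on_modified(self, path: str, now: float = None):
        # A re-save resets the clock for that file.
        self.pending[path] = time.monotonic() if now is None else now

    def flush(self, now: float = None):
        """Re-index files whose last save is older than the debounce window."""
        now = time.monotonic() if now is None else now
        ready = [p for p, t in self.pending.items() if now - t >= self.debounce_s]
        for path in ready:
            del self.pending[path]
            self.reindex(path)
        return ready
```

Calling flush from a background loop keeps the first-run full index and the per-save incremental updates on the same code path, which is what makes the 10-second re-index target achievable.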

Security Requirements

No code content transmitted except to OpenAI Embeddings API if user opts in. Stripe license key validated locally with hash check. No auth server needed for local-only mode. GDPR: all data stays on device by default.
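The local hash check could be an HMAC-signed key: the license server signs a payload at purchase time, and the CLI recomputes the signature offline, so no auth server is needed after issuance. This scheme and every name in it are assumptions, not a spec — and a secret shipped in the binary is extractable, which is an accepted trade-off for a $49 indie tool:

```python
import hashlib
import hmac

SIGNING_SECRET = b"replace-at-build-time"  # hypothetical build-time secret

def issue_key(email: str) -> str:
    """Server side: sign the payload so the CLI can verify offline."""
    sig = hmac.new(SIGNING_SECRET, email.encode(), hashlib.sha256).hexdigest()[:16]
    return f"{email}.{sig}"

def validate_key(key: str) -> bool:
    """CLI side: recompute the HMAC locally, no network round-trip."""
    payload, _, sig = key.rpartition(".")
    if not payload:
        return False
    expected = hmac.new(SIGNING_SECRET, payload.encode(),
                        hashlib.sha256).hexdigest()[:16]
    return hmac.compare_digest(sig, expected)
```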

Infrastructure Plan

PyPI for distribution. Stripe for license validation via lightweight Vercel edge function. No database needed for local-only V1. Sentry SDK embedded in CLI for opt-in error reporting. Total: ~$45/month.

Performance Targets

500-file repo indexed in under 3 minutes. Search tool response under 500ms. File watcher re-index under 10 seconds per changed file. Supports repos up to 10,000 files.

Go-Live Checklist

  • Local-only privacy audit complete
  • Stripe license flow tested
  • PyPI package published and installable
  • README with one-command setup published
  • Privacy policy for opt-in telemetry published
  • 5 beta users tested on real repos
  • Rollback: prior PyPI version available
  • Demo video posted on X
  • HN Show HN post drafted

How to build it, step by step

1. Install the MCP SDK with pip install mcp and scaffold server.py with two tool definitions.
2. Build chroma_indexer.py using LangChain RecursiveCharacterTextSplitter and Chroma.
3. Add openai_embedder.py calling the OpenAI Embeddings API for each chunk.
4. Implement file_watcher.py with watchdog to trigger an incremental re-index on file save.
5. Build the search_codebase tool that runs a semantic similarity search against Chroma and returns the top 5 chunks.
6. Build the get_file_context tool that retrieves a full file section by path and line range.
7. Wire the tools into the MCP server with proper input schemas, using Cursor.
8. Build the CLI with Click, exposing repovoice start and repovoice index commands.
9. Add a Stripe license check on startup with a 7-day trial grace period.
10. Package for PyPI and publish; post the demo video on X.
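Step 8 names Click; a dependency-free argparse sketch of the same command surface shows the shape (handlers are injected for testability, and all names are hypothetical):

```python
import argparse

def build_parser(start_fn, index_fn) -> argparse.ArgumentParser:
    """CLI skeleton for `repovoice start` and `repovoice index`.

    The planned stack uses Click; argparse is used here so the sketch
    runs with the stdlib alone.
    """
    parser = argparse.ArgumentParser(prog="repovoice")
    sub = parser.add_subparsers(dest="command", required=True)

    start = sub.add_parser("start", help="index the repo and serve MCP tools")
    start.add_argument("--root", default=".", help="project root to index")
    start.set_defaults(func=start_fn)

    index = sub.add_parser("index", help="(re)build the index without serving")
    index.add_argument("--root", default=".")
    index.set_defaults(func=index_fn)
    return parser

def main(argv, start_fn, index_fn):
    args = build_parser(start_fn, index_fn).parse_args(argv)
    return args.func(args.root)
```

Injecting the handlers keeps the argument parsing testable without spinning up the indexer or the MCP server.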

Generated

April 7, 2026

Model

claude-sonnet-4-6
