DeckSpeak — Turn Any Pitch Deck Into a Narrated Video Walkthrough in 4 Minutes

Q: Who can build DeckSpeak — Turn Any Pitch Deck Into a Narrated Video Walkthrough in 4 Minutes?

This is an intermediate-level project. It targets early-stage founders, accelerator applicants, and sales teams sending decks to prospects: roughly 500k founders actively pitching in the US at any given time.

Q: How does DeckSpeak — Turn Any Pitch Deck Into a Narrated Video Walkthrough in 4 Minutes make money?

Credit-based + subscription. 1 free deck then $29/month for 10 decks/month or $9 per single deck credit.

Founders send pitch decks as static PDFs and investors ghost them — not because the idea is bad, but because nobody wants to read 14 slides at midnight. DeckSpeak converts a pitch deck PDF into a narrated MP4 where each slide gets a timed AI voiceover, so your deck sells itself without a Zoom call.


Difficulty

intermediate

What is it?

Warm intros lead to Zoom calls, but cold PDF decks lead to silence. Investors and accelerator managers receive 50+ decks per week and skim or skip most of them. A narrated video walkthrough with a confident AI voice takes 30 seconds per slide to produce and is 10x more likely to be watched to completion than a static PDF. DeckSpeak extracts slide images from a PDF, generates a contextually aware narration script via Claude for each slide, synthesizes audio via ElevenLabs, and stitches the final MP4 using ffmpeg on the server. Founders get a shareable link and a downloadable MP4 in under 4 minutes. No design skills, no recording setup, no editor required. The May 2026 vibe-coding wave is full of founders pitching AI tools — this is the tool they use to pitch those tools.

Why now?

ElevenLabs v3 voices launched in early 2026 with studio-quality output at $0.003 per character — making per-deck unit economics profitable at $9 per single run for the first time.
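
The per-deck claim above can be checked with a few lines of arithmetic. The sketch below takes the quoted $0.003 per character and the $9 single-deck price as given (neither is verified here) and treats narration length per slide as a free parameter, since the text does not state one. Function and constant names are illustrative.

```typescript
// Assumption: prices come from the text above and are not verified.
// Amounts are kept in tenths of a cent ("mills") so the arithmetic stays in integers.
const PRICE_PER_CHAR_MILLS = 3;        // $0.003 per character
const SINGLE_DECK_PRICE_MILLS = 9000;  // $9 single-deck credit

/** TTS cost in mills for one deck, given the narration length per slide. */
function ttsCostMills(slideCount: number, charsPerSlide: number): number {
  return slideCount * charsPerSlide * PRICE_PER_CHAR_MILLS;
}

/** Longest per-slide narration whose TTS cost still fits inside `budgetMills`. */
function maxCharsPerSlide(slideCount: number, budgetMills: number): number {
  return Math.floor(budgetMills / (slideCount * PRICE_PER_CHAR_MILLS));
}

/** Render an amount in mills as a dollar string with two decimals. */
function formatDollars(mills: number): string {
  return `$${(mills / 1000).toFixed(2)}`;
}

const costFor12Slides = formatDollars(ttsCostMills(12, 100)); // "$3.60"
const breakEvenChars = maxCharsPerSlide(12, SINGLE_DECK_PRICE_MILLS); // 250
```

At the quoted price, a 12-slide deck with 100 characters of narration per slide costs $3.60 in TTS alone, and the $9 credit is fully consumed at 250 characters per slide, so narration length is the main lever on margin.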

▸PDF upload with per-slide image extraction using pdf2pic and poppler
▸Claude generates contextually aware narration script per slide using slide visual content and sequence awareness
▸ElevenLabs synthesizes natural-sounding audio per slide narration in under 60 seconds
▸ffmpeg stitches slide images and audio into a clean MP4 with slide transitions — shareable link returned

Target Audience

Early-stage founders, accelerator applicants, and sales teams sending decks to prospects — roughly 500k founders actively pitching in the US at any given time.

Example Use Case

A YC applicant uploads their 12-slide deck, gets a narrated MP4 in 3 minutes, embeds it in their application email, and receives a reply from a partner within 24 hours, whereas every static PDF they had sent before got no response.

User Stories

▸As a founder applying to accelerators, I want my pitch deck automatically narrated and converted to video, so that partners engage with my deck instead of skimming or skipping it.
▸As a sales rep, I want to turn my product deck into a shareable narrated video, so that prospects understand the value prop without a live call.
▸As a startup founder, I want a shareable video link I can embed in a cold email, so that my pitch is watched instead of ignored.

Done When

✓Upload: done when a PDF up to 20 slides uploads successfully and the UI shows a processing status indicator within 5 seconds.
✓Processing: done when the shareable MP4 link appears in the UI within 4 minutes of upload for a 12-slide deck.
✓Video quality: done when the public share page plays a video where each slide is visible for at least 3 seconds with synchronized narration audio and no audio gaps.
✓Billing: done when a user who has already used their free deck sees a paywall on the second upload, and a successful Stripe checkout unlocks the second conversion.
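
The billing criterion can be expressed as a small gate that runs before a conversion starts. This is a minimal sketch assuming the pricing stated earlier (1 free deck, $29/month for 10 decks, $9 single credits). The `Account` shape and all field names are hypothetical, not a real schema.

```typescript
// Hypothetical account shape; field names are illustrative only.
interface Account {
  decksConverted: number;   // lifetime conversions, including the free one
  subscribed: boolean;      // on the $29/month plan
  decksThisPeriod: number;  // conversions in the current billing month
  singleCredits: number;    // unused $9 single-deck credits
}

const FREE_DECKS = 1;
const SUBSCRIPTION_DECKS_PER_MONTH = 10;

type Gate = "free" | "subscription" | "credit" | "paywall";

/** Decide how the next conversion is paid for, or whether to show the paywall. */
function gateFor(account: Account): Gate {
  if (account.decksConverted < FREE_DECKS) return "free";
  if (account.subscribed && account.decksThisPeriod < SUBSCRIPTION_DECKS_PER_MONTH) {
    return "subscription";
  }
  if (account.singleCredits > 0) return "credit";
  return "paywall";
}
```

The order of checks is a design choice: subscription quota is spent before single credits so that purchased credits are not consumed while a monthly allowance remains.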

Is it worth building?

$29/month x 200 founders = $5,800 MRR. Math: YC Startup School has 50k+ active founders, and a 0.4% conversion from a targeted ProductHunt launch yields 200 paying users.
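
The funnel arithmetic above, written out as a sketch so the inputs can be varied. The audience size and conversion rate are the figures from the text, treated here as assumptions rather than measured values.

```typescript
/** Paying users from an audience and a conversion rate given in percent. */
function payingUsers(audience: number, conversionPercent: number): number {
  return Math.round((audience * conversionPercent) / 100);
}

/** Monthly recurring revenue in whole dollars. */
function monthlyRecurringRevenue(users: number, pricePerMonth: number): number {
  return users * pricePerMonth;
}

const users = payingUsers(50_000, 0.4);               // 200
const mrr = monthlyRecurringRevenue(users, 29);       // 5800
```

Halving the conversion rate to 0.2% gives 100 users and $2,900 MRR, so the projection scales linearly with the conversion assumption.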

Unit Economics

CAC: $12 via ProductHunt and Twitter demo. LTV: $348 (12 months at $29/month). Payback: 0.5 months. Gross margin: 76%.
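
A short sketch of how these figures relate. The LTV follows directly from 12 months at $29. The payback formula is an assumption: dividing CAC by gross-margin-adjusted monthly revenue is one reading that reproduces the stated 0.5 months, whereas dividing by raw revenue would give about 0.4.

```typescript
// Figures from the text above; the payback formula is an assumption.
const PRICE_PER_MONTH = 29;
const CAC = 12;
const GROSS_MARGIN = 0.76;
const LIFETIME_MONTHS = 12;

/** Lifetime value as plain revenue over the assumed lifetime (not margin-adjusted). */
function lifetimeValue(pricePerMonth: number, months: number): number {
  return pricePerMonth * months;
}

/** Months of gross profit needed to recover the acquisition cost. */
function paybackMonths(cac: number, pricePerMonth: number, grossMargin: number): number {
  return cac / (pricePerMonth * grossMargin);
}

const ltv = lifetimeValue(PRICE_PER_MONTH, LIFETIME_MONTHS); // 348
const payback = paybackMonths(CAC, PRICE_PER_MONTH, GROSS_MARGIN);
const paybackRounded = Math.round(payback * 10) / 10;         // 0.5
```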

Business Model

Credit-based + subscription

Monetization Path

1 free deck then $29/month for 10 decks/month or $9 per single deck credit.

Revenue Timeline

First dollar: week 2 via credit purchase after demo post. $1k MRR: month 2. $5k MRR: month 5.

Estimated Monthly Cost

Claude API: $25, ElevenLabs: $22, Cloudflare R2: $15, Vercel: $20, Supabase: $25, ffmpeg server (Railway): $20. Total: ~$127/month.

Profit Potential

Full-time viable at $8k-$15k MRR.

Scalability

High — add custom voice cloning so founders narrate in their own voice, team plans for sales decks, and LinkedIn native video export.

Success Metrics

Week 1: 50 free decks processed. Week 3: 15 paid conversions. Month 2: 100 subscribers at $29/month = $2,900 MRR.

Launch & Validation Plan

Process 5 real founder decks for free, share results on Twitter and in YC Startup School Slack, collect 20 signups before building billing.

Customer Acquisition Strategy

First customer: post a narrated version of a well-known public pitch deck (like a Sequoia template) to Twitter and r/startups as a demo — link to the upload form and offer first 20 users free. Then: ProductHunt launch, YC Startup School community, accelerator partner referrals.

What's the competition?

Competition Level

Low

What's the roadmap?

Feature Roadmap

V1 (launch): PDF upload, Claude narration, ElevenLabs TTS, ffmpeg MP4, shareable link. V2 (month 2-3): voice style selector, custom intro and outro slide. V3 (month 4+): voice cloning, LinkedIn video export, team plans.

Milestone Plan

Phase 1 (Week 1-2): full PDF to MP4 pipeline working end-to-end on Railway. Phase 2 (Week 3-4): Stripe billing, Supabase auth, share page, deploy to Vercel plus Railway. Phase 3 (Month 2): ProductHunt launch, 50 paying users, voice style options added.

How do you build it?

Tech Stack

Next.js, Claude API, ElevenLabs TTS, pdf2pic (poppler), ffmpeg via fluent-ffmpeg, Supabase, Cloudflare R2, Stripe — build with Cursor for backend pipeline, v0 for share page UI.

Suggested Frameworks

Anthropic SDK, ElevenLabs SDK, fluent-ffmpeg

Time to Ship

2 weeks

Required Skills

PDF image extraction, Claude API, ElevenLabs TTS API, ffmpeg video stitching, Cloudflare R2 storage.

Resources

ElevenLabs API docs, Anthropic Claude docs, fluent-ffmpeg npm docs, pdf2pic npm package, Cloudflare R2 quickstart.

MVP Scope

▸app/page.tsx (upload + status + shareable link)
▸app/api/process/route.ts (PDF to MP4 pipeline)
▸app/share/[id]/page.tsx (public video player page)
▸lib/pdf.ts (pdf2pic extraction)
▸lib/claude.ts (narration script prompt)
▸lib/elevenlabs.ts (TTS synthesis)
▸lib/ffmpeg.ts (video stitch)
▸lib/db/schema.ts (decks + slides schema)
▸lib/r2.ts (Cloudflare R2 upload)
▸.env.example

Core User Journey

Upload PDF -> wait 4 minutes -> receive shareable video link -> send to investor.

Architecture Pattern

PDF upload -> R2 storage -> pdf2pic extracts slide PNGs -> Claude generates narration JSON -> ElevenLabs synthesizes MP3 per slide -> ffmpeg stitches MP4 -> R2 stores MP4 -> shareable link returned to user.
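
The hand-off from the narration step to the TTS step is a natural place to validate data, because a missing or extra script would desynchronize every later slide. The sketch below assumes the model is asked to reply with a bare JSON array of strings, one per slide. That reply format and the function name `parseNarration` are assumptions, not something the text specifies.

```typescript
/**
 * Parse the model's reply and check that it is a JSON array of non-empty
 * strings with exactly one entry per slide. Throws on any mismatch so the
 * caller can mark the deck as failed instead of producing a misaligned video.
 */
function parseNarration(raw: string, slideCount: number): string[] {
  const data: unknown = JSON.parse(raw);
  if (!Array.isArray(data)) {
    throw new Error("narration reply is not a JSON array");
  }
  if (data.length !== slideCount) {
    throw new Error(`expected ${slideCount} scripts, got ${data.length}`);
  }
  return data.map((item, index) => {
    if (typeof item !== "string" || item.trim() === "") {
      throw new Error(`slide ${index + 1} has no narration text`);
    }
    return item.trim();
  });
}
```

Failing fast here keeps ElevenLabs characters from being spent on a deck that cannot be stitched correctly.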

Data Model

User has many Decks. Deck has many Slides. Slide has one NarrationScript, one AudioFile, and one ImageFile. Deck has one OutputMP4 and a status enum (processing, complete, failed).
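
One way to write this model down as TypeScript types, flattening the per-slide script, audio, and image into nullable fields on the slide. The camelCase field names and the `finishDeck` helper are illustrative assumptions.

```typescript
type DeckStatus = "processing" | "complete" | "failed";

interface Slide {
  id: string;
  deckId: string;
  order: number;                 // 1-based position in the deck
  imageUrl: string;
  narrationText: string | null;  // null until the narration step has run
  audioUrl: string | null;       // null until TTS has run
}

interface Deck {
  id: string;
  userId: string;
  status: DeckStatus;
  outputUrl: string | null;      // set only when status is "complete"
  createdAt: Date;
  slides: Slide[];
}

/**
 * Close out a processing deck: "complete" if every slide has narration and
 * audio, otherwise "failed". Returns a new object instead of mutating.
 */
function finishDeck(deck: Deck, outputUrl: string): Deck {
  if (deck.status !== "processing") {
    throw new Error(`deck ${deck.id} is already ${deck.status}`);
  }
  const incomplete = deck.slides.some((s) => !s.narrationText || !s.audioUrl);
  if (incomplete) return { ...deck, status: "failed" };
  return { ...deck, status: "complete", outputUrl };
}
```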

Integration Points

Claude API for narration script generation, ElevenLabs for TTS audio synthesis, Cloudflare R2 for PDF and MP4 storage, Supabase for deck metadata and user accounts, Stripe for billing, ffmpeg for video stitching.

V1 Scope Boundaries

V1 excludes: custom voice cloning, background music, animated transitions, team plans, LinkedIn direct upload, batch processing, multi-language narration.

Success Definition

A founder uploads their pitch deck, receives a shareable MP4 link without any manual help from the DeckSpeak team, and sends it to an investor who watches it to completion.

Challenges

The hardest non-technical problem is convincing founders that an AI voice is good enough to represent them to investors — the ElevenLabs voice quality must be genuinely impressive on first listen or the product dies at demo.

Avoid These Pitfalls

Do not run ffmpeg on Vercel serverless: it will time out on decks over 6 slides. Use a persistent Railway or Fly.io server for the processing job instead. Do not skip slide content analysis in the Claude prompt, or the narration will be generic and founders will churn after one use. Finding the first 10 paying founders requires a genuinely impressive live demo, so budget 50% of week one for prompt quality and voice selection.

Security Requirements

Supabase Auth with Google OAuth, RLS on decks and slides tables scoped to user_id, R2 objects accessed via signed URLs only, uploaded PDFs auto-deleted after 7 days, rate limit processing endpoint to 5 req/min per user.
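
Two of these requirements are easy to sketch in isolation: the per-user rate limit and the 7-day retention check. The limiter below is a minimal in-memory sliding window, which assumes a single server process; a deployment with several instances would need a shared store. Class and function names are hypothetical.

```typescript
/** In-memory sliding-window limiter keyed by user id (single-process assumption). */
class SlidingWindowLimiter {
  private hits = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  /** Record a request at `nowMs` and report whether it is within the limit. */
  allow(userId: string, nowMs: number): boolean {
    const recent = (this.hits.get(userId) ?? []).filter((t) => nowMs - t < this.windowMs);
    const ok = recent.length < this.limit;
    if (ok) recent.push(nowMs);
    this.hits.set(userId, recent);
    return ok;
  }
}

const RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

/** True when an uploaded PDF is due for deletion. */
function isExpired(uploadedAtMs: number, nowMs: number): boolean {
  return nowMs - uploadedAtMs >= RETENTION_MS;
}

// 5 requests per minute per user, as stated in the requirements.
const processLimiter = new SlidingWindowLimiter(5, 60_000);
```

Passing the clock in as `nowMs` keeps both functions deterministic, which makes the limit and the retention boundary straightforward to test.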

Infrastructure Plan

Vercel for Next.js frontend, Railway for ffmpeg processing server, Supabase for Postgres and auth, Cloudflare R2 for file storage, GitHub Actions for CI, Sentry for error tracking — total ~$127/month.

Performance Targets

200 DAU at launch, full pipeline complete under 4 minutes for a 12-slide deck, share page video load under 3 seconds, ffmpeg server handles 5 concurrent jobs without queue backup.
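
The concurrency target implies some cap on simultaneous ffmpeg jobs on the processing server. A minimal sketch of such a cap follows, with a FIFO queue for overflow. The `JobSlots` class is hypothetical and keeps state in memory, so it assumes one server process.

```typescript
/** Tracks running jobs against a fixed cap and queues the overflow in FIFO order. */
class JobSlots {
  private running = new Set<string>();
  private waiting: string[] = [];
  private maxConcurrent: number;

  constructor(maxConcurrent: number) {
    this.maxConcurrent = maxConcurrent;
  }

  /** Start the job if a slot is free, otherwise queue it. */
  submit(jobId: string): "started" | "queued" {
    if (this.running.size < this.maxConcurrent) {
      this.running.add(jobId);
      return "started";
    }
    this.waiting.push(jobId);
    return "queued";
  }

  /** Release a slot; returns the id of the queued job that takes it, if any. */
  finish(jobId: string): string | null {
    if (!this.running.delete(jobId)) return null; // unknown job: nothing to release
    const next = this.waiting.shift();
    if (next === undefined) return null;
    this.running.add(next);
    return next;
  }
}
```

With a cap of 5, the sixth simultaneous upload waits rather than slowing the five already encoding, which protects the 4-minute target for jobs in flight.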

Go-Live Checklist

☐Full pipeline tested on 10 different decks including edge cases.
☐Stripe credit and subscription flow tested end-to-end.
☐Sentry live on Vercel and Railway.
☐R2 signed URL access verified on share page.
☐Custom domain with SSL configured.
☐Privacy policy with 7-day file deletion policy published.
☐5 founder beta users watched their video and confirmed narration accuracy.
☐Rollback: redeploy previous Railway and Vercel releases independently.
☐ProductHunt and Twitter launch posts with embedded demo video drafted.

First Run Experience

On first run, a pre-processed demo video (the original Airbnb pitch deck, narrated) plays automatically on the landing page. The user can immediately upload their own PDF, with no account required for the first free conversion. No manual configuration is needed: the ElevenLabs voice is pre-selected, slide timing is fixed at 4 seconds per slide, and the share link appears automatically when processing completes.

How to build it, step by step

1. Define schema: decks(id, user_id, status, output_url, created_at) and slides(id, deck_id, order, image_url, narration_text, audio_url) in Supabase.
2. Set up Cloudflare R2 bucket and generate API keys.
3. Run npx create-next-app deckspeak and install @anthropic-ai/sdk, elevenlabs, fluent-ffmpeg, pdf2pic, @supabase/supabase-js, stripe.
4. Build lib/pdf.ts that converts uploaded PDF to PNG images per slide using pdf2pic with 150dpi resolution.
5. Write Claude prompt in lib/claude.ts that takes all slide images as base64 and returns a JSON array of narration scripts, one per slide, aware of deck sequence and story arc.
6. Build lib/elevenlabs.ts that synthesizes each narration string to MP3 using ElevenLabs streaming API.
7. Build lib/ffmpeg.ts on a Railway Node.js server that stitches slide PNGs and MP3s into a final MP4 with 4-second per slide timing.
8. Build app/api/process/route.ts that orchestrates the full pipeline and updates deck status in Supabase on completion.
9. Build app/share/[id]/page.tsx as a public HTML5 video player page using the output_url from R2.
10. Verify: upload a real 10-slide PDF, wait under 4 minutes, click the shareable link, watch the narrated video play correctly on a fresh browser tab without any manual steps.
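
For step 7, the stitch can be sketched as two ffmpeg invocations: one per slide that pairs a still image with its audio, and one that joins the segments with the concat demuxer. The helpers below only build argument arrays, which could be passed to a process spawner or translated to fluent-ffmpeg calls; the function names are hypothetical. Passing `fixedSeconds` gives the fixed per-slide timing described in the steps, while omitting it lets each slide last as long as its narration, which is an alternative not stated in the text.

```typescript
/** ffmpeg arguments for one segment: a still image shown while its audio plays. */
function segmentArgs(
  imagePath: string,
  audioPath: string,
  outPath: string,
  fixedSeconds?: number,
): string[] {
  const duration = fixedSeconds !== undefined
    ? ["-t", String(fixedSeconds)]   // cut the segment at a fixed length
    : ["-shortest"];                 // end when the audio ends
  return [
    "-y",
    "-loop", "1", "-i", imagePath,   // repeat the still image as video frames
    "-i", audioPath,
    "-c:v", "libx264", "-tune", "stillimage", "-pix_fmt", "yuv420p",
    "-c:a", "aac",
    ...duration,
    outPath,
  ];
}

/**
 * Contents of the list file read by ffmpeg's concat demuxer.
 * Assumption: segment paths contain no single quotes, so no escaping is done.
 */
function concatList(segmentPaths: string[]): string {
  return segmentPaths.map((p) => `file '${p}'`).join("\n") + "\n";
}

/** ffmpeg arguments that join the listed segments without re-encoding. */
function concatArgs(listPath: string, outPath: string): string[] {
  return ["-y", "-f", "concat", "-safe", "0", "-i", listPath, "-c", "copy", outPath];
}
```

Stream copy in the concat step only works when every segment shares the same codec settings and resolution, which holds here because all segments come from `segmentArgs`. Note that libx264 with yuv420p requires even pixel dimensions, so slide PNGs may need scaling or padding first.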

Generated

May 4, 2026

Model

claude-sonnet-4-6


Disclaimer: Ideas on this site are AI-generated and may contain inaccuracies. Revenue estimates, market demand figures, and financial projections are illustrative assumptions only — not financial advice. Do your own research before making any business or investment decisions. Technology availability, pricing, and market conditions change rapidly; always verify details independently.