Vapi
Developer platform for building, testing, and deploying voice AI agents that handle phone calls.
Quick take
Vapi is the most flexible voice AI platform for developers who want control over every layer of the stack. If you know what LLM, TTS, and STT providers you want and you need real-time voice orchestration, Vapi is the right starting point. If you want an opinionated, works-out-of-the-box solution, Bland.ai or Retell AI are simpler choices.
Overview
Vapi is a developer platform for building voice AI agents that can make and receive phone calls, join meetings, and hold natural conversations. Think of it as the Twilio for voice AI: you define what the agent should say and do, pick your LLM, TTS, and STT providers, and Vapi handles the real-time orchestration. The platform has raised $40M and is growing fast in the voice AI space, competing with Retell AI and Bland.ai for developer mindshare.
Key strengths
Composability is the core advantage. Vapi lets you bring your own LLM (OpenAI, Anthropic, open-source), your own text-to-speech (ElevenLabs, PlayHT, Deepgram), and your own speech-to-text (Deepgram, AssemblyAI). This mix-and-match approach means you are not locked into one provider for any layer. Latency is competitive at under 500ms voice-to-voice in most configurations. The SDK supports web, mobile, and phone channels. The developer experience is clean: good docs, active Discord, and a dashboard for testing agents before deploying them.
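To make the mix-and-match idea concrete, here is a minimal sketch of a stack in which each layer can be swapped independently. The `VoiceStack` class and the `SUPPORTED` table are hypothetical names invented for illustration; they are not Vapi's SDK or configuration schema, and the provider lists contain only the names mentioned above.

```python
from dataclasses import dataclass, replace

# Providers named in this review, grouped by layer. Illustrative only:
# this is NOT Vapi's configuration schema or a complete provider list.
SUPPORTED = {
    "llm": {"OpenAI", "Anthropic", "open-source"},
    "tts": {"ElevenLabs", "PlayHT", "Deepgram"},
    "stt": {"Deepgram", "AssemblyAI"},
}


@dataclass(frozen=True)
class VoiceStack:
    """Hypothetical description of one provider choice per layer."""
    llm: str
    tts: str
    stt: str

    def __post_init__(self) -> None:
        # Reject any provider that is not listed for its layer.
        for layer, allowed in SUPPORTED.items():
            choice = getattr(self, layer)
            if choice not in allowed:
                raise ValueError(f"{choice!r} is not a listed {layer} provider")

    def swap(self, **layers: str) -> "VoiceStack":
        """Return a new stack with some layers replaced; others stay as-is."""
        return replace(self, **layers)


stack = VoiceStack(llm="OpenAI", tts="ElevenLabs", stt="Deepgram")
cheaper_voice = stack.swap(tts="PlayHT")  # only the TTS layer changes
```

The point of the sketch is that changing one layer leaves the other two untouched, which is what avoiding provider lock-in means in practice.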
Limitations
Pricing is usage-based and can be hard to predict. At ~$0.05/min for orchestration plus LLM, TTS, and STT costs, a high-volume deployment gets expensive fast. The platform is still young; some developers report edge cases in call handling (transfers, DTMF tones, hold music detection) that more mature telephony platforms handle better. The voice AI space is moving so fast that today's architecture may need rethinking in 6-12 months.
Pricing breakdown
Usage-based: ~$0.05/min for Vapi orchestration, plus pass-through costs for your chosen LLM, TTS, and STT providers. A typical stack (GPT-4o + ElevenLabs + Deepgram) costs roughly $0.10-0.15/min all-in. Free tier available for development and testing.
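As a rough worked example of how these per-minute figures scale, the sketch below turns the approximate rates quoted above into a monthly estimate. The function names and the 10,000-minute volume are assumptions chosen for illustration; actual costs depend on the providers and plans you pick.

```python
# Approximate figures taken from this review; real rates vary by provider.
ORCHESTRATION_PER_MIN = 0.05      # Vapi orchestration fee, USD per minute
ALL_IN_PER_MIN = (0.10, 0.15)     # typical stack, low and high, USD per minute


def provider_share_per_min() -> tuple[float, float]:
    """Pass-through cost (LLM + TTS + STT) implied by the all-in range."""
    low, high = ALL_IN_PER_MIN
    return (round(low - ORCHESTRATION_PER_MIN, 2),
            round(high - ORCHESTRATION_PER_MIN, 2))


def monthly_cost_range(minutes: int) -> tuple[float, float]:
    """Estimated low/high monthly bill in USD for a given call volume."""
    low, high = ALL_IN_PER_MIN
    return (round(minutes * low, 2), round(minutes * high, 2))


# Hypothetical volume: 10,000 call minutes per month.
print(provider_share_per_min())    # (0.05, 0.1)
print(monthly_cost_range(10_000))  # (1000.0, 1500.0)
```

At that volume the estimate is roughly $1,000 to $1,500 per month, of which about $500 is the orchestration fee and the rest is provider pass-through.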
Who should use Vapi
Developers building voice AI agents for customer service, appointment booking, lead qualification, or outbound calling. Teams that want control over their AI stack (choose your own LLM/TTS/STT) rather than an opinionated, closed platform. Not for non-technical teams; this is a developer tool.
Key features
- Voice AI agent builder
- Sub-second latency pipeline
- Multi-provider LLM support
- Phone number provisioning
- Function calling and tool use
Pros and cons
Pros
- Best-in-class developer experience
- Very low latency for natural conversations
- Flexible provider choices for each component
Cons
- Per-minute costs add up at scale
- Requires development skills
- Phone-focused, less suited for in-meeting use
What users say
A great open source product for building voice AI agents.
G2