Key takeaways
- Open-source framework with full provider flexibility: mix OpenAI, Deepgram, ElevenLabs, or any other STT/LLM/TTS provider
- $122.5M+ raised, including a $100M round in January 2026 led by Index Ventures; OpenAI is a customer
- Self-hostable with optional LiveKit Cloud for managed infrastructure at $0.004/min audio
FAQ
What is LiveKit Agents?
LiveKit Agents is an open-source framework for building realtime AI agents that can see, hear, and speak, with flexible provider integration and optional managed cloud infrastructure.
How much does LiveKit cost?
Self-hosting carries no licensing fee (open source); you pay only for your own infrastructure. LiveKit Cloud charges $0.004/minute for audio-only and $0.015/minute for video. STT/LLM/TTS provider costs are billed separately by each provider.
Can I self-host LiveKit?
Yes, LiveKit is fully open source under Apache 2.0. You can deploy on your own infrastructure with no licensing fees.
Executive Summary
LiveKit Agents is an open-source framework for building realtime voice, video, and physical AI agents. Unlike managed platforms such as Vapi or ElevenLabs, LiveKit gives developers full control over provider selection and deployment. The company raised $100M in January 2026, bringing total funding to $122.5M+, and counts OpenAI among its customers.
| Attribute | Value |
|---|---|
| Company | LiveKit |
| Founded | 2021 |
| Funding | $122.5M+ |
| Investors | Index Ventures, Salesforce, Altimeter |
| License | Apache 2.0 (open source) |
| Notable Customer | OpenAI |
Product Overview
LiveKit started as realtime video/audio infrastructure and expanded into AI agents with the Agents framework. The framework allows developers to build AI-driven applications that can see (video), hear (audio), and speak (TTS) in realtime, with full flexibility in choosing AI providers.
The platform supports both managed cloud deployment (LiveKit Cloud) and self-hosting, making it attractive to teams wanting control without building infrastructure from scratch.
Key Capabilities
| Capability | Description |
|---|---|
| Voice Agents | Build conversational AI with any STT/LLM/TTS |
| Video Agents | Add vision capabilities to agents |
| Provider Flexibility | OpenAI, Deepgram, ElevenLabs, AssemblyAI, etc. |
| Multi-Agent | Handoff between specialized agents |
| SIP Integration | Connect to phone networks |
| Self-Hosting | Deploy on your own infrastructure |
Framework Components
| Component | Description |
|---|---|
| Python SDK | pip install livekit-agents |
| TypeScript SDK | Node.js agent development |
| Plugins | Pre-built integrations (OpenAI, Deepgram, etc.) |
| LiveKit Cloud | Optional managed infrastructure |
Technical Architecture
LiveKit Agents runs as a server-side process that connects to LiveKit's realtime infrastructure (cloud or self-hosted). Agents can use any combination of AI providers through plugins.
┌─────────────────────────────────────────────────┐
│               Client Applications               │
│           Web | Mobile | Phone (SIP)            │
├─────────────────────────────────────────────────┤
│             LiveKit Infrastructure              │
│             (Cloud or Self-Hosted)              │
│  ┌───────────────────────────────────────────┐  │
│  │          Realtime Media Routing           │  │
│  │       Audio/Video Streams ↔ Agents        │  │
│  └───────────────────────────────────────────┘  │
├─────────────────────────────────────────────────┤
│                 LiveKit Agents                  │
│  ┌───────────────────────────────────────────┐  │
│  │               Agent Process               │  │
│  │   ┌─────────┐  ┌─────────┐  ┌─────────┐   │  │
│  │   │   STT   │  │   LLM   │  │   TTS   │   │  │
│  │   │(plugin) │  │(plugin) │  │(plugin) │   │  │
│  │   └─────────┘  └─────────┘  └─────────┘   │  │
│  │        ↓            ↓            ↓        │  │
│  │    Deepgram      OpenAI     ElevenLabs    │  │
│  │   AssemblyAI    Anthropic    Cartesia     │  │
│  │     Whisper       Groq         Rime       │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
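The three plugin slots in the diagram can be illustrated with a small, self-contained Python sketch. This is not the LiveKit Agents API: the `STT`, `LLM`, and `TTS` protocols, the `VoicePipeline` class, and the stub providers are hypothetical names invented for this example. It only shows why swapping one provider leaves the agent logic unchanged.

```python
from dataclasses import dataclass
from typing import Protocol


# Hypothetical interfaces for illustration; not LiveKit's real plugin classes.
class STT(Protocol):
    def transcribe(self, audio: bytes) -> str: ...


class LLM(Protocol):
    def reply(self, text: str) -> str: ...


class TTS(Protocol):
    def synthesize(self, text: str) -> bytes: ...


@dataclass
class VoicePipeline:
    """Chains speech-to-text, language model, and text-to-speech for one turn."""

    stt: STT
    llm: LLM
    tts: TTS

    def handle_turn(self, audio: bytes) -> bytes:
        text = self.stt.transcribe(audio)   # audio in -> user text
        answer = self.llm.reply(text)       # user text -> agent text
        return self.tts.synthesize(answer)  # agent text -> audio out


# Stub providers standing in for real services (no network calls).
class EchoSTT:
    def transcribe(self, audio: bytes) -> str:
        return audio.decode("utf-8")


class UppercaseLLM:
    def reply(self, text: str) -> str:
        return text.upper()


class BytesTTS:
    def synthesize(self, text: str) -> bytes:
        return text.encode("utf-8")


pipeline = VoicePipeline(stt=EchoSTT(), llm=UppercaseLLM(), tts=BytesTTS())
```

Replacing `EchoSTT` with another class that has a matching `transcribe` method changes the provider without touching `handle_turn`, which is the property the plugin design is meant to give.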
Deployment Options
| Option | Description | Cost |
|---|---|---|
| LiveKit Cloud | Managed infrastructure | $0.004-0.015/min |
| Self-Hosted | Your own servers | Infrastructure only |
| Hybrid | Mix of both | Varies |
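Because the agent process connects to LiveKit infrastructure over a URL, the same agent code can in principle target either deployment option. The sketch below assumes the connection is configured through the `LIVEKIT_URL`, `LIVEKIT_API_KEY`, and `LIVEKIT_API_SECRET` environment variables. The `LiveKitConnection` class, the `load_connection` helper, and the `.livekit.cloud` hostname check are assumptions made for illustration, and the example hostnames are placeholders.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import urlparse


@dataclass(frozen=True)
class LiveKitConnection:
    url: str
    api_key: str
    api_secret: str
    managed: bool  # True when the URL looks like a LiveKit Cloud project


def load_connection(env: Optional[Mapping[str, str]] = None) -> LiveKitConnection:
    """Read connection settings; raises KeyError if a variable is missing."""
    env = os.environ if env is None else env
    url = env["LIVEKIT_URL"]
    host = urlparse(url).hostname or ""
    return LiveKitConnection(
        url=url,
        api_key=env["LIVEKIT_API_KEY"],
        api_secret=env["LIVEKIT_API_SECRET"],
        # Assumption: cloud projects are served from *.livekit.cloud hosts.
        managed=host.endswith(".livekit.cloud"),
    )


# Placeholder values; a self-hosted server is assumed to listen on port 7880.
cloud = load_connection({
    "LIVEKIT_URL": "wss://example-project.livekit.cloud",
    "LIVEKIT_API_KEY": "key",
    "LIVEKIT_API_SECRET": "secret",
})
self_hosted = load_connection({
    "LIVEKIT_URL": "ws://localhost:7880",
    "LIVEKIT_API_KEY": "key",
    "LIVEKIT_API_SECRET": "secret",
})
```

A hybrid setup would then amount to running some agent workers with one set of variables and some with the other.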
Strengths
- Open source — Apache 2.0 license; no vendor lock-in; self-hostable
- Provider flexibility — Mix any STT, LLM, TTS providers through plugins
- Strong funding — $122.5M+ from top investors including Index and Salesforce
- OpenAI customer — Validation from industry leader
- Multi-modal — Voice, video, and physical agents in one framework
- Multi-agent — Built-in support for agent handoff and specialization
- Active community — 6K+ GitHub stars, active Discord
Cautions
- More complexity — Requires more setup than managed platforms like Vapi
- DIY responsibility — You manage provider accounts, rate limits, failover
- Turn-taking needs tuning: LiveKit ships a turn-detection plugin, but natural conversation still depends on configuring it well and on provider latency
- Learning curve — Framework concepts require developer investment
- Cloud costs stack — LiveKit + STT + LLM + TTS costs add up
- Less no-code — Primarily developer-focused; limited visual tools
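To make the "DIY responsibility" caution concrete, here is a minimal failover sketch: try each provider in order and move on when one raises. The function name `call_with_failover`, the `AllProvidersFailed` exception, and the stub providers are hypothetical. A production version would likely add timeouts, retries with backoff, and rate-limit handling.

```python
from typing import Callable, Sequence, Tuple


class AllProvidersFailed(RuntimeError):
    """Raised when every configured provider has errored."""


def call_with_failover(
    providers: Sequence[Tuple[str, Callable[[str], str]]],
    text: str,
) -> Tuple[str, str]:
    """Return (provider_name, result) from the first provider that succeeds."""
    errors = []
    for name, call in providers:
        try:
            return name, call(text)
        except Exception as exc:  # broad on purpose: any failure triggers failover
            errors.append(f"{name}: {exc}")
    raise AllProvidersFailed("; ".join(errors))


# Stub providers for illustration only.
def flaky_primary(text: str) -> str:
    raise TimeoutError("rate limited")


def stable_backup(text: str) -> str:
    return f"reply to: {text}"


used, result = call_with_failover(
    [("primary", flaky_primary), ("backup", stable_backup)],
    "hello",
)
```

Here the primary stub always fails, so the call falls through to the backup. Ordering the list by preference keeps the policy easy to read and change.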
Pricing & Licensing
Open Source (Self-Hosted):
- Framework: Free (Apache 2.0)
- Infrastructure: Your costs
LiveKit Cloud:
| Component | Cost |
|---|---|
| Audio-only | $0.004/minute |
| Video | $0.015/minute |
| Participant connection | $0.0005/minute (decreases with volume) |
| Egress (recording) | Per-minute based on format |
Provider costs (in addition to LiveKit):
- STT: ~$0.006-0.02/minute (Deepgram, AssemblyAI)
- LLM: ~$0.01-0.10/minute (OpenAI, Anthropic)
- TTS: ~$0.01-0.04/minute (ElevenLabs, Deepgram)
Typical total: $0.03-0.15/minute depending on providers.
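The per-minute total can be reproduced by summing the figures listed above. The sketch below uses only those numbers for an audio-only agent; the function and constant names are illustrative, and the participant connection fee and egress are left out.

```python
# Per-minute figures copied from the pricing lists above (USD).
LIVEKIT_AUDIO_PER_MIN = 0.004
PROVIDER_RANGES = {
    "stt": (0.006, 0.02),
    "llm": (0.01, 0.10),
    "tts": (0.01, 0.04),
}


def per_minute_range(transport: float = LIVEKIT_AUDIO_PER_MIN) -> tuple:
    """Return (low, high) cost per minute: transport plus every provider."""
    low = transport + sum(lo for lo, _ in PROVIDER_RANGES.values())
    high = transport + sum(hi for _, hi in PROVIDER_RANGES.values())
    return round(low, 3), round(high, 3)


def monthly_range(minutes: int) -> tuple:
    """Scale the per-minute range to a monthly volume of agent minutes."""
    low, high = per_minute_range()
    return round(low * minutes, 2), round(high * minutes, 2)


low, high = per_minute_range()
monthly_low, monthly_high = monthly_range(10_000)
```

With these inputs the range works out to about $0.03 to $0.164 per minute, or roughly $300 to $1,640 for 10,000 minutes a month. The upper bound sits slightly above the $0.15 "typical" figure because it assumes the most expensive option in every category at once.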
Competitive Positioning
Direct Competitors
| Competitor | Differentiation |
|---|---|
| Vapi | Vapi is a managed platform with more hand-holding; LiveKit is a framework with more control |
| Retell AI | Retell is simpler and phone-focused; LiveKit supports video and self-hosting |
| ElevenLabs | ElevenLabs is end-to-end with the best voices; LiveKit lets you choose any provider |
| Daily, Agora | Similar infrastructure; LiveKit has a stronger AI agent focus |
When to Choose LiveKit Agents
- Choose LiveKit when: You want open-source control, provider flexibility, or need self-hosting
- Choose Vapi when: You want a managed platform with less setup
- Choose Retell when: You want a phone-focused platform with a no-code option
- Choose ElevenLabs when: Voice quality is the top priority
Ideal Customer Profile
Best fit:
- Developer teams wanting full control over their voice AI stack
- Organizations requiring self-hosted deployment
- Multi-modal applications (voice + video agents)
- Teams wanting to mix best-of-breed providers
- Companies avoiding vendor lock-in
Poor fit:
- Non-technical teams needing no-code solutions
- Teams wanting the fastest path to production
- Simple phone automation use cases
- Organizations preferring single-vendor simplicity
Viability Assessment
| Factor | Assessment |
|---|---|
| Financial Health | Strong — $122.5M+ raised, top-tier investors |
| Market Position | Growing — OpenAI customer, active open-source community |
| Innovation Pace | Rapid — Regular framework updates, new plugins |
| Ecosystem | Extensive — Many provider plugins, active GitHub |
| Long-term Outlook | Very Positive — Well-funded, open-source moat |
LiveKit's combination of strong funding ($122.5M+), open-source model, and validation from OpenAI positions it well for long-term success in the voice AI infrastructure space.
Bottom Line
LiveKit Agents is the best choice for developer teams wanting maximum control over their voice AI stack. The open-source framework with provider flexibility means no vendor lock-in and the ability to mix best-of-breed components. Self-hosting options make it suitable for organizations with data sovereignty requirements.
The trade-off is complexity: more setup than managed platforms, more responsibility for provider management, and a steeper learning curve. For teams with developer resources who want control, it's excellent. For teams wanting the fastest path to production, managed alternatives like Vapi or Retell may be more practical.
Recommended for: Developer teams wanting open-source voice AI with provider flexibility, self-hosting options, and multi-modal capabilities.
Not recommended for: Non-technical teams, teams wanting the fastest setup, or simple phone-only use cases.
Research by Ry Walker Research