ElevenLabs Conversational AI | Ry Walker Research

Key takeaways

Conversational AI was renamed ElevenLabs Agents (branded ElevenAgents) — agents now talk, type, and take action across phone, web, and apps, with 2M+ agents created handling 33M+ conversations
State-of-the-art turn-taking model plus 10,000+ voices, 70+ languages, integrated RAG, and a visual workflow builder with a testing and simulation suite
$500M Series D led by Sequoia at an $11B valuation (February 2026); ~$0.10/minute agent pricing, dropping to ~$0.08/minute on Business tier

FAQ

What is ElevenLabs Agents (formerly Conversational AI)?

ElevenLabs Agents — branded ElevenAgents — is the platform formerly called Conversational AI. It builds voice and chat agents with natural turn-taking, 70+ languages, integrated RAG, workflows, and enterprise security, deployed across phone, web, and apps.

How much does ElevenLabs Agents cost?

Agent conversations cost roughly $0.10/minute on standard plans and about $0.08/minute on the Business tier, billed from plan credits with bundled agent minutes per plan. Subscriptions run from Free through Starter ($6/mo), Creator ($22/mo), Pro ($99/mo), Scale ($299/mo), and Business ($990/mo), up to custom Enterprise. LLM token costs are metered separately.

What voices are available?

More than 10,000 voices are available, including stock voices, community voices, and custom voice clones. Agents support 70+ languages with automatic real-time language detection and switching.

Executive Summary

ElevenLabs Conversational AI has been renamed ElevenLabs Agents (branded ElevenAgents), reflecting a scope expansion from voice-only agents to agents that "talk, type and take action across phone, web and apps."^[1] It remains the market-leading voice agent platform, combining best-in-class voice synthesis with a state-of-the-art turn-taking model, integrated RAG, a visual workflow builder, and enterprise features including HIPAA compliance and EU data residency.^[2]^[1] In February 2026 ElevenLabs raised a $500M Series D led by Sequoia at an $11B valuation — more than triple its January 2025 valuation.^[3]

Attribute	Value
Company	ElevenLabs
Founded	2022
Valuation	$11B (Series D, February 2026)^[3]
Total Funding	$781M across five rounds^[3]
Voices	10,000+
Languages	70+^[4]
Adoption	2M+ agents created, 33M+ conversations^[1]

Product Overview

ElevenLabs launched Conversational AI in January 2025, released version 2.0 about four months later in May 2025,^[2] and then renamed the platform ElevenLabs Agents to reflect agents that handle voice, text, and actions across phone, web, and apps.^[1] The company now frames its business around three pillars (ElevenAgents, ElevenCreative, and ElevenAPI), with agents as the flagship. As of the Agents announcement, customers had created over 2 million agents collectively handling more than 33 million conversations.^[1]

The company closed a $500M Series D on February 4, 2026, led by Sequoia Capital at an $11B valuation, with Andreessen Horowitz and ICONIQ taking super pro-rata positions and new investors Lightspeed, Evantic Capital, and BOND joining.^[3]^[5]

Key Capabilities

Capability	Description
Turn-Taking Model	Detects conversational cues (um, ah) for natural flow^[2]
10K+ Voices	Stock, community, and custom voice clones
Integrated RAG	Low-latency knowledge retrieval with privacy
Multimodal	Voice-only, chat-only, or voice + text simultaneously
70+ Languages	Real-time language detection and switching^[4]
Workflows	Visual builder for branching business logic^[1]
Testing Suite	Agent simulation and evaluation before deployment^[1]
Integrations	Stripe, HubSpot, Zendesk, Twilio, Cal.com, Salesforce, plus 8,000+ apps via Zapier^[4]

Product Surfaces

Surface	Description	Availability
Web Widget	Embeddable voice agent	GA
Mobile SDKs	iOS and Android native	GA
Telephony	Twilio, SIP trunking	GA
Chat Mode	Text-only agents (added August 2025)	GA^[1]
Batch Calls	Automated outbound calling	GA
API	Full programmatic control	GA

Technical Architecture

ElevenLabs Agents combines the company's industry-leading TTS with a conversational engine that handles turn-taking, interruptions, and knowledge retrieval.

┌─────────────────────────────────────────────────┐
│             ElevenLabs Agents                   │
├─────────────────────────────────────────────────┤
│  ┌───────────────┐    ┌───────────────────────┐│
│  │ Turn-Taking   │    │    Voice Synthesis    ││
│  │ Model         │    │    (10K+ voices)      ││
│  └───────┬───────┘    └───────────┬───────────┘│
│          │                        │            │
│  ┌───────┴────────────────────────┴───────────┐│
│  │           Conversation Engine              ││
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐      ││
│  │  │   LLM   │ │   RAG   │ │  Tools  │      ││
│  │  └─────────┘ └─────────┘ └─────────┘      ││
│  └────────────────────────────────────────────┘│
├─────────────────────────────────────────────────┤
│  Web | Mobile | Telephony | Chat | API          │
└─────────────────────────────────────────────────┘
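
The flow in the diagram can be sketched in a few lines of Python. This is an illustrative toy, not ElevenLabs' implementation or SDK: the `ConversationEngine` class, the `user_turn_finished` heuristic, and the `FILLERS` set are all hypothetical names, and the filler-word check is only a stand-in for the real turn-taking model. Tool calls are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# Hypothetical filler words; a real turn-taking model uses far richer signals.
FILLERS = {"um", "uh", "ah", "er", "hmm"}


def user_turn_finished(transcript: str) -> bool:
    """Toy heuristic: a trailing filler suggests the speaker is still thinking."""
    words = transcript.lower().strip(" .,!?").split()
    return bool(words) and words[-1].strip(".,!?") not in FILLERS


@dataclass
class ConversationEngine:
    """Hypothetical wiring of the boxes in the diagram (LLM, RAG, synthesis)."""
    llm: Callable[[str, List[str]], str]   # (user text, retrieved passages) -> reply
    retrieve: Callable[[str], List[str]]   # RAG lookup
    synthesize: Callable[[str], bytes]     # text-to-speech
    history: List[str] = field(default_factory=list)

    def handle(self, transcript: str) -> Optional[bytes]:
        # Keep listening if the user does not appear to be done speaking.
        if not user_turn_finished(transcript):
            return None
        passages = self.retrieve(transcript)
        reply = self.llm(transcript, passages)
        self.history.extend([transcript, reply])
        return self.synthesize(reply)


# Stub components so the sketch runs without any external service.
engine = ConversationEngine(
    llm=lambda text, passages: f"Reply using {len(passages)} passage(s)",
    retrieve=lambda text: ["refund policy: 30 days"],
    synthesize=lambda text: text.encode("utf-8"),
)
print(engine.handle("I was wondering, um"))          # still mid-turn
print(engine.handle("What is your refund policy?"))  # full pipeline runs
```

The point of the sketch is the ordering: the turn-taking check gates everything else, so retrieval, the LLM call, and synthesis only run once the user has finished speaking.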

Enterprise Features

Feature	Description
HIPAA Compliance	Healthcare data privacy^[2]
EU Data Residency	Data sovereignty for EU^[2]
SSO/SAML	Enterprise authentication
SLA	Uptime guarantees
Guardrails	Safety and compliance validation^[4]

Strengths

Voice quality — Industry-leading TTS with 10,000+ natural-sounding voices
Turn-taking — State-of-the-art model detects conversational cues for natural flow^[2]
Voice cloning — Create custom voices from audio samples
Multilingual — 70+ languages with automatic real-time detection and switching^[4]
Integrated RAG — Low-latency knowledge retrieval built-in
Enterprise-ready — HIPAA, EU residency, SSO, SLAs, guardrails
Multimodal — Voice, chat, or both in the same agent
Proven scale — 2M+ agents created, 33M+ conversations handled^[1]
Capitalization — $781M raised, $11B valuation, Sequoia/a16z/NVIDIA backing^[3]

Cautions

No self-hosting — Cloud-only; no on-premise deployment option
Credit-based pricing — Costs can be hard to predict; LLM token costs are metered separately on top of per-minute rates^[6]
LLM dependency — Relies on external LLMs (OpenAI, Anthropic) for reasoning
Naming churn — Conversational AI → Agents Platform → ElevenAgents in roughly a year; docs URLs and SDK references have moved
Premium pricing — Higher cost than DIY STT+LLM+TTS pipelines; developers report the API cost constrains free-tier business models^[7]
Function calling — Tool orchestration still less sophisticated than OpenAI Realtime, though workflows narrow the gap

Pricing & Licensing

ElevenLabs uses a credit-based system with subscription plans:^[8]

Plan	Price	Credits
Free	$0/mo	10K credits
Starter	$6/mo	30K credits
Creator	$22/mo ($11 first month)	~120K credits
Pro	$99/mo	600K credits
Scale	$299/mo	1.8M credits
Business	$990/mo	6M credits
Enterprise	Custom	Custom
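
To compare tiers, it can help to divide each plan's credits by its price. The snippet below is a small arithmetic sketch using only the figures in the table above; the `PLANS` mapping and `credits_per_dollar` helper are names invented for this example, and the Creator figure is approximate in the source.

```python
# Plan name -> (monthly price in USD, credits per month), copied from the table.
PLANS = {
    "Starter": (6, 30_000),
    "Creator": (22, 120_000),  # listed as ~120K in the table
    "Pro": (99, 600_000),
    "Scale": (299, 1_800_000),
    "Business": (990, 6_000_000),
}


def credits_per_dollar(price_usd: int, credits: int) -> float:
    """Credits received per dollar of subscription spend."""
    return credits / price_usd


for name, (price, credits) in PLANS.items():
    ratio = credits_per_dollar(price, credits)
    print(f"{name:<9} {ratio:>8.0f} credits per dollar")
```

On these numbers, Starter works out to 5,000 credits per dollar, while Pro and Business both land near 6,061. Scale comes in slightly below Pro, so the ratio does not improve strictly with plan size.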

Agent costs: roughly $0.10/minute on standard paid plans, dropping to ~$0.08/minute on the Business tier. Each paid plan includes bundled agent minutes (e.g., ~275 on Creator, ~1,238 on Pro, ~12,375 on Business); LLM token costs are passed through separately.^[9]^[6]
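
A rough monthly estimate follows directly from the per-minute rates quoted above. The sketch below assumes flat rates of $0.10 and $0.08 per minute and ignores bundled minutes, subscription fees, and the separately metered LLM token costs, so actual bills will differ; `agent_minutes_cost` and the two rate constants are hypothetical names for this example.

```python
from decimal import Decimal

# Approximate per-minute rates quoted in this report (assumed flat here).
STANDARD_RATE = Decimal("0.10")
BUSINESS_RATE = Decimal("0.08")


def agent_minutes_cost(minutes: int, rate_per_minute: Decimal) -> Decimal:
    """Voice-agent cost for a number of conversation minutes, LLM tokens excluded."""
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return (Decimal(minutes) * rate_per_minute).quantize(Decimal("0.01"))


for minutes in (1_000, 10_000, 100_000):
    standard = agent_minutes_cost(minutes, STANDARD_RATE)
    business = agent_minutes_cost(minutes, BUSINESS_RATE)
    print(f"{minutes:>7} min  standard ${standard}  business ${business}  "
          f"difference ${standard - business}")
```

`Decimal` is used instead of floats so the cent amounts stay exact. At 10,000 minutes a month the two rates give $1,000 and $800 respectively, before any LLM charges.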

What Developers Say

Community sentiment on Hacker News reflects both production adoption and cost/quality friction:^[7]

"Stack is Twilio for telephony, ElevenLabs for the voice agent, OpenAI for the chat layer." — talyuk, Hacker News (February 2026), describing a production phone-assistance deployment

"I built it using ElevenLabs' voice agent. Since their AI voice API is expensive, I can only offer a 3-day free trial." — jameshih, Hacker News (November 2025)

"Voice agents are a royal pita. They have trouble with any kind of British regional accent." — Leynos, Hacker News (May 2026), on speech recognition limits in voice agent stacks

"Elevenlabs conversational agents are priced at 0.08 per minute at the highest tier." — fakedang, Hacker News (June 2025), comparing pricing against OpenAI alternatives

The pattern: developers ship real products on the platform and praise the voice quality, but recurring complaints center on cost at volume and accent/recognition edge cases.

Competitive Positioning

Direct Competitors

Competitor	Differentiation
OpenAI Realtime API	OpenAI has better function calling; ElevenLabs has superior voice quality and variety
Vapi	Vapi orchestrates multiple providers; ElevenLabs is end-to-end with better voices
Retell AI	Retell has lower base pricing; ElevenLabs offers more voices and its own turn-taking model
AWS Nova Sonic	Nova Sonic has Bedrock integration; ElevenLabs has better voice quality

When to Choose ElevenLabs Agents

Choose ElevenLabs when: Voice quality is paramount, you need voice cloning, or you want built-in turn-taking detection
Choose OpenAI Realtime when: Function calling accuracy is critical
Choose Vapi when: You need provider flexibility
Choose Retell when: Cost is the primary concern

Ideal Customer Profile

Best fit:

Applications where voice quality is a key differentiator
Brands wanting unique voice identity (voice cloning)
Multilingual global deployments (70+ languages)
Healthcare applications (HIPAA compliance)
Entertainment, gaming, and creative applications
Teams wanting an all-in-one voice agent platform

Poor fit:

Cost-sensitive high-volume applications
Teams requiring self-hosted deployment
Use cases needing sophisticated tool orchestration
Organizations avoiding vendor lock-in

Viability Assessment

Factor	Assessment
Financial Health	Excellent — $11B valuation, $781M raised, Sequoia/a16z/NVIDIA backing^[3]
Market Position	Leader — Dominant in TTS, 2M+ agents created on the Agents platform^[1]
Innovation Pace	Rapid — v1 to v2 in about four months (January to May 2025); chat mode, workflows, testing suite, and a platform rebrand since
Ecosystem	Growing — SDKs, native integrations (Salesforce, Zendesk, HubSpot), 8,000+ apps via Zapier
Long-term Outlook	Very Positive — Clear market leader trajectory

ElevenLabs is among the fastest-growing companies in voice AI. Its February 2026 Series D more than tripled the company's valuation year-over-year to $11B, and the evolution from TTS to a full agents platform shows strong execution.^[3]^[5]

Bottom Line

ElevenLabs Agents (formerly Conversational AI) is the best choice when voice quality and naturalness are paramount. The combination of 10,000+ voices, state-of-the-art turn-taking detection, 70+ languages, and enterprise features (HIPAA, EU residency) makes it the most complete voice agent platform available — and the $500M Series D at $11B removes any viability concern.

The trade-offs are premium pricing (~$0.10/minute plus separately metered LLM costs), cloud-only deployment, and function calling that still trails OpenAI Realtime. For applications where voice quality differentiates the product — entertainment, brand voice, creative applications — ElevenLabs is the clear leader.

Recommended for: Applications prioritizing voice quality, brands wanting unique voice identity, multilingual deployments, and HIPAA-compliant healthcare applications.

Not recommended for: Cost-sensitive high-volume applications, teams requiring self-hosting, or use cases needing sophisticated tool orchestration.

Outlook: Strongly positive. With $781M raised, a three-pillar platform strategy (ElevenAgents, ElevenCreative, ElevenAPI), and 2M+ agents already created, ElevenLabs is consolidating its position as the default end-to-end voice agent platform.

Research by Ry Walker Research

Sources