ElevenLabs Documentation
Meet the models
Our most emotionally rich, expressive speech synthesis model
Lifelike speech synthesis model with consistent quality
Our fast, affordable speech synthesis model
High-quality, low-latency model that balances quality and speed
State-of-the-art speech recognition model
Real-time speech recognition model
Browse by capability
Convert text into lifelike speech
Transcribe spoken audio into text
Generate music from text
Create natural-sounding dialogue from text
Generate images and videos from text
Modify and transform voices
Isolate voices from background noise
Dub audio and videos seamlessly
Create cinematic sound effects
Clone and design custom voices
Transform and enhance existing voices
Align text to audio
Deploy intelligent voice agents


