.webp&w=3840&q=95)
How we engineered RAG to be 50% faster
Tips from latency-sensitive RAG systems in production
In November we announced our new, fastest model that generates speech at ≈400ms latency (+ network latency) and is over twice as fast as our V1 models.
Unfortunately users found that it struggled to pronounce long numbers. Give a listen to this generation of "The current stock price for NVIDIA is $867.49.":
Today we just released improved numbers pronunciation for our Turbo v2 model. Here's pronunciation after the change:
Thank you to all of the users who submitted feedback that inspired this fix - and please continue to share areas where our models can be improved.
Tips from latency-sensitive RAG systems in production
Predictable, created by Therapy Box, is one of the world’s leading AAC apps, empowering people with complex communication needs to express themselves with confidence and independence. At its core, Predictable helps people who cannot always rely on natural speech to communicate in ways that feel natural and personal. Now, by partnering with our ElevenLabs Impact Program, every Predictable user has free access to ElevenLabs voices.
Powered by ElevenLabs Agents