
Jamie upgraded their Speech to Text model to Scribe and got much better accuracy and a 3x faster pipeline
Jamie is an AI Assistant for meetings that generates AI summaries and provides deep meeting insights. The team built a sophisticated LLM pipeline to generate a summary, extract the right action items and highlight key decisions.
To get the best meeting transcript, the team behind Jamie tried all major STT providers, but none of them delivered the transcription and diarization quality they required. They had to build their own technical pipeline, combining open source models for diarization with other models for transcription. This created a lot of engineering work and took significant resources to maintain.
This was before ElevenLabs launched Scribe. Once it was available, Jamie tried Scribe immediately, and the results were impressive. Scribe accurately captures overlapping speech, interruptions and non-verbal audio events where other models struggled. The implementation was also remarkably smooth. It took the team only a few days to replace their custom pipeline with Scribe, and they needed just a few customizations to make it work exactly like their previous setup. Scribe is saving Jamie significant engineering resources, and Jamie is getting massive improvements in the quality of the results without special infrastructure needs.
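As a rough illustration of the kind of light customization mentioned above, the sketch below collapses word-level diarized output into speaker turns, the shape a meeting transcript usually needs. The input format (a list of dicts with "text", "start", "end" and "speaker_id" keys) and the function name words_to_turns are assumptions made for this example; they are not taken from Jamie's pipeline, and the field names should be checked against the Scribe response format before use.

```python
def words_to_turns(words):
    """Merge consecutive words from the same speaker into one turn.

    `words` is assumed (for illustration only) to be a list of dicts with
    "text", "start", "end" and "speaker_id" keys, ordered by start time.
    """
    turns = []
    for word in words:
        speaker = word["speaker_id"]
        if turns and turns[-1]["speaker"] == speaker:
            # Same speaker keeps talking: extend the current turn.
            turns[-1]["text"] += " " + word["text"]
            turns[-1]["end"] = word["end"]
        else:
            # Speaker changed (or first word): start a new turn.
            turns.append({
                "speaker": speaker,
                "start": word["start"],
                "end": word["end"],
                "text": word["text"],
            })
    return turns


# Hypothetical sample data, including a non-verbal audio event.
sample_words = [
    {"text": "Shall", "start": 0.0, "end": 0.3, "speaker_id": "speaker_0"},
    {"text": "we", "start": 0.3, "end": 0.4, "speaker_id": "speaker_0"},
    {"text": "start?", "start": 0.4, "end": 0.8, "speaker_id": "speaker_0"},
    {"text": "Yes,", "start": 0.9, "end": 1.1, "speaker_id": "speaker_1"},
    {"text": "(laughs)", "start": 1.1, "end": 1.5, "speaker_id": "speaker_1"},
    {"text": "Great.", "start": 1.6, "end": 2.0, "speaker_id": "speaker_0"},
]

turns = words_to_turns(sample_words)
for turn in turns:
    print(f'{turn["speaker"]} [{turn["start"]:.1f}-{turn["end"]:.1f}]: {turn["text"]}')
```

Grouping on speaker changes keeps interruptions visible as short separate turns rather than merging them into the surrounding speech.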
This had a direct impact on Jamie’s business. After releasing the Scribe-based pipeline, the team stopped getting complaints about diarization results, such as a wrong number of speakers or incorrect speaker assignments. The new pipeline is also 3 times faster than the previous one: a one-hour meeting is processed in only 30-45 seconds. This directly resulted in an uptick in activation, as users see their transcripts and reach the “aha” moment faster, and it also led to an increase in the number of meetings recorded per user. Moreover, these results held across multiple languages, including English, German, Spanish and Dutch.
Scribe is the first model to offer accuracy & quality for transcripts and diarization out-of-the-box at a very competitive price.
Egor Spirin, Head of Product & Engineering, meetjamie.ai