Multilingual Speech-to-Speech Release

We're thrilled to unveil Eleven Multilingual v2 for Speech-to-Speech (STS), a significant enhancement over our previous Eleven English v2 STS model. This latest version not only elevates the overall performance but also introduces the capability to synthesize speech in 29 languages, broadening the horizons for seamless multilingual communication.



Dubbing Studio Release

Introducing Dubbing Studio. Our Dubbing Studio gives you precise control when translating your videos for global audiences.
The studio builds on our existing dubbing feature, which automates video localization across 29 languages. It detects and labels each of your speakers and creates an editable script of your content. In the editor, you can then:
  • Update transcriptions
  • Update translations
  • Change voices
  • Change timings
  • Regenerate the dialogue until the accent and tone are just right.

We look forward to seeing what you create. Bring your content to global audiences with the ElevenLabs Dubbing Studio.



Dubbing Studio Alpha

We're thrilled to announce the launch of Dubbing Studio, our latest state-of-the-art product which lets you edit results of automatic dubbing and much more. Dubbing Studio is now available to everyone with alpha access.



Voice Library 3.0

We are excited to announce major updates to our Voice Library. With this release, we are introducing:
  • Financial rewards for voice owners: Voice owners can now earn a percentage of the revenue generated by their voices. Earning characters is still available as a default option.
  • Custom rates: When financial rewards are enabled, voice owners can set custom rates for their voices. This makes voices more expensive for users but lets owners earn more for each generation made with their voice.
  • Notice periods: A voice can now have a notice period, which is activated when the owner withdraws the voice from the library. In return, the voice owner gets a higher share of the revenue.
  • Moderation: Owners of PVC voices can now enable moderation to prevent malicious generations made with their voice.
  • Search: We have also improved search. You can now run case-insensitive, typo-tolerant searches over voice labels, names, and descriptions.



New Voices

We're thrilled to announce the addition of six new voices to our pre-made collection, available for all users: Bill, Drew, George, Lily, Molly, and Paul. These voices are versatile and suitable for various styles such as News, Narration, and Documentaries. Please note that these voices might sound slightly different next week as we upgrade our model. Concurrently, we're deprecating the voices of Elli, Bella, and Matthew as part of our ongoing effort to enhance and update our voice offerings.



Speech to Speech Release

We're excited to unveil our latest innovation: Speech to Speech, powered by our advanced Eleven English v2 model. This groundbreaking tool allows you to merge the content and style of an audio clip you upload with a selected voice of your choice. The Eleven English v2 model is now readily accessible to all our platform users. Experience the future of voice synthesis today!



AudioNative Release

We are very excited to announce the release of AudioNative, our state-of-the-art audio player. AudioNative is designed to provide publishers, bloggers, and content creators with a seamless way to integrate audio into their websites. With AudioNative, you can effortlessly convert your articles into audio and embed them into your website with a single line of code. AudioNative is now available to everyone with a Creator subscription or higher and comes with an advanced metrics and analytics dashboard for each AudioNative-enabled project.



Eleven v2 Turbo Release

We are excited to introduce our latest offering: the Eleven v2 Turbo model. Engineered exclusively for English, this model marries the exceptional speech quality of our Eleven Multilingual v2 with an impressive latency of approximately 400ms.



μ-law encoding output format

We've now incorporated the option for 8kHz μ-law encoded audio output through our API.
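
For readers who need to play or inspect μ-law audio, here is a minimal sketch of decoding it to 16-bit linear PCM using the standard G.711 μ-law expansion. It assumes the audio arrives as raw, headerless 8-bit μ-law bytes, which this note does not state; the function names are illustrative and not part of our API.

```python
import struct

ULAW_BIAS = 0x84  # bias constant used by G.711 mu-law (132)


def ulaw_byte_to_pcm16(value: int) -> int:
    """Expand one 8-bit mu-law code into a signed 16-bit linear sample."""
    inverted = ~value & 0xFF          # mu-law codes are stored bit-inverted
    sign = inverted & 0x80
    exponent = (inverted >> 4) & 0x07
    mantissa = inverted & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def ulaw_to_pcm16(data: bytes) -> bytes:
    """Decode a mu-law byte string into little-endian 16-bit PCM bytes."""
    samples = [ulaw_byte_to_pcm16(b) for b in data]
    return struct.pack(f"<{len(samples)}h", *samples)
```

Each input byte becomes one two-byte sample, so one second of 8kHz μ-law audio (8,000 bytes) decodes to 16,000 bytes of PCM.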



Audio normalization in Projects

We've enhanced our Projects application to now support automatic audio normalization to meet audiobook distributor standards.



Metadata in Projects

We've introduced a feature that allows users to specify metadata for Projects. This metadata will be automatically embedded in the downloaded files.



Dubbing Improvements

We've enhanced the accuracy of our dubbing system's translations and addressed issues related to talking speed.



Dubbing Release

We're thrilled to announce the launch of "Dubbing," our latest state-of-the-art technology innovation. Designed with utmost precision and advanced algorithms, Dubbing seamlessly converts your content into a diverse range of languages with automatic voiceovers.



Pause support via API and Speech Synthesis editor

We've introduced a feature that allows for pauses in speech through our API and the Speech Synthesis page. To add a 1.5-second pause using SSML, you can for example input: "Martin walked down the street <break time="1.5s"/> he looked into the camera and laughed." Breaks currently have a maximum duration of three seconds and can be included directly in the text sent to our API. They will be billed at 10 characters each.
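
As a rough illustration, the sketch below checks a text for break tags before it is sent, rejecting any pause longer than three seconds and computing the characters billed for the breaks themselves (10 each). The regular expression only recognizes the exact `<break time="…s"/>` form shown above, and the helper name is hypothetical; how the rest of the text is billed is not covered here.

```python
import re

# Matches tags such as <break time="1.5s"/>; other SSML forms are not handled.
BREAK_TAG = re.compile(r'<break\s+time="(\d+(?:\.\d+)?)s"\s*/>')
MAX_BREAK_SECONDS = 3.0   # maximum pause duration stated in this note
BREAK_COST_CHARS = 10     # billing per break stated in this note


def check_breaks(text: str) -> tuple[list[float], int]:
    """Return (pause durations in seconds, characters billed for the breaks)."""
    durations = [float(seconds) for seconds in BREAK_TAG.findall(text)]
    too_long = [d for d in durations if d > MAX_BREAK_SECONDS]
    if too_long:
        raise ValueError(f"breaks longer than {MAX_BREAK_SECONDS}s: {too_long}")
    return durations, len(durations) * BREAK_COST_CHARS


durations, billed = check_breaks(
    'Martin walked down the street <break time="1.5s"/> '
    "he looked into the camera and laughed."
)
print(durations, billed)
```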



Projects Release

We're excited to introduce Projects – our state-of-the-art long-form speech synthesis editor. With Projects, you can effortlessly transform articles or entire books into audio in mere minutes. This feature is now available to everyone with a Creator subscription or higher.



Two-Factor Authentication

We added the ability for users to protect their accounts using Two-Factor Authentication.



PCM output format

We have introduced a new output audio format, PCM, for the normal, streaming, and websockets endpoints. This format comes with four sampling rate options: 16kHz, 22.05kHz, 24kHz, and 44.1kHz. The default audio format remains MP3.
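
Raw PCM has no header, so most players cannot open it directly. The sketch below wraps PCM bytes in a WAV container using Python's standard `wave` module. It assumes 16-bit, mono, little-endian samples, which this note does not specify, so check the channel count and sample width against the audio you actually receive; the function name is illustrative.

```python
import io
import wave

SUPPORTED_RATES = (16000, 22050, 24000, 44100)  # the four rates listed above


def pcm_to_wav(pcm: bytes, sample_rate: int) -> bytes:
    """Wrap raw PCM bytes in a WAV header (assumes 16-bit mono samples)."""
    if sample_rate not in SUPPORTED_RATES:
        raise ValueError(f"unsupported sample rate: {sample_rate}")
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)       # assumption: mono
        wav.setsampwidth(2)       # assumption: 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buffer.getvalue()
```

The sample rate passed in must match the rate requested from the API, otherwise playback will be too fast or too slow.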



Voice Library 2.0

We have reworked the Voice Library to make it easier to use and more accessible. This includes a new design, top-level use-case categories, search, a review of all existing voices, and a moderation queue for new ones.



Eleven Multilingual v2

We are excited to introduce Eleven Multilingual v2, our latest state-of-the-art speech synthesis model. With unparalleled speech synthesis capabilities, Eleven Multilingual v2 supports 29 languages. In the upcoming days, we will release an update that further enhances its speed.



Sharing for Professionally Cloned Voices

It is now possible to share professionally cloned voices through the Voice Library.



Usage Analytics Dashboard

A dashboard displaying in-depth character usage trends and additional metrics is now accessible to users subscribed to the Independent Publisher tier and higher.



New High Quality Voices

We've enhanced our platform by incorporating over 30 additional voices, covering an extensive array of styles and applications, now readily accessible to all users at no cost. These voices are professionally crafted, boasting exceptional quality to elevate your user experience.



Projects Alpha

We're releasing Projects, our advanced speech synthesis editor, into alpha for select users.



AI Speech Classifier

We are thrilled to announce the release of our AI Speech Classifier. This tool empowers users to effortlessly discern whether an audio clip has been generated by ElevenLabs. Best of all, the AI Speech Classifier is accessible to everyone, absolutely free of charge.



Voice Library

We have introduced the Voice Library feature, which offers our subscribed users an exciting opportunity to share their voices and earn characters concurrently. Moving forward, we have plans to augment the Voice Library by incorporating the ability to share voices that have been professionally replicated with high fidelity. This will open up new avenues for users to exchange and experience a diverse range of high-quality voices.



Multilingual voice verification

The form for adding professional voices now includes a language selection option, which allows you to specify the language used in the uploaded samples. The text displayed in the voice verification prompt reflects the selected language.



Voice sharing

You can now share your generated voices and earn free characters when the community uses them!



History API supports pagination

The GET history items endpoint now supports pagination.
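
A paginated endpoint is typically consumed in a loop that requests one page at a time until no further page is indicated. The sketch below shows that pattern against a stand-in `fetch_page` function; the page shape (`items`, `next_cursor`) is a hypothetical placeholder, not the endpoint's actual request or response format, so consult the API reference for the real parameter names.

```python
from typing import Callable, Iterator, Optional


def iter_history(fetch_page: Callable[[Optional[str]], dict]) -> Iterator[dict]:
    """Yield every item by following a cursor from page to page."""
    cursor: Optional[str] = None
    while True:
        page = fetch_page(cursor)           # hypothetical: one request per page
        yield from page["items"]
        cursor = page.get("next_cursor")
        if cursor is None:                  # no more pages
            break


def fake_fetch_page(cursor: Optional[str]) -> dict:
    """In-memory stand-in for the real request, returning two items per page."""
    data = [{"id": f"item-{i}"} for i in range(5)]
    start = int(cursor) if cursor is not None else 0
    end = start + 2
    return {
        "items": data[start:end],
        "next_cursor": str(end) if end < len(data) else None,
    }


for item in iter_history(fake_fetch_page):
    print(item["id"])
```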



History API Updates

We added an endpoint to get a history item by its ID.

Both text-to-speech endpoints (streaming and non-streaming) now also return the history_item_id as a response header. Please note that there is a slight delay after the text-to-speech endpoint is called before the sample becomes accessible through the history.
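
Because of that delay, a client that looks up a history item right after generation may want to retry briefly. The sketch below shows one way to do so; `get_item` is a hypothetical callable standing in for a request to the history endpoint that returns `None` while the item is not yet available, and the attempt count and delay are arbitrary defaults.

```python
import time
from typing import Callable, Optional


def wait_for_history_item(
    get_item: Callable[[str], Optional[dict]],
    history_item_id: str,
    attempts: int = 5,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Poll get_item until it returns an item or the attempts run out."""
    for attempt in range(attempts):
        item = get_item(history_item_id)
        if item is not None:
            return item
        if attempt < attempts - 1:
            sleep(delay_seconds)  # give the history time to catch up
    return None
```

The `history_item_id` argument would be the value read from the response header of the text-to-speech call.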



Multilingual Text To Speech

Our new multilingual model for Speech Synthesis now supports 7 additional languages: Spanish, French, Italian, German, Polish, Hindi and Portuguese. Multilingual TTS now also allows for generating speech in multiple languages using a single prompt.

Creator subscribers and above now enjoy higher quality 96 kbps audio outputs.


Professional Voice Cloning

Professional Voice Cloning will be released later this year, allowing users on the Creator, Independent Publisher and Growing Business plans to create a near-perfect digital version of their own voice. While Instant Cloning lets you clone voices from very short samples, Professional Cloning requires more audio data for training but it produces much higher fidelity output - the speech produced by the model is almost indistinguishable from the original.

You can submit your recordings for training right now through VoiceLab. We will sequentially release the Pro Voices to individual accounts on a first-come, first-served basis, beginning in July. Read more in the FAQ.



We are introducing a rewards system for providing feedback. With each generation, the audio player now displays thumbs-up/down icons next to the download button. Users can now earn character rewards for providing feedback on model output.


Introducing Voice Design

We are introducing Voice Design, a random voice generator that can create an infinite number of voices based on your conditions.

Voice settings now appear in history items for API users.


Lowering limit to maximum character generation

We are lowering the maximum character limit to 2,500 characters per generation. This limit is temporary and applies across all tiers; we are working to raise it again for the Creator tier over the next few days.
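
Longer texts can still be generated by splitting them into several requests. The sketch below is one possible approach, not a prescribed method: it splits on sentence boundaries where it can, falls back to a hard cut for any single sentence over the limit, and keeps every chunk within 2,500 characters. The function name is illustrative.

```python
import re

MAX_CHARS = 2500  # per-generation limit stated in this note


def split_for_generation(text: str, limit: int = MAX_CHARS) -> list[str]:
    """Split text into chunks of at most `limit` characters."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: list[str] = []
    current = ""
    for sentence in sentences:
        # A single sentence longer than the limit is cut at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip() if current else sentence
        if len(candidate) <= limit:
            current = candidate       # sentence still fits in this chunk
        else:
            chunks.append(current)    # start a new chunk with this sentence
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Splitting at sentence boundaries keeps each generation self-contained, though intonation may still differ slightly between chunks.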


Speech Synthesis 2.0

With this release, speech synthesis gets even better! We have changed how we train the model, resulting in even better results on longer fragments. You can head to the usual panel to test it out straight away! Our core changes include:
  • Support for cased input, which makes it easier for the model to read names (like OpenAI or ChatGPT) and to construct pauses between fragments or names
  • Longer & better training: the model seems to perform better on our long-form benchmarks and on the loss functions
  • Necessary components to support infilling (contextual changes to fragments)
  • Necessary components to extend the model across languages on the same platform

Expect minor changes in how your cloned or default Voices sound. Enjoy!


Adjust Voice Settings

To give you more control over your voices, we are adding Voice Settings as part of Speech Synthesis, where you can control how similar and varied the voice is!

We are adding two settings that help you control your voices; they have the biggest effect on the voices you clone. The first is Stability, which sets how variable the voice is: more stable and similar between regenerations, or more varied and expressive. Higher variability might result in slight artifacts, while higher stability produces cleaner speech. The second is Similarity: the higher it is, the more the output depends on the voice samples you have uploaded, so if the samples aren't clean, it might result in speech artifacts. If you prefer the voice to be clearer but closer to generic voices, move this slider to the left!