Introducing Studio: create high-quality audiobooks in minutes

Sep 19, 2023 • 6 minutes reading time

A one-stop solution for long-form audio creation

A recording studio with a microphone, headphones, a smartphone, a notebook, and a coffee cup on a wooden desk.

Today, we’re launching Studio, our advanced workflow for generating and editing long-form audio. Studio is the culmination of our research into long-form speech synthesis, audio conditioning and parallelized audio generation. It allows creators, publishers and independent authors to voice entire dialogue segments, news articles, and even AI audiobooks within minutes, all inside a single workflow.

Studio joins Speech Synthesis, VoiceLab and Voice Library as a tool in its own right: a one-stop solution for long-form audio creation. It also comes fully integrated with Professional Voice Cloning, Voice Library, and our multilingual model.

STUDIO

Screenshot of an audiobook editing interface with highlighted text and two book cover images titled "Discover Daily" and "Dune."

Your comprehensive workflow for turning books into audiobooks and scripts into podcasts

We’ve seen unprecedented demand for long-form audio generation from users

Our users faced several challenges prior to this release. Many grappled with stability issues and flow disruptions when generating lengthier content. There was also a noticeable disconnect when text fragments spoken by different speakers needed to be stitched together. Transitions between voices often lacked cohesion, making it difficult to craft a smooth, continuous dialogue. Regenerating entire audio fragments even when only a brief section was flawed proved inconvenient and inefficient. Users were also limited by certain text file formats which needed converting before they could be worked on inside the platform.

Studio now lets you generate an entire AI audiobook at the click of a button. You can breathe life into your narratives by assigning specific text fragments to particular speakers, all while maintaining contextual cohesion. You can also adjust pause lengths between text segments for finer control over pacing. Studio also introduces selective audio regeneration: you can regenerate parts of a larger text fragment without redoing the whole sequence, and the regenerated parts automatically match the cadence and intonation of the surrounding audio. You can save your work and resume it later. Finally, Studio supports .epub, .pdf, and .txt file imports, as well as initializing a project from a URL.

Getting started

Navigating Studio is easy and intuitive.

Select Studio from the top bar menu.
Click Create New Project.
Choose how you’d like to initialize your Project.
Start crafting your text.
Click Convert to render your entire Project at once, or use Play & Regenerate to test specific fragments.

Feature highlights

Studio provides a straightforward editing experience, akin to using Google Docs, with an intuitive interface that supports the following features:

Full conversion: Use a single button to render your entire Project at once, or use Play & Regenerate to test specific fragments.
Speaker assignment: Assign different text fragments to various speakers; choose default voices for headings and paragraphs.
Regenerate audio fragments: Seamlessly regenerate specific segments within larger audio fragments while keeping context intact.
Insert pauses (coming later this week): Manually adjust the length of pauses (up to 3s initially) between speech segments to fine-tune pacing.
Segment by chapter: Structure your text into sections to focus on a particular fragment one at a time.
Save and resume progress: Conveniently pause your work and resume right where you left off.
Import files: Studio supports .epub, .pdf and .txt files, as well as URLs, for a more streamlined workflow.
Intelligent regeneration: When resuming work on an already generated project, you will only be charged for regenerating altered fragments, not the entire project.
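
The idea behind charging only for altered fragments can be illustrated with a minimal sketch. This is not how Studio is implemented; it assumes, purely for illustration, that a project is a list of (text, voice) fragments, that a fragment counts as altered when its text or voice changes, and that cost is counted in characters. All function names are hypothetical.

```python
import hashlib


def fragment_key(text: str, voice: str) -> str:
    """Fingerprint a fragment by its assigned voice and its text (hypothetical scheme)."""
    return hashlib.sha256(f"{voice}\x00{text}".encode("utf-8")).hexdigest()


def fragments_to_regenerate(previous, current):
    """Return indices of fragments in `current` that were not rendered before.

    Both arguments are lists of (text, voice) tuples.
    """
    rendered = {fragment_key(text, voice) for text, voice in previous}
    return [
        i
        for i, (text, voice) in enumerate(current)
        if fragment_key(text, voice) not in rendered
    ]


def billable_characters(previous, current):
    """Count characters only in fragments that need a fresh render.

    Character-based counting is an assumption made for this sketch.
    """
    return sum(len(current[i][0]) for i in fragments_to_regenerate(previous, current))


previous = [("Chapter One", "narrator"), ("Hello there.", "alice")]
current = [
    ("Chapter One", "narrator"),   # unchanged, reused as-is
    ("Hello there!", "alice"),     # edited text, needs a new render
    ("New line.", "bob"),          # new fragment, needs a new render
]
changed = fragments_to_regenerate(previous, current)
cost = billable_characters(previous, current)
```

In this example only the edited and the newly added fragments are selected, so the unchanged heading contributes nothing to the count.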

Compatibility

Studio stands alongside Speech Synthesis, VoiceLab, and Voice Library, serving as a comprehensive solution for long-form audio synthesis. Additionally, it's seamlessly integrated with Professional Voice Cloning, Voice Library, and our multilingual model.

Professional Voice Cloning: Generate long-form audio content in your own voice. You can also share your professional voice clone via Voice Library and earn character rewards when others create projects using your voice.
Voice Library: Choose the right voice for your narrative from the voices created by our community. Select a storyteller voice for romantic tales, epic adventures, or futuristic sci-fi audiobooks. Introduce a range of characters, including Santa Claus, radio DJs, sports announcers, news broadcasters, or customer service agents.
Eleven Multilingual: Whether you choose a pre-made voice, a cloned voice or your own voice, you can have it speak any of the languages supported by our multilingual model.

Studio is available today

With Studio, our goal was to design a tool that makes long-form audio generation as simple as possible. Drawing from fresh research and your feedback, we've developed a comprehensive solution that also integrates seamlessly with our existing ecosystem of tools. We can’t wait to hear you bring your stories to life! Interested in creating your own audiobook? Create an AI narrator with our AI Audiobooks tools.

Update: as of January 2025, Projects is now called Studio and is available to all free users.
