Learn how to translate audio and video while preserving the emotion, timing, and tone of speakers.

Overview

The ElevenLabs dubbing API translates audio and video across 32 languages while preserving the emotion, timing, tone, and unique characteristics of each speaker. Our model separates each speaker’s dialogue from the soundtrack, allowing you to recreate the original delivery in another language. It can be used to:

  • Grow your addressable audience by 4x by reaching international markets
  • Adapt existing material for new markets while preserving emotional nuance
  • Offer content in multiple languages without re-recording voice talent

We also offer a fully managed dubbing service for video and podcast creators.

Usage

ElevenLabs dubbing can be used in two ways:

  • Dubbing Studio in the user interface for fast, interactive control and editing
  • Programmatic integration via our API for large-scale or automated workflows

The UI supports files up to 500MB and 45 minutes. The API supports files up to 1GB and 2.5 hours.
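
As a quick illustration of the limits above, the following sketch checks which interface can accept a given file. The helper name `allowed_interfaces` is hypothetical, and treating 1 MB as 1,000,000 bytes is an assumption, since the limits are stated without specifying decimal or binary units.

```python
from dataclasses import dataclass

# Assumption: decimal units (1 MB = 1,000,000 bytes). The documented limits
# do not say whether MB/GB are decimal or binary.
MB = 1_000_000
GB = 1_000 * MB


@dataclass(frozen=True)
class Limit:
    max_bytes: int
    max_seconds: float


# Limits as stated in this section: UI 500MB / 45 minutes, API 1GB / 2.5 hours.
LIMITS = {
    "ui": Limit(max_bytes=500 * MB, max_seconds=45 * 60),
    "api": Limit(max_bytes=1 * GB, max_seconds=2.5 * 60 * 60),
}


def allowed_interfaces(size_bytes: int, duration_seconds: float) -> list[str]:
    """Return the interfaces ("ui", "api") whose limits the file fits within."""
    return [
        name
        for name, limit in LIMITS.items()
        if size_bytes <= limit.max_bytes and duration_seconds <= limit.max_seconds
    ]
```

For example, a 700MB, one-hour file exceeds the UI limits but fits within the API limits, so it would need to go through the API.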

Key features

Speaker separation
Automatically detect multiple speakers, even with overlapping speech.

Multi-language output
Generate localized tracks in 32 languages.

Preserve original voices
Retain the speaker’s identity and emotional tone.

Keep background audio
Keep music, effects, and ambient sounds in place, with no need to re-mix them.

Customizable transcripts
Manually edit translations and transcripts as needed.

Supported file types
Video and audio can be dubbed from a variety of sources, including YouTube, X, TikTok, Vimeo, direct URLs, and file uploads.
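
When automating a workflow, a remote link and a local upload are typically handled differently. The sketch below classifies an input string as one or the other. The function name `describe_source` and the `source_url` key are illustrative assumptions, not confirmed API field names; check the API reference before relying on them.

```python
from pathlib import Path
from urllib.parse import urlparse


def describe_source(source: str) -> dict:
    """Classify a dubbing input as a remote URL or a local file upload."""
    parsed = urlparse(source)
    # Treat only http(s) links with a host as remote sources
    # (for example a YouTube, Vimeo, or direct media URL).
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return {"kind": "url", "source_url": source}
    # Anything else is assumed to be a path to a local file to upload.
    return {"kind": "file", "path": Path(source)}
```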

Video transcript and translation editing
Our AI video translator lets you manually edit transcripts and translations to ensure your content is properly synced and localized. Adjust the voice settings to tune delivery, and regenerate speech segments until the output sounds just right.

A Creator plan or higher is required to dub audio files. For videos, a watermark option is available to reduce credit usage.

Cost

To reduce credit usage, you can:

  • Dub only a selected portion of your file
  • Use watermarks on video output (not available for audio)
  • Fine-tune transcripts and regenerate individual segments instead of the entire clip
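
For API users, the first two options above could be expressed as request fields. This is a minimal sketch that only builds the form fields and does not send a request. The field names `target_lang`, `start_time`, `end_time`, and `watermark`, and the helper `build_dub_fields`, are assumptions for illustration; confirm the exact names and types in the API reference.

```python
from typing import Optional


def build_dub_fields(
    target_lang: str,
    *,
    is_video: bool,
    start_time: Optional[int] = None,  # seconds from the start of the file
    end_time: Optional[int] = None,    # seconds from the start of the file
    watermark: bool = False,
) -> dict[str, str]:
    """Build hypothetical form fields for a credit-saving dubbing request."""
    # Watermarks are only available for video output, not audio.
    if watermark and not is_video:
        raise ValueError("watermarks apply to video output only")
    if start_time is not None and end_time is not None and end_time <= start_time:
        raise ValueError("end_time must be greater than start_time")

    # Form fields are sent as strings, so booleans and numbers are stringified.
    fields = {
        "target_lang": target_lang,
        "watermark": "true" if watermark else "false",
    }
    if start_time is not None:
        fields["start_time"] = str(start_time)
    if end_time is not None:
        fields["end_time"] = str(end_time)
    return fields
```

Calling `build_dub_fields("es", is_video=True, start_time=30, end_time=90, watermark=True)` yields fields for dubbing only the portion between 0:30 and 1:30 of a video, with a watermark applied.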

Refer to our pricing page for detailed credit costs.

FAQ

What content can be dubbed?
Dubbing can be performed on all types of short- and long-form video and audio content. For a high-quality dub, we recommend dubbing content with a maximum of 9 unique speakers at a time.

Does dubbing preserve the original speaker’s delivery?
Yes. Our models analyze each speaker’s original delivery to recreate the same tone, pace, and style in your target language.

How are overlapping speakers and background audio handled?
We use advanced source separation to isolate individual voices from ambient sound. Multiple overlapping speakers can be split into separate tracks.

What are the file size and duration limits?
Via the user interface, files can be up to 500MB and 45 minutes long. Through the API, you can process files up to 1GB and 2.5 hours long.

How can I fine-tune the output?
You can choose to dub only certain portions of your video or audio, or adjust translations and voices in our interactive Dubbing Studio.
