Dubbing Studio
Fine-grained control over your dubs.
Create a Dubbing Studio project
- Check the ‘Create Dubbing Studio’ box when creating a dub.
- Click ‘Create Dub’. Once the Dubbing Studio project has been created, you can open it.
Core Concepts
Speaker Cards
Speaker cards show the original transcription and translation (if you add one) of dialogue from the source video. You can click ‘Transcribe Audio’ to retranscribe the original speech, or click the arrow to re-translate an existing transcription.
Edit Transcripts and Translations
Both transcriptions and translations can be edited freely - just click inside a speaker card and start typing to edit the text.
Speaker Identification
You can see the name of each speaker in the top left of the speaker card. To change the name of a speaker or reassign a clip to a different speaker, you’ll need to use the Timeline.
Timeline
The timeline contains many of the most important elements of Dubbing Studio. Each is covered in more detail in the sections below.
Basic navigation
There are three main ways to navigate the timeline:
- Click and drag the cursor
- Scroll horizontally
- Enter a specific timecode on the right side of the timeline
Adjust clips and regenerate audio
- Drag the handles on the left or right side of a clip to adjust its length.
- Click the refresh icon to regenerate the audio for that clip.
Dynamic vs. Fixed Generations
NOTE: By default, all regenerations in Dubbing Studio are Fixed Generations, which means that the system keeps the duration of the clip fixed regardless of how much text it contains. Speech can therefore speed up or slow down significantly if you adjust the length of a clip without changing the text, or if you add or remove a large number of words in a clip.
Consider a clip with the phrase ‘I’m doing well.’ If that clip were set to last 10 seconds and the audio were generated using Fixed Generations, the speech would sound slow and drawn out.
Alternatively, you can use Dynamic Generations by right-clicking a clip and selecting Dynamic Generation from the options. This attempts to adjust the length of the clip to fit the length of the text, which makes the audio sound more natural.
But be careful – using Dynamic Generations could affect sync and timing in your videos. If, for example, you select Dynamic Generation for a clip with many words in it, and there is not enough room before the next clip for it to properly expand, the audio may not generate properly.
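The trade-off can be made concrete with some rough arithmetic. The sketch below estimates the speaking rate that a Fixed Generation would need in order to fill a clip of a given length. The function names and the "comfortable" range are illustrative assumptions, not values used by Dubbing Studio.

```python
def speaking_rate(text: str, clip_seconds: float) -> float:
    """Words per second needed if `text` must fill exactly `clip_seconds`."""
    if clip_seconds <= 0:
        raise ValueError("clip_seconds must be positive")
    return len(text.split()) / clip_seconds


# Assumed range for comfortable conversational speech, in words per second.
# This is a rule of thumb chosen for illustration, not a Dubbing Studio value.
COMFORTABLE_RANGE = (1.5, 3.5)


def describe_fixed_generation(text: str, clip_seconds: float) -> str:
    """Classify how a fixed-duration clip is likely to sound."""
    rate = speaking_rate(text, clip_seconds)
    low, high = COMFORTABLE_RANGE
    if rate < low:
        return "slow"
    if rate > high:
        return "fast"
    return "natural"


# Three words stretched over 10 seconds is 0.3 words per second.
print(describe_fixed_generation("I'm doing well.", 10))  # slow
```

A clip that falls well outside the assumed range is a candidate for trimming, for a text edit, or for Dynamic Generation, provided there is room before the next clip.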
Stale Audio
Stale audio is audio that needs to be regenerated because something it depends on has changed (clip length, settings, transcription or translation, etc.). You can regenerate stale clips individually or click ‘Generate Stale Audio’ to regenerate all stale clips in bulk.
Clip History
You can right-click a clip and select ‘Clip History’ to view previous generations and select the one that sounds best.
Split and Merge clips
- To split a clip, move the cursor to a specific timecode and click ‘Split’.
- To merge two clips, drag the ends of the clips together and click ‘Merge.’
As you split and merge clips, the speaker cards above the timeline will update to reflect these changes.
Reassign clips to different speakers
To reassign a clip to a different speaker, click the clip and drag it to another track.
Add additional audio tracks
Use the action buttons at the bottom of the timeline to add new audio tracks.
Voice Settings
Voice Selection
To select the voice that will be used to generate audio on a specific speaker track, click the settings cog icon on the left side of the timeline near the speaker name.
There are 3 main types of voices to choose from in Dubbing Studio:
- Clip clone - this creates a unique voice clone for each clip based on the source audio for that clip
- Track clone - this creates a single voice clone for the whole track based on all source audio for a given speaker
- Other voices - you can also choose from thousands of voices available in our Voice Library, each with detailed metadata and tags to help you choose the right one
You can also create, save, and reuse a voice from a specific clip by right-clicking the clip and selecting ‘Create Voice from Selection’.
Setting Track vs. Clip Level Settings
You can set voice settings at two levels:
- Track Level - changes will apply across all clips in the track, which can help with stability and consistency.
- Clip Level - changes will only apply to a specific clip. To set clip-level settings, use the panel on the right side of the timeline. Disable the ‘inherit track settings’ toggle and configure your desired settings.
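The relationship between the two levels can be pictured as a simple lookup. The sketch below is a mental model only: the `Clip` class, the `effective_settings` function, and the `stability` key are hypothetical names, and it assumes that clip-level settings fully replace track settings once inheritance is disabled, which the product may handle differently.

```python
from dataclasses import dataclass, field


@dataclass
class Clip:
    # Mirrors the 'inherit track settings' toggle; on by default (assumption).
    inherit_track_settings: bool = True
    # Hypothetical clip-level settings, used only when inheritance is off.
    settings: dict = field(default_factory=dict)


def effective_settings(track_settings: dict, clip: Clip) -> dict:
    """Return the settings a clip would be generated with."""
    if clip.inherit_track_settings:
        return dict(track_settings)
    return dict(clip.settings)


track = {"stability": 0.5}  # hypothetical setting name and value
print(effective_settings(track, Clip()))                           # {'stability': 0.5}
print(effective_settings(track, Clip(False, {"stability": 0.9})))  # {'stability': 0.9}
```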
Exports
Click ‘Export’ in the bottom right of Dubbing Studio to open the export menu.
Dubbing Studio currently supports the following export formats:
- AAC (audio)
- MP3 (audio)
- WAV (audio)
- .zip of audio tracks
- .zip of audio clips
- AAF (timeline data)
- SRT (subtitles/captions)
- CSV (speaker, start_time, end_time, transcription, translation)
Make sure you select the correct language when exporting.
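Because the CSV export lists one clip per row with the columns named above, it is straightforward to post-process. The following sketch totals the clip time per speaker. The sample rows are made up, and it assumes times are written in seconds, so a real export may need different parsing.

```python
import csv
import io
from collections import defaultdict

# Made-up sample in the column order listed above.
# Assumption: start_time and end_time are plain seconds.
SAMPLE = """speaker,start_time,end_time,transcription,translation
Speaker 1,0.0,2.5,Hello there.,Hallo.
Speaker 2,2.5,4.0,How are you?,Wie geht es dir?
Speaker 1,4.0,5.5,I'm doing well.,Mir geht es gut.
"""


def seconds_per_speaker(csv_text: str) -> dict:
    """Sum clip durations (end_time - start_time) for each speaker."""
    totals = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["speaker"]] += float(row["end_time"]) - float(row["start_time"])
    return dict(totals)


print(seconds_per_speaker(SAMPLE))  # {'Speaker 1': 4.0, 'Speaker 2': 1.5}
```

To run it against a file on disk, read the file's text and pass it to `seconds_per_speaker`.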
Additional Features
- Voiceover Tracks: Voiceover tracks create new Speakers. You can click anywhere on the timeline to add clips. After creating a clip, write your desired text in the speaker card above. You’ll first need to translate that text, then you can press “Generate”. You can also use our voice changer tool: click the microphone icon on the right side of the screen to use your own voice and then change it into the selected voice.
- SFX Tracks: Add an SFX track, then click anywhere on that track to create an SFX clip. As with our independent SFX feature, write your prompt in the speaker card above and click “Generate” to create your new SFX audio. You can lengthen or shorten SFX clips and move them freely around your timeline to fit your project. Make sure to press the “stale” button if you do so.
- Upload Audio: This option allows you to upload a non-voiced track, such as SFX, music, or a background track. Keep in mind that any voices present in this track won’t be detected, so it will not be possible to translate or correct them.
Manual Dub
In cases where you already have an accurate dubbing script prepared and want to ensure your Dubbing Studio project sticks to your exact clips and speaker assignment, you can use the Manual Dub option during creation.
To create a Manual Dub, you’ll need:
- Video file
- Background audio file
- Foreground audio file
- CSV where each row contains a speaker, start_time, end_time, transcription, and translation field
The CSV file must strictly follow the predefined format in order to be processed correctly. Three timecode formats are supported:
- seconds
- hours:minutes:seconds:frame
- hours:minutes:seconds,milliseconds
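If your script stores times in seconds, the other two notations can be derived with integer arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the zero-padding widths and the frame rate are assumptions that the format description above does not specify. Check the output against the official sample files before uploading.

```python
def to_hms_ms(seconds: float) -> str:
    """Format seconds as hours:minutes:seconds,milliseconds."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_hms_frames(seconds: float, fps: int) -> str:
    """Format seconds as hours:minutes:seconds:frame at an integer frame rate."""
    total_frames = round(seconds * fps)
    hours, rem = divmod(total_frames, 3600 * fps)
    minutes, rem = divmod(rem, 60 * fps)
    secs, frames = divmod(rem, fps)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}:{frames:02d}"


# One hour, one minute, and 1.5 seconds in each notation.
print(to_hms_ms(3661.5))          # 01:01:01,500
print(to_hms_frames(3661.5, 24))  # 01:01:01:12 (24 fps is an assumed rate)
```

The frame-based form only makes sense with the frame rate of the source video, so `fps` is a required argument rather than a default.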