Transcripts
Human-edited transcripts from ElevenLabs Productions
General
Transcripts ordered from Productions are reviewed and corrected by native speakers for maximum accuracy. We offer two types of human transcripts: non-verbatim and verbatim.
- For a more detailed breakdown of non-verbatim vs. verbatim transcription options, please see the Style guides section below.
- For more information about other Productions services, please see the Overview page.
How it works
Order transcript
Transcribing new files
Productions page
The easiest way to order a new transcript from Productions is from the Productions page in your ElevenLabs account.

Speech to Text Order Dialog
You can also select the Human Transcript option in the Speech to Text order dialog.

Starting from an existing transcript
Open an existing transcript and click the Get human review button to create a new Productions order for that transcript.

Export transcript
You will receive an email notification when your transcript is ready and see it marked as ‘Done’ on your Productions page.
Quick export
Open a transcript on your Productions page and click the three dots, then the Export button.
Export from viewer
Open a transcript on your Productions page and click the View icon to open the transcript viewer.
Pricing
All prices are in USD ($) and per minute of source audio.
Prices are subject to change. You will always see the final price for an order during the checkout process.
SLAs / Delivery Time
We aim to deliver all transcripts within 48 hours. If you are an enterprise interested in achieving quicker turnaround times, please contact us at productions@elevenlabs.io.
Style guides
When ordering a Productions transcript, you will see the option to activate ‘Verbatim’ mode for an extra 30% fee. Please read the breakdown below for more information about this option.
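To make the surcharge concrete, here is a minimal sketch of how an order estimate could be computed. The per-minute rate of $2.00 is a hypothetical placeholder, not an actual Productions price. The function name `estimate_order_cost` is also invented for illustration. The sketch assumes the 30% fee applies to the full per-minute price and that minutes are billed as given. The checkout page remains the only source for the final price.

```python
from decimal import Decimal, ROUND_HALF_UP

# The 30% extra fee for Verbatim mode mentioned in this section.
VERBATIM_SURCHARGE = Decimal("0.30")


def estimate_order_cost(minutes, rate_per_minute, verbatim=False):
    """Estimate the USD cost of a transcript order (illustrative only).

    minutes          -- length of the source audio in minutes
    rate_per_minute  -- hypothetical base price per minute, as a string or number
    verbatim         -- whether Verbatim mode is activated
    """
    cost = Decimal(str(minutes)) * Decimal(str(rate_per_minute))
    if verbatim:
        cost *= Decimal("1") + VERBATIM_SURCHARGE
    # Round to whole cents.
    return cost.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Assumed rate of $2.00 per minute for a 10-minute file.
print(estimate_order_cost(10, "2.00"))                 # 20.00
print(estimate_order_cost(10, "2.00", verbatim=True))  # 26.00
```

Using `Decimal` rather than floats keeps the cent rounding exact, which matters when the estimate is compared with an invoice.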

Non-verbatim
Non-verbatim transcription, also called clean or intelligent verbatim, focuses on clarity and readability. Unlike verbatim transcriptions, it removes unnecessary elements like filler words, stutters, and irrelevant sounds while preserving the speaker’s message.
What gets left out in non-verbatim transcripts:
- Filler words and verbal tics like “um,” “like,” “you know,” or “I mean”
- Repetitions, both intentional and unintentional (e.g. stuttering)
- Audio event tags, including non-verbal sounds like [coughing] or [throat clearing] as well as environmental sounds like [dog barking]
- Slang or incorrect grammar (e.g. ‘ain’t’ → ‘is not’)
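The rules above can be illustrated with a rough sketch. This is not how Productions transcripts are made: they are edited by human reviewers, and the function below (`rough_clean`, a hypothetical name) only approximates three of the listed rules with regular expressions. It uses a small assumed subset of fillers and does not attempt the slang or grammar corrections.

```python
import re

# Illustrative subset of the fillers listed above. "like" is left out on
# purpose: it is often a meaningful word and needs human judgment.
FILLERS = ("um", "you know", "i mean")

# Audio event tags such as [coughing] or [dog barking].
TAG = re.compile(r"\[[^\]]*\]")


def rough_clean(text):
    """Approximate a non-verbatim cleanup of a verbatim transcript line."""
    # Drop audio event tags.
    text = TAG.sub("", text)
    # Drop whole-word filler phrases, ignoring case.
    for filler in FILLERS:
        text = re.sub(rf"\b{re.escape(filler)}\b", "", text, flags=re.IGNORECASE)
    # Collapse immediately repeated words ("I I" becomes "I").
    text = re.sub(r"\b(\w+)(\s+\1\b)+", r"\1", text, flags=re.IGNORECASE)
    # Normalize the whitespace left behind by the removals.
    return re.sub(r"\s+", " ", text).strip()


print(rough_clean("Um I I think [coughing] we should um go now."))
# I think we should go now.
```

A script like this cannot handle punctuation around fillers or tell a meaningful repetition from a stutter, which is part of why these decisions are left to human editors.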
Verbatim
In verbatim transcription, the goal is to capture everything that can be heard, meaning:
- All detailed verbal elements: stutters, repetitions, etc.
- All non-verbal elements like human sounds ([cough]) and environmental sounds ([dog barking])
Non-verbatim vs. verbatim
The following table provides a comprehensive breakdown of our non-verbatim vs. verbatim transcription services.
FAQ
What if I'm not happy with the result?
You can leave feedback on a completed transcript by opening it (use the View option in the sidebar) and clicking the Feedback button.
Can I make changes once I receive the final version?
No, not on the platform. You can export a completed transcript and make changes off platform. We plan to add support for editing completed transcripts soon.