Transcripts

Human-edited transcripts from ElevenLabs Productions

General

Transcripts ordered from Productions are reviewed and corrected by native speakers for maximum accuracy. We offer two types of human transcripts:

| Option | When to use it | Description |
| --- | --- | --- |
| Non-verbatim (“clean”) | Podcasts, webinars, marketing, personal use | Removes filler words, stutters, and audio event tags for smoother reading. Focuses on transcribing the core meaning. Most suitable for the majority of use cases. |
| Verbatim | Legal, research | Attempts to capture exactly what is said, including all filler words, stutters, and audio event tags. |

  • For a more detailed breakdown of non-verbatim vs. verbatim transcription options, please see the Style guides section below.
  • For more information about other Productions services, please see the Overview page.

How it works

1. Order transcript

Productions page

The easiest way to order a new transcript from Productions is from the Productions page in your ElevenLabs account.

Speech to Text Order Dialog

You can also select the Human Transcript option in the Speech to Text order dialog.

Alternatively, open an existing transcript and click the Get human review button to create a new Productions order for that transcript.

2. Export transcript

You will receive an email notification when your transcript is ready and see it marked as ‘Done’ on your Productions page.

Open a transcript on your Productions page and click the three dots, then the Export button.

Alternatively, open a transcript on your Productions page and click the View icon to open the transcript viewer, which has its own export menu.

Pricing

All prices are in USD ($) and per minute of source audio.

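| Language | Non-verbatim (per minute) | Verbatim (per minute) |
| --- | --- | --- |
| English | $2.00 | $2.60 |
| French | $3.00 | $3.90 |
| Spanish | $3.00 | $3.90 |
| German | $3.00 | $3.90 |
| Italian | $3.00 | $3.90 |
| Portuguese (Brazil) | $3.00 | $3.90 |
| Hindi | $2.00 | $2.60 |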

Prices are subject to change. You will always see the final price for an order during the checkout process.
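
For budgeting before checkout, the per-minute rates above can be turned into a rough estimate. The sketch below is illustrative only: `RATES_USD_PER_MINUTE` and `estimate_cost` are hypothetical names, not part of any ElevenLabs API, and it assumes partial minutes are billed pro rata, which this page does not state. The price shown at checkout is always the authoritative one.

```python
from decimal import Decimal, ROUND_HALF_UP

# Per-minute rates in USD, copied from the pricing table on this page.
# Each entry is (non-verbatim rate, verbatim rate).
RATES_USD_PER_MINUTE = {
    "English": (Decimal("2.00"), Decimal("2.60")),
    "French": (Decimal("3.00"), Decimal("3.90")),
    "Spanish": (Decimal("3.00"), Decimal("3.90")),
    "German": (Decimal("3.00"), Decimal("3.90")),
    "Italian": (Decimal("3.00"), Decimal("3.90")),
    "Portuguese (Brazil)": (Decimal("3.00"), Decimal("3.90")),
    "Hindi": (Decimal("2.00"), Decimal("2.60")),
}


def estimate_cost(language: str, minutes: float, verbatim: bool = False) -> Decimal:
    """Estimate the price of a human transcript in USD.

    Assumption: source audio is billed pro rata per minute. The actual
    rounding rule is not documented here, so treat the result as a guide.
    """
    non_verbatim_rate, verbatim_rate = RATES_USD_PER_MINUTE[language]
    rate = verbatim_rate if verbatim else non_verbatim_rate
    total = rate * Decimal(str(minutes))
    # Round to whole cents.
    return total.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


if __name__ == "__main__":
    print(estimate_cost("English", 10))                # 10 min, non-verbatim
    print(estimate_cost("French", 45, verbatim=True))  # 45 min, verbatim
```

Decimal arithmetic is used instead of floats so that currency amounts do not pick up binary rounding errors.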

SLAs / Delivery Time

We aim to deliver all transcripts within 48 hours. Enterprise customers who need faster turnaround times can contact us at productions@elevenlabs.io.

Style guides

When ordering a Productions transcript, you will see the option to activate ‘Verbatim’ mode for an extra 30% fee. Please read the breakdown below for more information about this option.

Non-verbatim transcription, also called clean or intelligent verbatim, focuses on clarity and readability. Unlike verbatim transcriptions, it removes unnecessary elements like filler words, stutters, and irrelevant sounds while preserving the speaker’s message.

This is the default option for Productions transcriptions. Unless you explicitly select ‘Verbatim’ mode, we will deliver a non-verbatim transcript.

What gets left out in non-verbatim transcripts:

  • Filler words and verbal tics like “um,” “like,” “you know,” or “I mean”
  • Repetitions, both intentional and unintentional (e.g. stuttering)
  • Audio event tags, including non-verbal sounds like [coughing] or [throat clearing] as well as environmental sounds like [dog barking]
  • Slang or incorrect grammar (e.g. ‘ain’t’ → ‘is not’)

In verbatim transcription, the goal is to capture everything that can be heard, meaning:

  • All detailed verbal elements: stutters, repetitions, etc.
  • All non-verbal elements like human sounds ([cough]) and environmental sounds ([dog barking])

The following table provides a comprehensive breakdown of our non-verbatim vs. verbatim transcription services.

| Feature | Verbatim transcription | Verbatim example | Non-verbatim (clean) transcription | Non-verbatim example |
| --- | --- | --- | --- | --- |
| Filler words | All filler words are included exactly as spoken. | “So, um, I was like, you know, maybe we should wait.” | Filler words like “um,” “like,” “you know” are removed. | “I was thinking maybe we should wait.” |
| Stutters | Stutters and repeated syllables are transcribed with hyphens. | “I-I-I don’t know what to say.” | Stutters are removed for smoother reading. | “I don’t know what to say.” |
| Repetitions | Repeated words are retained even when unintentional. | “She, she, she told me not to come.” | Unintentional repetitions are removed. | “She told me not to come.” |
| False starts | False starts are included using double hyphens. | “I was going to--no, actually--let’s wait.” | False starts are removed unless they show meaningful hesitation. | “Let’s wait.” |
| Interruptions | Speaker interruptions are marked with a single hyphen. | Speaker 1: “Did you see-” Speaker 2: “Yes, I did.” | Interruptions are simplified or smoothed. | Speaker 1: “Did you see it?” Speaker 2: “Yes, I did.” |
| Informal contractions | Informal speech is preserved as spoken. | “She was gonna go, but y’all called.” | Standard grammar is used for clarity, with some exceptions. Refer to your language style guide to see which contractions to keep and which to replace with standard grammar. | “She was going to go, but you all called.” |
| Emphasized words | Elongated pronunciations are reflected with extended spelling. | “That was amaaazing!” | Standard spelling is used. | “That was amazing!” |
| Interjections | Interjections and vocal expressions are included. | “Ugh, this is terrible. Wow, I can’t believe it!” | Only meaningful interjections are retained. | “This is terrible. Wow, I can’t believe it!” |
| Swear words | Swear words are fully transcribed. | “Fuck this, I’m not going.” | Swear words are fully transcribed, unless indicated otherwise. | “Fuck this, I’m not going.” |
| Pronunciation mistakes | Mispronounced words are corrected. | Spoken: “ecsetera”. Transcribed: “etcetera”. | Mispronounced words are corrected here as well. | Spoken: “ecsetera”. Transcribed: “etcetera”. |
| Non-verbal human sounds | Human non-verbal sounds like [laughing], [sighing], [swallowing] are transcribed inline. | “I--[sighs]--don’t know.” | Most non-verbal sounds are excluded unless they impact meaning. | “I don’t know.” |
| Environmental sounds | Environmental sounds are described in square brackets. | [door slams], [birds chirping], [phone buzzes] | Omitted unless essential to meaning. Included if (1) the sound impacts emotion or meaning, or (2) the sound is directly referenced by the speaker. | “What was that noise? [dog barking]” “Hang on, I hear something [door slamming]” |
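
To make a few of the mechanical differences concrete, here is a rough sketch that turns verbatim-style text into something closer to non-verbatim. It is only an illustration: Productions transcripts are edited by people, and `rough_clean` is a hypothetical helper, not how the service works. It handles just three rules from the table (audio event tags, hyphenated stutters, immediately repeated words) and cannot make the judgment calls a human editor makes, such as keeping a sound the speaker refers to or a repetition that is grammatically correct (“had had”).

```python
import re


def rough_clean(text: str) -> str:
    """Approximate three non-verbatim rules; not a substitute for human editing."""
    # Drop bracketed audio event tags such as [coughing] or [dog barking].
    text = re.sub(r"\[[^\]]*\]", "", text)
    # Collapse hyphenated stutters: "I-I-I" -> "I".
    # Caveat: also collapses legitimate forms such as "so-so".
    text = re.sub(r"\b(\w+)(?:-\1\b)+", r"\1", text, flags=re.IGNORECASE)
    # Collapse immediately repeated words: "She, she, she" -> "She".
    # Caveat: also collapses valid repetitions such as "had had".
    text = re.sub(r"\b(\w+)(?:,?\s+\1\b)+", r"\1", text, flags=re.IGNORECASE)
    # Tidy whitespace left behind by the removals.
    text = re.sub(r"\s+([.,!?])", r"\1", text)
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(rough_clean("I-I-I don't know what to say. [coughing]"))
    print(rough_clean("She, she, she told me not to come."))
```

Filler words are deliberately left alone here, because words like “like” are often meaningful and removing them safely needs context.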

FAQ

You can leave feedback on a completed transcript by opening it (use the View option in the sidebar) and clicking the Feedback button.

No, transcripts cannot currently be edited on the platform. You can export a completed transcript and make changes off platform. We plan to add support for in-platform editing soon.