
Scribe v2 just got an upgrade
Highest accuracy STT for bulk applications. Detect emphasis & sound effects, and guide transcription with keyterm prompting.
Uh, hi! So, um, I was wondering if you wanted to meet up for coffee? Maybe tomorrow morning? [nervous laugh] Totally fine if not!
Create captions, subtitles, and editable transcripts for podcasts, videos, interviews, and other recorded content – all with industry-leading accuracy via API.
Scribe v2 achieves industry-leading transcription accuracy, delivering clean, editable text even in challenging audio conditions or across diverse accents.
Transcription that works in noisy environments, with background music, strong accents, and low-quality audio.
The ElevenLabs Transcription API can detect laughter, emotion, and sound effects. Use keyterm prompting to guide transcription with domain-specific terms.

Capture non-speech events like laughter, applause, music, and background noise. Transcripts include the full context of your audio, not just the words.
Automatically identify and label up to 48 speakers. Clear attribution of who said what, organized into readable transcripts.
Automatically identify and tag 56 entity types including names, dates, locations, and organizations within your transcripts.
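As a rough illustration of how speaker labels and audio events could be turned into a readable transcript, here is a minimal sketch. It assumes each transcript token carries a text, a type ("word", "spacing", or "audio_event"), and an optional speaker label; the TranscriptToken and SpeakerTurn types, the field names, and the sample data are hypothetical and should be checked against the actual API response before use.

```typescript
// Hypothetical, simplified shape of one transcript token. The field names
// (text, type, speakerId) are assumptions, not confirmed SDK types.
type TranscriptToken = {
  text: string;
  type: "word" | "spacing" | "audio_event";
  speakerId?: string;
};

type SpeakerTurn = { speaker: string; text: string; events: string[] };

// Merge consecutive tokens from the same speaker into one turn and
// collect audio events separately from the spoken text.
function toSpeakerTurns(tokens: TranscriptToken[]): SpeakerTurn[] {
  const turns: SpeakerTurn[] = [];
  for (const token of tokens) {
    const speaker = token.speakerId ?? "unknown";
    let current = turns[turns.length - 1];
    if (!current || current.speaker !== speaker) {
      current = { speaker, text: "", events: [] };
      turns.push(current);
    }
    if (token.type === "audio_event") {
      current.events.push(token.text);
    } else {
      current.text += token.text;
    }
  }
  // Trim leading/trailing spacing left over at turn boundaries.
  return turns.map((t) => ({ ...t, text: t.text.trim() }));
}

// Hand-written sample data for illustration, not real API output.
const sampleTokens: TranscriptToken[] = [
  { text: "Maybe", type: "word", speakerId: "speaker_0" },
  { text: " ", type: "spacing", speakerId: "speaker_0" },
  { text: "tomorrow", type: "word", speakerId: "speaker_0" },
  { text: " ", type: "spacing", speakerId: "speaker_0" },
  { text: "morning?", type: "word", speakerId: "speaker_0" },
  { text: "[nervous laugh]", type: "audio_event", speakerId: "speaker_0" },
  { text: "Sure!", type: "word", speakerId: "speaker_1" },
];

const turns = toSpeakerTurns(sampleTokens);
for (const turn of turns) {
  console.log(`${turn.speaker}: ${turn.text}`, turn.events);
}
```

Keeping audio events in their own array, rather than inline with the text, makes it easy to render them differently in captions or drop them entirely for a clean read.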

Highest accuracy, designed for batch workloads.

Lowest latency, for realtime workloads.
Delivering exceptional accuracy across accents, dialects, and recording conditions.
Change the languageCode to preview other languages.

  // Run as an ES module: the snippet uses top-level await.
  import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

  const elevenlabs = new ElevenLabsClient({
    apiKey: "<your_api_key>"
  });

  // Download the sample audio and wrap it in a Blob for upload.
  const response = await fetch(
    "https://storage.googleapis.com/eleven-public-cdn/audio/marketing/nicole.mp3"
  );
  const audioBlob = new Blob([await response.arrayBuffer()], { type: "audio/mp3" });

  const transcription = await elevenlabs.speechToText.convert({
    file: audioBlob,
    modelId: "scribe_v2",
    tagAudioEvents: true, // Label non-speech events such as laughter
    languageCode: "eng", // Example value; set this to the language of your audio
    diarize: true // Annotate who is speaking
  });

  console.log(transcription);

“From dubbing Reels in local languages, to generating music and character voices in Horizon, ElevenLabs platform enables global creators, businesses, and enterprises to build with voice, music, and sound at scale.”
“Scribe’s unmatched accuracy across so many languages lets Fieldy understand every daily conversation and easily scale across continents. Fieldy has increased user retention by 50% after moving to ElevenLabs Scribe.”
“ElevenLabs made it easy for us to quickly bring powerful text-to-speech capabilities to our SDK, allowing Agents to respond in real time with expressive voices to user questions or as feedback to what it’s seeing.”

“Twilio has integrated ElevenLabs’ generative AI voice technology into its CPaaS, enhancing ConversationRelay. This integration allows businesses and developers to create conversational AI voice interactions that sound human, feel expressive, and respond in real time directly from the Twilio CPaaS platform. We at ElevenLabs are excited that Twilio has chosen ElevenLabs to enhance ConversationRelay with the most expressive, human sounding voices available.”









