SPEECH TO TEXT

Transcribe speech to text with the world’s most accurate ASR model

Achieve industry-leading transcription accuracy in 99 languages with Scribe, featuring character-level timestamps, speaker diarization, and audio-event tagging—all delivered in a structured API response for seamless integration


Every word, perfectly captured

Scribe listens to every nuance and captures each word with high precision. It transcribes audio in 99 languages and returns structured results, including character-level timestamps, speaker diarization, and audio-event tagging, that are ready to integrate

Powerful Audio to Text features for your app

Transform your audio into text with Scribe, the world's most advanced ASR (automatic speech recognition) model, available through a simple speech-to-text API

Industry-leading accuracy

Scribe delivers the industry's lowest word error rate (WER), the share of words a transcript gets wrong, for highly accurate transcription
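Word error rate is conventionally computed as the word-level edit distance between a reference transcript and the model's output, divided by the number of reference words. The sketch below illustrates that standard definition only. The function name `word_error_rate`, the lowercasing, and the whitespace tokenization are assumptions made for illustration, and it does not reproduce how Scribe's published benchmark scores were normalized or calculated.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference word count."""
    # Assumption: simple lowercasing and whitespace tokenization.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from output
                curr[j - 1] + 1,     # insertion: extra word in output
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over lazy dog"
print(f"WER: {word_error_rate(reference, hypothesis):.3f}")
```

In this example one word is substituted and one is dropped, so the result is 2 errors over 9 reference words.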

Smart speaker diarization

In any conversation, even the busiest ones, Scribe intuitively distinguishes and labels every speaker for clear, organized transcripts
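To show how speaker labels can turn into an organized transcript, here is a minimal sketch that merges consecutive words from the same speaker into turns. The input shape is an assumption for illustration: a list of word entries, each with a `text` and a `speaker_id` field. The sample data and the `speaker_turns` helper are hypothetical, so check the API reference for the actual response fields.

```python
from itertools import groupby

# Hypothetical diarized output: one entry per word, tagged with a speaker label.
words = [
    {"text": "Hi,", "speaker_id": "speaker_0"},
    {"text": "thanks", "speaker_id": "speaker_0"},
    {"text": "for", "speaker_id": "speaker_0"},
    {"text": "joining.", "speaker_id": "speaker_0"},
    {"text": "Happy", "speaker_id": "speaker_1"},
    {"text": "to", "speaker_id": "speaker_1"},
    {"text": "be", "speaker_id": "speaker_1"},
    {"text": "here.", "speaker_id": "speaker_1"},
    {"text": "Great.", "speaker_id": "speaker_0"},
]


def speaker_turns(words):
    """Merge consecutive words from the same speaker into (speaker, text) turns."""
    turns = []
    # groupby only groups adjacent items, which is what a turn needs:
    # a speaker who talks again later starts a new turn.
    for speaker, group in groupby(words, key=lambda w: w["speaker_id"]):
        turns.append((speaker, " ".join(w["text"] for w in group)))
    return turns


for speaker, text in speaker_turns(words):
    print(f"{speaker}: {text}")
```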

Precise word-level timestamps

Capture the exact moment each word is spoken. Scribe’s detailed timestamps enable seamless subtitle syncing and interactive audio experiences
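As a sketch of the subtitle use case, the example below converts timed words into SubRip (SRT) cues. It assumes each word entry carries `text`, `start`, and `end` fields with times in seconds. That shape, the sample data, and the helper names `srt_time` and `words_to_srt` are illustrative assumptions rather than the documented response format.

```python
def srt_time(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=4):
    """Group timed words into numbered SRT cues of at most max_words words."""
    cues = []
    for n, i in enumerate(range(0, len(words), max_words), start=1):
        chunk = words[i:i + max_words]
        # A cue spans from the first word's start to the last word's end.
        start, end = chunk[0]["start"], chunk[-1]["end"]
        text = " ".join(w["text"] for w in chunk)
        cues.append(f"{n}\n{srt_time(start)} --> {srt_time(end)}\n{text}\n")
    return "\n".join(cues)


# Hypothetical word-level timestamps, in seconds.
words = [
    {"text": "Every", "start": 0.00, "end": 0.32},
    {"text": "word,", "start": 0.35, "end": 0.70},
    {"text": "perfectly", "start": 0.74, "end": 1.30},
    {"text": "captured", "start": 1.33, "end": 1.90},
    {"text": "by", "start": 2.10, "end": 2.25},
    {"text": "Scribe", "start": 2.28, "end": 2.80},
]
print(words_to_srt(words))
```

Splitting on a fixed word count keeps the sketch short. A production subtitle pipeline would more likely break cues at pauses or punctuation.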

Dynamic audio tagging

From laughter to footsteps, Scribe’s transcription model tags every sound event, enriching your transcripts with the full context of your audio
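One way to use tagged sound events is to render them inline, or to strip them when only speech is wanted. The sketch below assumes each transcript token has a `text` field and a `type` field that is either `word` or `audio_event`. Those field names and values, the sample tokens, and the `render_transcript` helper are assumptions for illustration.

```python
# Hypothetical transcript tokens mixing spoken words and tagged sound events.
tokens = [
    {"text": "Welcome", "type": "word"},
    {"text": "back", "type": "word"},
    {"text": "laughter", "type": "audio_event"},
    {"text": "everyone", "type": "word"},
    {"text": "footsteps", "type": "audio_event"},
]


def render_transcript(tokens, include_events=True):
    """Join tokens into one line, showing audio events in parentheses."""
    parts = []
    for token in tokens:
        if token["type"] == "audio_event":
            if include_events:
                parts.append(f"({token['text']})")
        else:
            parts.append(token["text"])
    return " ".join(parts)


print(render_transcript(tokens))
print(render_transcript(tokens, include_events=False))
```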

Global language support

Break language barriers with support for 99 languages—Scribe unlocks AI transcription capabilities for languages previously out of reach

Developers

Integrate ElevenLabs Scribe

Seamlessly integrate the world’s most accurate speech to text model into your application. Get started with our developer-friendly examples that showcase features like diarization, character-level timestamps, and audio-event tagging

FLEURS Benchmark Performance

Scribe's performance is state of the art on the FLEURS benchmark

Common Voice Benchmark Performance

Scribe's performance is state of the art on the Common Voice benchmark

Benchmarks

The world's most accurate ASR model, supporting 99 languages

Benchmarked against other ASR models, Scribe delivers over 98% transcription accuracy in major languages while dramatically reducing errors in traditionally underserved ones—such as Serbian, Cantonese and Malayalam


