Free Catalan Speech to Text Transcription

Free Catalan speech to text using our advanced AI transcription tool, Scribe. Transcribe Catalan voice, audio, and speech with industry-leading accuracy—Scribe outperforms Google Gemini and OpenAI Whisper, delivering a word error rate of just 3.1% on the FLEURS benchmark and 5.5% on Common Voice. Get accurate Catalan transcriptions for films, podcasts, business meetings, medical dictation, and more.

Every word, perfectly captured

Scribe listens to every nuance, capturing each Catalan word with unmatched precision. Delivering audio transcription in 99 languages—with character-level timestamps, speaker diarization, and audio-event tagging—it returns structured results for seamless integration

Catalan Transcription Benchmark

Model: word error rate (WER) on FLEURS
Scribe v1: 2.5%
Gemini Flash 2: 3.8%
Whisper Large v3: 6.2%
Deepgram Nova 2: 6.3%
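
For readers unfamiliar with the metric, word error rate is the number of word substitutions, deletions, and insertions needed to turn the transcript into the reference text, divided by the number of reference words. The sketch below is a minimal illustration of that definition, not the scoring script behind the figures above. Benchmark pipelines usually also normalize case and punctuation before scoring, which this example leaves out, and the function name is an assumption made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER as a fraction: word-level edit distance / reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# One dropped word out of four reference words gives a WER of 25%.
print(word_error_rate("bon dia a tothom", "bon dia tothom"))
```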

Powerful Catalan Audio to Text features for your app

Transform your Catalan audio into flawless text with Scribe, the world's most advanced ASR (automatic speech recognition) model with the simplest speech to text API integration

Industry-leading accuracy

Achieve precision like never before—Scribe delivers the industry's lowest word error rate for perfectly accurate Catalan transcription

Smart speaker diarization

In any conversation, even the busiest ones, Scribe intuitively distinguishes and labels every speaker for clear, organized transcripts
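
As a rough illustration of how speaker labels can be turned into a readable transcript, the sketch below merges consecutive words from the same speaker into turns. The input shape (a list of dictionaries with `text` and `speaker_id` keys) is an assumption made for this example and may not match the actual response format.

```python
from itertools import groupby


def speaker_turns(words):
    """Merge consecutive words that share a speaker label into (speaker, text) turns.

    `words` is a hypothetical list of dicts with "text" and "speaker_id" keys.
    """
    turns = []
    for speaker, group in groupby(words, key=lambda word: word["speaker_id"]):
        turns.append((speaker, " ".join(word["text"] for word in group)))
    return turns


# Hypothetical diarized words, for illustration only.
sample = [
    {"text": "Bon", "speaker_id": "speaker_0"},
    {"text": "dia.", "speaker_id": "speaker_0"},
    {"text": "Hola!", "speaker_id": "speaker_1"},
    {"text": "Comencem?", "speaker_id": "speaker_0"},
]
for speaker, text in speaker_turns(sample):
    print(f"{speaker}: {text}")
```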

Precise word-level timestamps

Capture the exact moment each word is spoken. Scribe's detailed timestamps enable seamless subtitle syncing and interactive audio experiences
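
To make the subtitle use case concrete, here is a minimal sketch that converts word-level timestamps into SubRip (SRT) cues. The word structure (`text`, `start`, and `end` in seconds) and the fixed number of words per cue are assumptions chosen for illustration. A production subtitler would also split on pauses, punctuation, and line length.

```python
def format_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    millis = round(seconds * 1000)
    hours, millis = divmod(millis, 3_600_000)
    minutes, millis = divmod(millis, 60_000)
    secs, millis = divmod(millis, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{millis:03d}"


def words_to_srt(words, max_words=7):
    """Group timestamped words into numbered SRT cues of at most `max_words` words.

    `words` is a hypothetical list of dicts with "text", "start", and "end" keys.
    """
    cues = []
    for index in range(0, len(words), max_words):
        chunk = words[index:index + max_words]
        start = format_timestamp(chunk[0]["start"])
        end = format_timestamp(chunk[-1]["end"])
        text = " ".join(word["text"] for word in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}")
    return "\n\n".join(cues) + "\n"


# Hypothetical word timings, for illustration only.
sample_words = [
    {"text": "Bon", "start": 0.0, "end": 0.4},
    {"text": "dia", "start": 0.45, "end": 0.9},
]
print(words_to_srt(sample_words))
```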

"Is that funny? (laughter)"

Dynamic audio tagging

From laughter to footsteps, Scribe's transcription model tags every sound event, enriching your Catalan transcripts with the full context of your audio

Global language support

Break language barriers with support for Catalan and 98 other languages—Scribe unlocks AI transcription capabilities for languages previously out of reach

Language Overview

Catalan Language Information

Speakers: 10 million
Accents: Central (Barcelona), Valencian, Balearic, Northwestern
Official language in: Andorra (sole official), Spain (co-official in Catalonia, Valencia, and Balearic Islands)
Spoken in: Northeastern Spain (Catalonia, Valencia, Balearic Islands), Andorra, and parts of France and Italy

A Romance language with features of both Iberian and Gallo-Romance languages. Known for its unique phonology and significant literary tradition dating back to the Middle Ages.

Developers

Integrate ElevenLabs Scribe

Seamlessly integrate the world's most accurate speech to text model for Catalan into your application. Get started with our developer-friendly examples that showcase features like diarization, character-level timestamps, and audio-event tagging for flawless transcriptions

Frequently asked questions

Excellent Accuracy (≤ 5% Word Error Rate - WER)
Bulgarian, Catalan, Czech, Danish, Dutch, English, Finnish, French, Galician, German, Greek, Hindi, Indonesian, Italian, Japanese, Kannada, Malay, Malayalam, Macedonian, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Spanish, Swedish, Turkish, Ukrainian, Vietnamese

High Accuracy (>5% to ≤10% WER)
Bengali, Belarusian, Bosnian, Cantonese, Estonian, Filipino, Gujarati, Hungarian, Kazakh, Latvian, Lithuanian, Mandarin, Marathi, Nepali, Odia, Persian, Slovenian, Tamil, Telugu

Good (>10% to ≤25% WER)
Afrikaans, Arabic, Armenian, Assamese, Asturian, Azerbaijani, Burmese, Cebuano, Croatian, Georgian, Hausa, Hebrew, Icelandic, Javanese, Kabuverdianu, Korean, Kyrgyz, Lingala, Maltese, Mongolian, Māori, Occitan, Punjabi, Sindhi, Swahili, Tajik, Thai, Urdu, Uzbek, Welsh

Moderate (>25% to ≤50% WER)
Amharic, Chichewa, Fulah, Ganda, Igbo, Irish, Khmer, Kurdish, Lao, Luxembourgish, Luo, Northern Sotho, Pashto, Shona, Somali, Umbundu, Wolof, Xhosa, Zulu
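
The tier boundaries above amount to a simple threshold lookup. The sketch below is one way to express them in code. The function name and the label returned for values above 50% are assumptions, since the tiers listed here stop at 50% WER.

```python
def accuracy_tier(wer_percent: float) -> str:
    """Map a word error rate in percent to the accuracy tier names listed above."""
    if wer_percent < 0:
        raise ValueError("WER cannot be negative")
    if wer_percent <= 5:
        return "Excellent"
    if wer_percent <= 10:
        return "High"
    if wer_percent <= 25:
        return "Good"
    if wer_percent <= 50:
        return "Moderate"
    # No tier is listed above 50% WER, so this label is a placeholder.
    return "Unclassified"


# Scribe v1's 2.5% WER on FLEURS for Catalan falls in the first tier.
print(accuracy_tier(2.5))
```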

Speech to text is a technology that transcribes spoken Catalan into written text using automatic speech recognition (ASR). It processes audio signals, identifies speech patterns, and transcribes them into text with high accuracy. ElevenLabs' AI-powered speech to text software is designed to transcribe audio and video content with human-like precision, making it ideal for voice-to-text conversion, audio transcription, and real-time speech recognition.

Speech to text technology is used in:
✔ Audio-to-text transcription for podcasts, meetings, and interviews.
✔ Captions and subtitles in video content.
✔ Voice-to-text software for hands-free typing and accessibility tools.

ElevenLabs ASR offers fast, reliable, and highly accurate speech to text conversion for multiple languages and accents.

ElevenLabs provides video transcription to transcribe spoken Catalan dialogue into text format, making it easy to create subtitles, captions, and searchable transcripts.

Steps to transcribe video to text:
1. Upload your video file to ElevenLabs ASR.
2. Speech recognition technology processes the audio.
3. A transcript is generated automatically, with timestamps.
4. Download the text file or export subtitles for editing.

This AI-powered video transcription model helps content creators, businesses, and educators quickly transcribe video speech into accurate text for accessibility and content repurposing.

Scribe currently works well for use-cases where the input audio is available upfront. A low-latency, real-time version will be released soon.

Scribe costs $0.40 per hour of transcribed audio, and the rate falls well below this at scale with Enterprise plans.
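
As a quick way to budget at the listed rate, the sketch below estimates cost from audio duration. It assumes simple pro-rata pricing rounded to the nearest cent. Actual billing granularity and rounding are not stated here, so treat the result as an estimate. The function name is hypothetical.

```python
from decimal import Decimal


def transcription_cost(duration_seconds, rate_per_hour=Decimal("0.40")):
    """Estimate cost in dollars for audio of the given length at a flat hourly rate."""
    hours = Decimal(duration_seconds) / Decimal(3600)
    return (hours * rate_per_hour).quantize(Decimal("0.01"))


# A 90-minute recording at $0.40 per hour.
print(transcription_cost(90 * 60))
```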