Forced Alignment
Learn how to turn spoken audio and text into a time-aligned transcript with ElevenLabs.
The ElevenLabs Forced Alignment API turns spoken audio and text into a time-aligned transcript. This is useful when you have an audio recording and its transcript but need exact timestamps for each word or phrase in the transcript. This can be used for:
The Forced Alignment API can be used by interfacing with the ElevenLabs API directly.
Learn how to integrate Forced Alignment into your application.
Full API reference for the Forced Alignment endpoint.
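As a rough illustration of interfacing with the API directly, the sketch below uploads an audio file together with its transcript and prints a start and end time for each word. None of the specifics come from this page: the endpoint path, the `xi-api-key` header, the `file` and `text` form fields, and the `words` / `start` / `end` response keys are all assumptions to confirm against the API reference. The helper names `align` and `word_timings` are hypothetical.

```python
from typing import List

# Assumed endpoint; confirm the exact path in the API reference.
API_URL = "https://api.elevenlabs.io/v1/forced-alignment"


def align(api_key: str, audio_path: str, transcript: str) -> dict:
    """Upload audio plus its transcript and return the parsed JSON response."""
    # Third-party dependency (pip install requests), imported lazily so that
    # word_timings() below can be used without it.
    import requests

    with open(audio_path, "rb") as audio:
        response = requests.post(
            API_URL,
            headers={"xi-api-key": api_key},  # assumed auth header name
            files={"file": audio},            # assumed form field for the audio
            data={"text": transcript},        # assumed form field for the text
            timeout=120,
        )
    response.raise_for_status()
    return response.json()


def word_timings(result: dict) -> List[str]:
    """Format each aligned word as 'start-end text', with times in seconds.

    Assumes the response holds a "words" list whose items carry
    "text", "start", and "end" keys.
    """
    return [
        f"{w['start']:.2f}-{w['end']:.2f} {w['text']}"
        for w in result.get("words", [])
    ]


if __name__ == "__main__":
    import os

    # Hypothetical file name and transcript, for illustration only.
    result = align(os.environ["ELEVENLABS_API_KEY"], "speech.mp3", "Hello world")
    for line in word_timings(result):
        print(line)
```

Keeping the formatting step separate from the network call means the timestamp handling can be checked against a saved response without spending API credits.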
Our multilingual v2 models support 29 languages:
English (USA, UK, Australia, Canada), Japanese, Chinese, German, Hindi, French (France, Canada), Korean, Portuguese (Brazil, Portugal), Italian, Spanish (Spain, Mexico), Indonesian, Dutch, Turkish, Filipino, Polish, Swedish, Bulgarian, Romanian, Arabic (Saudi Arabia, UAE), Czech, Greek, Finnish, Croatian, Malay, Slovak, Danish, Tamil, Ukrainian & Russian.