Client-side streaming

This guide shows you how to transcribe audio in real time on the client side using ElevenLabs.

How-to guide · Assumes you have completed the Speech to Text quickstart.

Overview

The ElevenLabs Realtime Speech to Text API transcribes audio streams in real time with ultra-low latency using the Scribe v2 Realtime model. Whether you're building voice assistants, transcription services, or any application that needs live speech recognition, this WebSocket-based API delivers partial transcripts as you speak and committed transcripts when speech segments are complete.

Scribe v2 Realtime can be implemented on the client side to transcribe audio in real time, either from the microphone or by manually chunking the audio.

The client-side implementation differs from the server-side one in two ways:

  • It requires a single-use token: a temporary token that can be used to connect to the API without exposing your API key.
  • Audio from the microphone can be piped directly to the API for transcription, without manually chunking the audio.

For streaming audio from a URL, see the Server-side streaming guide.
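
To make the distinction between partial and committed transcripts concrete, here is a minimal sketch of how a client might fold the two kinds of update into display state. The `TranscriptEvent` and `TranscriptState` shapes and the function names are hypothetical and exist only for illustration; the real payloads are defined in the API reference. The sketch also assumes that each partial replaces the previous one and that a commit clears the live partial.

```typescript
// Hypothetical event and state shapes for illustration only; the real
// payloads are defined in the API reference and may carry more fields.
type TranscriptEvent =
  | { type: "partial"; text: string }
  | { type: "committed"; text: string };

interface TranscriptState {
  committed: string[]; // finalized segments, in arrival order
  partial: string; // latest in-progress text, replaced on every update
}

const initialState: TranscriptState = { committed: [], partial: "" };

// Fold one event into the state without mutating the previous value.
function applyEvent(
  state: TranscriptState,
  event: TranscriptEvent
): TranscriptState {
  switch (event.type) {
    case "partial":
      // Assumption: a new partial supersedes the previous partial.
      return { ...state, partial: event.text };
    case "committed":
      // Assumption: a commit finalizes the segment and clears the partial.
      return { committed: [...state.committed, event.text], partial: "" };
  }
}

// Text to display: committed segments followed by the live partial, if any.
function renderTranscript(state: TranscriptState): string {
  return [...state.committed, state.partial]
    .filter((segment) => segment.length > 0)
    .join(" ");
}
```

Keeping committed text separate from the live partial means the interface can style in-progress words differently and never has to rewrite text that is already final.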

Quickstart

This guide assumes you have set up your API key. Complete the quickstart first if you haven't.

1. Install the SDK

    $ npm install @elevenlabs/react @elevenlabs/elevenlabs-js

2. Create a token

To use the client-side SDK, you need a single-use token: a temporary token that lets the client connect to the API without exposing your API key. Create it on the server side via the ElevenLabs API.

Never expose your API key to the client.

    // Node.js server
    // Assumes an Express-style `app` and your own `yourAuthMiddleware` are defined elsewhere.
    import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

    const elevenlabs = new ElevenLabsClient({
      apiKey: process.env.ELEVENLABS_API_KEY,
    });

    // Issue a single-use token to authenticated clients only.
    app.get("/scribe-token", yourAuthMiddleware, async (req, res) => {
      const token = await elevenlabs.tokens.singleUse.create("realtime_scribe");

      res.json(token);
    });

A single-use token automatically expires after 15 minutes.
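
On the client, the token has to be requested from that endpoint shortly before connecting, since it expires. The following is a minimal sketch of such a helper, not part of the SDK. The names `requestScribeToken`, `parseTokenResponse`, and `FetchLike` are hypothetical, and the sketch assumes the endpoint responds with JSON shaped like `{ "token": "..." }`; adjust the parsing if your server returns something else.

```typescript
// Minimal fetch-like signature so the helper can be exercised without a browser.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Assumption: the endpoint responds with JSON shaped like { "token": "..." }.
function parseTokenResponse(body: unknown): string {
  if (typeof body === "object" && body !== null && "token" in body) {
    const token = (body as { token: unknown }).token;
    if (typeof token === "string" && token.length > 0) {
      return token;
    }
  }
  throw new Error("Token response did not contain a token string");
}

// Hypothetical helper: request a fresh single-use token from your own server.
async function requestScribeToken(
  fetchImpl: FetchLike,
  url: string = "/scribe-token"
): Promise<string> {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Token request failed with status ${response.status}`);
  }
  return parseTokenResponse(await response.json());
}
```

In a browser, a `fetchTokenFromServer` function could wrap this as `() => requestScribeToken((url) => fetch(url))`. Passing the fetch function in, rather than calling the global directly, keeps the helper easy to test with a stub.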

3. Start the transcription session

Transcription can be done either from the microphone or by manually chunking your own audio. Your own audio can be a file or a stream.

For a full list of parameters and options the API supports, please refer to the API reference.

Microphone

    import { useScribe } from "@elevenlabs/react";

    function MyComponent() {
      const scribe = useScribe({
        modelId: "scribe_v2_realtime",
        onPartialTranscript: (data) => {
          console.log("Partial:", data.text);
        },
        onCommittedTranscript: (data) => {
          console.log("Committed:", data.text);
        },
        onCommittedTranscriptWithTimestamps: (data) => {
          console.log("Committed with timestamps:", data.text);
          console.log("Timestamps:", data.words);
        },
      });

      const handleStart = async () => {
        // Fetch a single-use token from the server
        const token = await fetchTokenFromServer();

        await scribe.connect({
          token,
          microphone: {
            echoCancellation: true,
            noiseSuppression: true,
          },
        });
      };

      return (
        <div>
          <button onClick={handleStart} disabled={scribe.isConnected}>
            Start Recording
          </button>
          <button onClick={scribe.disconnect} disabled={!scribe.isConnected}>
            Stop
          </button>

          {scribe.partialTranscript && <p>Live: {scribe.partialTranscript}</p>}

          <div>
            {scribe.committedTranscripts.map((t) => (
              <p key={t.id}>{t.text}</p>
            ))}
          </div>
        </div>
      );
    }
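
For the manual chunking path mentioned at the start of this step, the audio first has to be split into pieces before it is sent. The sketch below shows only that splitting step, under loud assumptions: the input is 16-bit mono PCM at 16 kHz, and `chunkPcm` and `SAMPLE_RATE_HZ` are hypothetical names, not SDK exports. How each chunk is handed to the SDK, and which formats and chunk sizes the API accepts, is not covered here; see the API reference for that.

```typescript
// Assumed input format for illustration: 16-bit mono PCM at 16 kHz.
// Check the API reference for the formats the API actually accepts.
const SAMPLE_RATE_HZ = 16000;

// Split PCM samples into chunks of at most `chunkMs` milliseconds each.
// The final chunk may be shorter than the others.
function chunkPcm(
  samples: Int16Array,
  chunkMs: number,
  sampleRateHz: number = SAMPLE_RATE_HZ
): Int16Array[] {
  const samplesPerChunk = Math.floor((sampleRateHz * chunkMs) / 1000);
  if (samplesPerChunk <= 0) {
    throw new RangeError("chunkMs is too small for this sample rate");
  }

  const chunks: Int16Array[] = [];
  for (let start = 0; start < samples.length; start += samplesPerChunk) {
    // subarray returns a view on the same buffer, so no audio is copied.
    chunks.push(samples.subarray(start, start + samplesPerChunk));
  }
  return chunks;
}
```

Under these assumptions, one second of audio with `chunkMs` set to 250 yields four chunks of 4,000 samples each. Smaller chunks generally mean text appears sooner at the cost of more messages, so the right size depends on your latency needs.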

Next steps

Server-side streaming

Transcribe audio streams on the server side with the same WebSocket API.

Transcripts and commit strategies

Control when transcripts are committed and how to handle partial results.