JavaScript SDK

Scribe: real-time speech-to-text transcription in JavaScript

For an overview of Scribe and its capabilities, see the Speech to Text overview. For step-by-step usage guides, see Client-side streaming.

Installation

npm install @elevenlabs/client
# or
yarn add @elevenlabs/client
# or
pnpm install @elevenlabs/client

Use the ElevenLabs speech-to-text skill to transcribe audio from your AI coding assistant:

npx skills add elevenlabs/skills --skill speech-to-text

This library can be used in any JavaScript-based project. If you are using React, consider the useScribe hook which provides built-in state management and lifecycle handling.

Usage

Here is a minimal working example that connects to Scribe and logs transcription results:

import { Scribe, RealtimeEvents } from '@elevenlabs/client';

const token = await fetchTokenFromServer();

const connection = Scribe.connect({
  token,
  modelId: 'scribe_v2_realtime',
  microphone: {
    echoCancellation: true,
    noiseSuppression: true,
  },
});

connection.on(RealtimeEvents.PARTIAL_TRANSCRIPT, (data) => {
  console.log('Partial:', data.text);
});

connection.on(RealtimeEvents.COMMITTED_TRANSCRIPT, (data) => {
  console.log('Committed:', data.text);
});

// Later, close the connection
connection.close();

Getting a token

Scribe requires a single-use token for authentication. Create an API endpoint on your server:

// Node.js server
app.get('/scribe-token', yourAuthMiddleware, async (req, res) => {
  const response = await fetch('https://api.elevenlabs.io/v1/single-use-token/realtime_scribe', {
    method: 'POST',
    headers: {
      'xi-api-key': process.env.ELEVENLABS_API_KEY,
    },
  });

  const data = await response.json();
  res.json({ token: data.token });
});

Your ElevenLabs API key is sensitive. Never expose it to the client. Always generate the token on the server.

// Client
const fetchToken = async () => {
  const response = await fetch('/scribe-token');
  const { token } = await response.json();
  return token;
};

Connection options

Scribe.connect() accepts either microphone options or manual audio options. Both share a common set of base options.

Base options

| Property | Type | Default | Description |
| --- | --- | --- | --- |
| token | string | | Single-use token for WebSocket authentication. |
| modelId | string | | Model ID (e.g., "scribe_v2_realtime"). |
| baseUri | string | "wss://api.elevenlabs.io" | Custom WebSocket base URI. |
| commitStrategy | CommitStrategy | "manual" | "manual" or "vad". |
| vadSilenceThresholdSecs | number | 1.5 | Seconds of silence before VAD commits (0.3-3.0). |
| vadThreshold | number | 0.4 | VAD sensitivity (0.1-0.9, lower is more sensitive). |
| minSpeechDurationMs | number | 100 | Minimum speech duration in ms (50-2000). |
| minSilenceDurationMs | number | 100 | Minimum silence duration in ms (50-2000). |
| languageCode | string | | ISO-639-1 or ISO-639-3 language code. Leave empty for auto-detection. |
| includeTimestamps | boolean | false | Receive word-level timestamps via the COMMITTED_TRANSCRIPT_WITH_TIMESTAMPS event. |
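Taken together, the base options form a plain object that is passed to Scribe.connect(). The sketch below shows one VAD-tuned combination; the token value is a placeholder and the tuning numbers are illustrative choices within the documented ranges, not recommendations:

```javascript
// Illustrative Scribe.connect() options. The token is a placeholder and
// the VAD values are example tunings within the documented ranges.
const connectOptions = {
  token: 'single-use-token-from-your-server',
  modelId: 'scribe_v2_realtime',
  commitStrategy: 'vad',
  vadSilenceThresholdSecs: 1.0, // within the allowed 0.3-3.0 range
  vadThreshold: 0.3,            // slightly more sensitive than the 0.4 default
  includeTimestamps: true,      // also emit COMMITTED_TRANSCRIPT_WITH_TIMESTAMPS
};

// const connection = Scribe.connect(connectOptions);
```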

Microphone options

Pass a microphone object to stream audio directly from the user’s microphone. The connection handles getUserMedia and audio encoding automatically.

const connection = Scribe.connect({
  token,
  modelId: 'scribe_v2_realtime',
  microphone: {
    deviceId: 'optional-device-id',
    echoCancellation: true,
    noiseSuppression: true,
    autoGainControl: true,
  },
});

| Property | Type | Description |
| --- | --- | --- |
| deviceId | string | Specific microphone device ID. |
| echoCancellation | boolean | Enable echo cancellation. |
| noiseSuppression | boolean | Enable noise suppression. |
| autoGainControl | boolean | Enable automatic gain control. |

Manual audio options

Pass audioFormat and sampleRate to send audio data manually via connection.send().

import { AudioFormat } from '@elevenlabs/client';

const connection = Scribe.connect({
  token,
  modelId: 'scribe_v2_realtime',
  audioFormat: AudioFormat.PCM_16000,
  sampleRate: 16000,
});

| Property | Type | Description |
| --- | --- | --- |
| audioFormat | AudioFormat | Audio encoding format (e.g., AudioFormat.PCM_16000). |
| sampleRate | number | Sample rate in Hz. Must match audioFormat. |

AudioFormat enum

enum AudioFormat {
  PCM_8000 = 'pcm_8000',
  PCM_16000 = 'pcm_16000',
  PCM_22050 = 'pcm_22050',
  PCM_24000 = 'pcm_24000',
  PCM_44100 = 'pcm_44100',
  PCM_48000 = 'pcm_48000',
  ULAW_8000 = 'ulaw_8000',
}
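The numeric suffix of each format value encodes its sample rate, so a small guard can verify that sampleRate matches audioFormat before connecting. These helpers are not part of the SDK; they are a sketch derived from the enum values above:

```javascript
// Hypothetical helper (not part of the SDK): derive the sample rate
// implied by an AudioFormat string, e.g. 'pcm_16000' -> 16000.
function sampleRateForFormat(format) {
  const rate = Number(format.split('_')[1]);
  if (!Number.isFinite(rate)) {
    throw new Error(`Unrecognized audio format: ${format}`);
  }
  return rate;
}

// Guard against a mismatched pair before calling Scribe.connect().
function assertRateMatchesFormat(format, sampleRate) {
  if (sampleRateForFormat(format) !== sampleRate) {
    throw new Error(`sampleRate ${sampleRate} does not match ${format}`);
  }
}
```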

Microphone mode

Stream audio directly from the user’s microphone:

import { Scribe, RealtimeEvents } from '@elevenlabs/client';

async function transcribeFromMicrophone() {
  const token = await fetchToken();

  const connection = Scribe.connect({
    token,
    modelId: 'scribe_v2_realtime',
    microphone: {
      echoCancellation: true,
      noiseSuppression: true,
      autoGainControl: true,
    },
  });

  connection.on(RealtimeEvents.PARTIAL_TRANSCRIPT, (data) => {
    document.getElementById('live').textContent = data.text;
  });

  connection.on(RealtimeEvents.COMMITTED_TRANSCRIPT, (data) => {
    const el = document.createElement('p');
    el.textContent = data.text;
    document.getElementById('transcripts').appendChild(el);
    document.getElementById('live').textContent = '';
  });

  document.getElementById('stop').addEventListener('click', () => {
    connection.close();
  });
}

Manual audio mode (file transcription)

Transcribe pre-recorded audio files by sending audio data manually:

import { Scribe, RealtimeEvents, AudioFormat } from '@elevenlabs/client';

async function transcribeFile(file) {
  const token = await fetchToken();

  const connection = Scribe.connect({
    token,
    modelId: 'scribe_v2_realtime',
    audioFormat: AudioFormat.PCM_16000,
    sampleRate: 16000,
  });

  connection.on(RealtimeEvents.COMMITTED_TRANSCRIPT, (data) => {
    console.log('Transcript:', data.text);
  });

  // Decode audio file
  const arrayBuffer = await file.arrayBuffer();
  const audioContext = new AudioContext({ sampleRate: 16000 });
  const audioBuffer = await audioContext.decodeAudioData(arrayBuffer);

  // Convert to PCM16
  const channelData = audioBuffer.getChannelData(0);
  const pcmData = new Int16Array(channelData.length);

  for (let i = 0; i < channelData.length; i++) {
    const sample = Math.max(-1, Math.min(1, channelData[i]));
    pcmData[i] = sample < 0 ? sample * 32768 : sample * 32767;
  }

  // Send in chunks
  const chunkSize = 4096;
  for (let offset = 0; offset < pcmData.length; offset += chunkSize) {
    const chunk = pcmData.slice(offset, offset + chunkSize);
    const bytes = new Uint8Array(chunk.buffer);
    const base64 = btoa(String.fromCharCode(...bytes));

    connection.send({ audioBase64: base64 });
    await new Promise((resolve) => setTimeout(resolve, 50));
  }

  // Commit the final transcript; close the connection once the
  // committed transcript has been received
  connection.commit();
}
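The decode-and-send loop above can be factored into a reusable helper that turns Float32 samples into base64-encoded PCM16 chunks. This is plain JavaScript with no SDK dependency, mirroring the inline conversion above:

```javascript
// Convert Float32 samples (range -1..1) into base64-encoded 16-bit PCM
// chunks, mirroring the inline conversion loop above.
function float32ToBase64Chunks(channelData, chunkSize = 4096) {
  // Clamp and scale each sample to signed 16-bit range.
  const pcmData = new Int16Array(channelData.length);
  for (let i = 0; i < channelData.length; i++) {
    const sample = Math.max(-1, Math.min(1, channelData[i]));
    pcmData[i] = sample < 0 ? sample * 32768 : sample * 32767;
  }

  // Split into fixed-size chunks and base64-encode each one.
  const chunks = [];
  for (let offset = 0; offset < pcmData.length; offset += chunkSize) {
    const chunk = pcmData.slice(offset, offset + chunkSize);
    const bytes = new Uint8Array(chunk.buffer, 0, chunk.length * 2);
    chunks.push(btoa(String.fromCharCode(...bytes)));
  }
  return chunks;
}
```

Each returned string can then be passed as audioBase64 to connection.send().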

RealtimeConnection

Scribe.connect() returns a RealtimeConnection instance with the following methods.

on(event, listener)

Register an event listener. See Events for available event types.

connection.on(RealtimeEvents.COMMITTED_TRANSCRIPT, (data) => {
  console.log('Committed:', data.text);
});

off(event, listener)

Remove a previously registered event listener.

const handler = (data) => console.log(data.text);
connection.on(RealtimeEvents.COMMITTED_TRANSCRIPT, handler);

// Later
connection.off(RealtimeEvents.COMMITTED_TRANSCRIPT, handler);

send(data)

Send audio data to Scribe (manual audio mode only).

connection.send({
  audioBase64: base64AudioChunk,
  commit: false, // Optional: set true to commit immediately after this chunk
  sampleRate: 16000, // Optional: override sample rate
  previousText: 'Previous transcription text', // Optional: context from a previous transcription
});

The previousText field can only be sent in the first audio chunk of a session. Sending it in subsequent chunks results in an error.
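Because previousText is only valid on the first chunk, a thin wrapper can enforce the rule. The helper below is not part of the SDK; it is a sketch that works with anything exposing a send() method:

```javascript
// Hypothetical wrapper (not part of the SDK): attach previousText to the
// first chunk only, per the rule above, and send plain audio afterwards.
function createChunkSender(connection, previousText) {
  let first = true;
  return (audioBase64) => {
    const payload = first && previousText
      ? { audioBase64, previousText }
      : { audioBase64 };
    first = false;
    connection.send(payload);
  };
}
```

Usage: `const sendChunk = createChunkSender(connection, 'earlier text'); sendChunk(chunk1); sendChunk(chunk2);`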

commit()

Manually commit the current transcription. Only needed when using CommitStrategy.MANUAL.

connection.commit();

close()

Close the WebSocket connection and clean up resources (microphone stream, audio context).

connection.close();

Events

Register event listeners using connection.on(event, listener). All events are available as constants on the RealtimeEvents enum.

Transcription events

| Event | Data | Description |
| --- | --- | --- |
| SESSION_STARTED | { session_id: string } | Scribe session started. |
| PARTIAL_TRANSCRIPT | { text: string } | Interim transcription result. |
| COMMITTED_TRANSCRIPT | { text: string } | Finalized transcription result. |
| COMMITTED_TRANSCRIPT_WITH_TIMESTAMPS | { text: string; language_code?: string; words?: WordsItem[] } | Finalized result with word-level timing. |

The WordsItem type contains word-level timing information:

interface WordsItem {
  text?: string; // Word text
  start?: number; // Start time in seconds
  end?: number; // End time in seconds
  type?: 'word' | 'spacing'; // Token type
  speaker_id?: string; // Speaker identifier
}
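As a sketch of consuming the timestamp payload, the helper below reassembles the WordsItem tokens into a plain string and measures the overall speech span. It uses only the fields declared in the interface above; the function itself is illustrative, not part of the SDK:

```javascript
// Illustrative helper (not part of the SDK): rebuild a transcript string
// and its start/end times from an array of WordsItem tokens.
function summarizeWords(words) {
  // 'spacing' tokens carry the whitespace, so a plain join restores the text.
  const text = words.map((w) => w.text ?? '').join('');

  // Only 'word' tokens with timing contribute to the speech span.
  const timed = words.filter(
    (w) => w.type === 'word' && w.start != null && w.end != null
  );
  const start = timed.length ? Math.min(...timed.map((w) => w.start)) : null;
  const end = timed.length ? Math.max(...timed.map((w) => w.end)) : null;

  return { text, start, end };
}
```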

Connection events

| Event | Data | Description |
| --- | --- | --- |
| OPEN | Event | WebSocket connection opened. |
| CLOSE | Event | WebSocket connection closed. |
| ERROR | Error \| Event | Generic error. |

Error events

All error events receive { error: string }.

| Event | Description |
| --- | --- |
| AUTH_ERROR | Authentication error. |
| QUOTA_EXCEEDED | Usage quota exceeded. |
| COMMIT_THROTTLED | Commit request throttled. |
| TRANSCRIBER_ERROR | Transcription engine error. |
| UNACCEPTED_TERMS | Terms of service not accepted. |
| RATE_LIMITED | Rate limited. |
| INPUT_ERROR | Invalid input format. |
| QUEUE_OVERFLOW | Processing queue full. |
| RESOURCE_EXHAUSTED | Server resources at capacity. |
| SESSION_TIME_LIMIT_EXCEEDED | Maximum session time reached. |
| CHUNK_SIZE_EXCEEDED | Audio chunk too large. |
| INSUFFICIENT_AUDIO_ACTIVITY | Not enough audio activity to maintain the connection. |

Commit strategies

Control when transcriptions are committed:

import { Scribe, AudioFormat, CommitStrategy } from '@elevenlabs/client';

// Manual (default): you control when to commit
const manualConnection = Scribe.connect({
  token,
  modelId: 'scribe_v2_realtime',
  audioFormat: AudioFormat.PCM_16000,
  sampleRate: 16000,
  commitStrategy: CommitStrategy.MANUAL,
});

// Send audio, then commit when ready
manualConnection.send({ audioBase64: chunk });
manualConnection.commit();

// Voice Activity Detection: Scribe detects silences and commits automatically
const vadConnection = Scribe.connect({
  token,
  modelId: 'scribe_v2_realtime',
  microphone: { echoCancellation: true },
  commitStrategy: CommitStrategy.VAD,
});

For more details, see Transcripts and commit strategies.

Complete example

Here is a complete example that transcribes microphone audio with the VAD-based commit strategy:

import { Scribe, RealtimeEvents, CommitStrategy } from '@elevenlabs/client';

async function startTranscription() {
  const token = await fetchToken();

  const connection = Scribe.connect({
    token,
    modelId: 'scribe_v2_realtime',
    commitStrategy: CommitStrategy.VAD,
    microphone: {
      echoCancellation: true,
      noiseSuppression: true,
    },
  });

  connection.on(RealtimeEvents.SESSION_STARTED, (data) => {
    console.log('Session started:', data.session_id);
  });

  connection.on(RealtimeEvents.PARTIAL_TRANSCRIPT, (data) => {
    document.getElementById('live').textContent = data.text;
  });

  connection.on(RealtimeEvents.COMMITTED_TRANSCRIPT, (data) => {
    const el = document.createElement('p');
    el.textContent = data.text;
    document.getElementById('transcripts').appendChild(el);
    document.getElementById('live').textContent = '';
  });

  connection.on(RealtimeEvents.ERROR, (error) => {
    console.error('Scribe error:', error);
  });

  // Stop button
  document.getElementById('stop').addEventListener('click', () => {
    connection.close();
  });
}

document.getElementById('start').addEventListener('click', startTranscription);