JavaScript SDK
Scribe: real-time speech-to-text transcription in JavaScript
For an overview of Scribe and its capabilities, see the Speech to Text overview. For step-by-step usage guides, see Client-side streaming.
Installation
Use the ElevenLabs speech-to-text skill to transcribe audio from your AI coding assistant:
This library can be used in any JavaScript-based project. If you are using React, consider the useScribe hook, which provides built-in state management and lifecycle handling.
Usage
Here is a minimal working example that connects to Scribe and logs transcription results:
Getting a token
Scribe requires a single-use token for authentication. Create an API endpoint on your server:
Your ElevenLabs API key is sensitive. Never expose it to the client. Always generate the token on the server.
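The server-side step can be sketched as a small, framework-agnostic helper that you call from your own route. This is a minimal sketch under assumptions rather than the SDK's own code: `createTokenFetcher` and its parameters are hypothetical names, the token endpoint URL is deliberately left as a `tokenUrl` parameter to be taken from the API reference, and the `{ token }` response shape is assumed.

```javascript
// Hypothetical helper: builds a function that requests a single-use token.
// `tokenUrl` is supplied by the caller (take it from the ElevenLabs API
// reference); it is intentionally not hardcoded here.
function createTokenFetcher({ apiKey, tokenUrl, fetchImpl = fetch }) {
  if (!apiKey) throw new Error("Missing ElevenLabs API key");
  if (!tokenUrl) throw new Error("Missing token endpoint URL");

  return async function fetchScribeToken() {
    const response = await fetchImpl(tokenUrl, {
      method: "POST",
      headers: { "xi-api-key": apiKey }, // the key never leaves the server
    });
    if (!response.ok) {
      throw new Error(`Token request failed with status ${response.status}`);
    }
    const body = await response.json();
    // Assumption: the response carries the token in a `token` field.
    return { token: body.token };
  };
}
```

Your route handler would call the returned function and send only the resulting token to the browser, so the API key stays in server-side configuration such as an environment variable.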
Connection options
Scribe.connect() accepts either microphone options or manual audio options. Both share a common set of base options.
Base options
Microphone options
Pass a microphone object to stream audio directly from the user’s microphone. The connection handles getUserMedia and audio encoding automatically.
Manual audio options
Pass audioFormat and sampleRate to send audio data manually via connection.send().
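Because the two modes are distinguished by which fields are present, a pre-flight check can catch a mixed or incomplete options object before connecting. The sketch below is a hypothetical guard, not part of the SDK; it assumes the two modes are mutually exclusive, as the description above suggests, and inspects only the `microphone`, `audioFormat`, and `sampleRate` fields named in this section.

```javascript
// Hypothetical guard: classifies an options object as "microphone" or "manual".
// Assumption: the two modes are mutually exclusive.
function detectAudioMode(options) {
  const hasMicrophone = options.microphone !== undefined;
  const hasFormat = options.audioFormat !== undefined;
  const hasRate = options.sampleRate !== undefined;

  if (hasMicrophone && (hasFormat || hasRate)) {
    throw new Error("Pass either `microphone` or `audioFormat`/`sampleRate`, not both");
  }
  if (hasMicrophone) return "microphone";
  if (hasFormat && hasRate) return "manual";
  throw new Error("Pass `microphone`, or both `audioFormat` and `sampleRate`");
}
```

Calling `detectAudioMode(options)` just before `Scribe.connect(options)` turns a misconfiguration into an immediate, descriptive error.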
AudioFormat enum
Microphone mode
Stream audio directly from the user’s microphone:
Manual audio mode (file transcription)
Transcribe pre-recorded audio files by sending audio data manually:
RealtimeConnection
Scribe.connect() returns a RealtimeConnection instance with the following methods.
on(event, listener)
Register an event listener. See Events for available event types.
off(event, listener)
Remove a previously registered event listener.
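Since `off` needs the same listener reference that was passed to `on`, it can help to pair the two calls. The following sketch wraps them in a hypothetical `subscribe` helper (not part of the SDK) that returns an unsubscribe function; it relies only on the `on` and `off` methods described above.

```javascript
// Hypothetical helper: registers a listener and returns a function that
// removes it again. Calling the returned function more than once is harmless.
function subscribe(connection, event, listener) {
  connection.on(event, listener);
  let active = true;
  return function unsubscribe() {
    if (!active) return;
    active = false;
    connection.off(event, listener);
  };
}
```

Keeping the returned function around means you never have to store the listener itself just to detach it later.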
send(data)
Send audio data to Scribe (manual audio mode only).
The previousText field can only be sent in the first audio chunk of a session. Sending it in subsequent chunks results in an error.
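The first-chunk rule for `previousText` can be sketched as a chunking loop. This is an illustrative sketch, not the SDK's implementation: `sendInChunks` and `toBase64` are hypothetical helpers, and the payload shape (an object with a base64 `audioBase64` field) is an assumption that should be checked against the SDK reference.

```javascript
// Encode bytes as base64 (btoa is available in browsers and Node 16+).
function toBase64(bytes) {
  let binary = "";
  for (const byte of bytes) binary += String.fromCharCode(byte);
  return btoa(binary);
}

// Hypothetical helper. Assumption: send() takes an object with an
// `audioBase64` field; the real payload shape may differ.
// `audioBytes` must be a Uint8Array.
function sendInChunks(connection, audioBytes, { chunkSize = 4096, previousText } = {}) {
  let sent = 0;
  for (let offset = 0; offset < audioBytes.length; offset += chunkSize) {
    const chunk = audioBytes.subarray(offset, offset + chunkSize);
    const payload = { audioBase64: toBase64(chunk) };
    // previousText is attached to the first chunk only.
    if (offset === 0 && previousText !== undefined) {
      payload.previousText = previousText;
    }
    connection.send(payload);
    sent += 1;
  }
  return sent; // number of chunks sent
}
```

If you call such a helper more than once in the same session, pass `previousText` only on the first call, since the restriction applies to the session and not to each batch.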
commit()
Manually commit the current transcription. Only needed when using CommitStrategy.MANUAL.
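With a manual strategy it is often convenient to commit and then wait for the resulting transcript. The sketch below shows one way to do that with the `on`, `off`, and `commit` methods. `commitAndWait` is a hypothetical helper, and `committedEvent` stands for whichever RealtimeEvents constant signals a committed transcript; its name is not given in this section, so the caller passes it in.

```javascript
// Hypothetical helper: commits and resolves with the next payload delivered
// on `committedEvent`, or rejects if none arrives within `timeoutMs`.
function commitAndWait(connection, committedEvent, timeoutMs = 5000) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      connection.off(committedEvent, onCommitted);
      reject(new Error(`No committed transcript within ${timeoutMs} ms`));
    }, timeoutMs);

    function onCommitted(payload) {
      clearTimeout(timer);
      connection.off(committedEvent, onCommitted);
      resolve(payload);
    }

    // Listen before committing so a fast response is not missed.
    connection.on(committedEvent, onCommitted);
    connection.commit();
  });
}
```

The timeout keeps a caller from waiting forever if the connection drops or no audio was pending when `commit()` was called.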
close()
Close the WebSocket connection and clean up resources (microphone stream, audio context).
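To make sure `close()` runs even when your own code throws, the connection can be scoped with `try`/`finally`. This is a minimal sketch: `withConnection` is a hypothetical helper, and `connect` is a caller-supplied function such as `() => Scribe.connect(options)`.

```javascript
// Hypothetical helper: opens a connection, runs `work`, and always closes
// the connection afterwards, whether `work` succeeds or throws.
async function withConnection(connect, work) {
  const connection = await connect();
  try {
    return await work(connection);
  } finally {
    connection.close();
  }
}
```

Using `await` on the result of `connect()` works whether it returns a connection directly or a promise for one.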
Events
Register event listeners using connection.on(event, listener). All events are available as constants on the RealtimeEvents enum.
Transcription events
The WordsItem type contains word-level timing information:
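As an illustration of working with word-level timing, the sketch below formats a list of words with their start time, end time, and duration. The field names `text`, `start`, and `end` (in seconds) are assumptions made for this example; check the WordsItem definition for the actual shape.

```javascript
// Hypothetical formatter. Assumption: each word has `text`, `start`, and
// `end` fields, with times expressed in seconds.
function formatWordTimings(words) {
  return words.map((word) => {
    const duration = word.end - word.start;
    return `${word.start.toFixed(2)}s-${word.end.toFixed(2)}s (${duration.toFixed(2)}s) ${word.text}`;
  });
}

const sample = [
  { text: "hello", start: 0, end: 0.5 },
  { text: "world", start: 0.5, end: 1.25 },
];
console.log(formatWordTimings(sample).join("\n"));
// 0.00s-0.50s (0.50s) hello
// 0.50s-1.25s (0.75s) world
```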
Connection events
Error events
All error events receive { error: string }.
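Because every error event carries the same `{ error: string }` payload, a single handler can serve all of them. The sketch below is a hypothetical helper, `onAnyError`, that takes the list of error event constants from the caller (their names are not listed in this section) and returns a function that detaches all listeners.

```javascript
// Hypothetical helper: routes several error events to one handler.
// `errorEvents` is a caller-supplied array of RealtimeEvents constants.
function onAnyError(connection, errorEvents, handler) {
  const registered = errorEvents.map((event) => {
    // Each payload is expected to be { error: string }.
    const listener = (payload) => handler(event, payload.error);
    connection.on(event, listener);
    return { event, listener };
  });
  // Detach every listener that was registered above.
  return function removeAll() {
    registered.forEach(({ event, listener }) => connection.off(event, listener));
  };
}
```

The handler receives the event name alongside the message, so it can still tell, for example, an authentication failure from a quota problem.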
Commit strategies
Control when transcriptions are committed:
For more details, see Transcripts and commit strategies.
Complete example
Here is a complete example that transcribes microphone audio with a VAD-based commit strategy: