Streaming

The ElevenLabs API supports real-time audio streaming for select endpoints, returning raw audio bytes (e.g., MP3 data) directly over HTTP using chunked transfer encoding. This allows clients to process or play audio incrementally as it is generated.
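
To make the transport concrete, here is a minimal sketch of how a chunked HTTP/1.1 body is framed on the wire: each chunk is a hexadecimal size line, the data, and a trailing CRLF, and a zero-size chunk ends the body. HTTP clients and the official libraries normally do this decoding for you. The `iter_chunked` helper and the sample bytes are hypothetical illustrations and not part of the ElevenLabs API, and trailer fields are ignored for brevity.

```python
import io
from typing import BinaryIO, Iterator


def iter_chunked(body: BinaryIO) -> Iterator[bytes]:
    """Yield the payload of each chunk in a chunked HTTP/1.1 body (simplified)."""
    while True:
        size_line = body.readline()
        if not size_line:
            raise ValueError("body ended before the terminating zero-size chunk")
        # Each chunk starts with its size in hex; drop any ";extension" suffix.
        size = int(size_line.split(b";", 1)[0].strip(), 16)
        if size == 0:
            break  # a zero-size chunk marks the end (trailers are ignored here)
        data = body.read(size)
        if len(data) != size:
            raise ValueError("truncated chunk")
        body.read(2)  # consume the CRLF that follows the chunk data
        yield data


# Hypothetical wire data standing in for streamed audio bytes.
wire = io.BytesIO(b"4\r\nID3\x04\r\n3\r\nabc\r\n0\r\n\r\n")
chunks = list(iter_chunked(wire))
```

Because each chunk carries its own length, the server can begin sending audio before the total size is known, which is what allows playback to start while generation is still in progress.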

Our official Node and Python libraries include utilities to simplify handling this continuous audio stream.

Streaming is supported for the Text to Speech API, the Voice Changer API, and the Audio Isolation API. This section focuses on streaming requests made to the Text to Speech API.

In Python, a streaming request looks like:

from elevenlabs import stream
from elevenlabs.client import ElevenLabs

client = ElevenLabs()

# Request speech for the text; the result is an iterator of audio byte chunks
audio_stream = client.text_to_speech.convert_as_stream(
    text="This is a test",
    voice_id="JBFqnCBsd6RMkjVDRZzb",
    model_id="eleven_multilingual_v2"
)

# The stream can be consumed only once, so pick one option per request

# option 1: play the streamed audio locally
stream(audio_stream)

# option 2: process the audio bytes manually
for chunk in audio_stream:
    if isinstance(chunk, bytes):
        print(chunk)
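
As one way to process the bytes manually, the sketch below writes each chunk to a file as it arrives, so the full audio never has to sit in memory. The `save_audio_stream` helper and the `fake_audio_stream` generator are hypothetical names introduced for illustration. The generator stands in for the SDK stream so the example runs without network access or an API key.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Iterator


def save_audio_stream(chunks: Iterable[bytes], path: Path) -> int:
    """Write chunks to `path` as they arrive and return the total byte count."""
    total = 0
    with open(path, "wb") as f:
        for chunk in chunks:
            # Same guard as the example above: only handle raw byte chunks.
            if isinstance(chunk, bytes) and chunk:
                f.write(chunk)
                total += len(chunk)
    return total


def fake_audio_stream() -> Iterator[bytes]:
    """Hypothetical stand-in for the SDK's audio stream (not real MP3 data)."""
    yield b"\xff\xfb"
    yield b"\x90\x00"
    yield b"\x01\x02\x03"


with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "speech.mp3"
    written = save_audio_stream(fake_audio_stream(), out_path)
    saved = out_path.read_bytes()
```

With the real client, passing `audio_stream` from the example above in place of `fake_audio_stream()` should work the same way, assuming the stream yields `bytes` chunks as shown.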

In Node / TypeScript, a streaming request looks like:

import { ElevenLabsClient, stream } from 'elevenlabs';
import { Readable } from 'stream';

const client = new ElevenLabsClient();

async function main() {
  // Request speech for the text; the result is a stream of audio chunks
  const audioStream = await client.textToSpeech.convertAsStream('JBFqnCBsd6RMkjVDRZzb', {
    text: 'This is a test',
    model_id: 'eleven_multilingual_v2',
  });

  // The stream can be consumed only once, so pick one option per request

  // option 1: play the streamed audio locally
  await stream(Readable.from(audioStream));

  // option 2: process the audio manually
  for await (const chunk of audioStream) {
    console.log(chunk);
  }
}

main();