JavaScript SDK reference

Classes, methods, and events for the Speech Engine JavaScript SDK.

This page documents the public API for the Speech Engine JavaScript SDK (@elevenlabs/elevenlabs-js).

Getting a Speech Engine resource

Retrieve a SpeechEngineResource by its engine ID. The returned object provides methods to attach to an existing HTTP server, start a standalone server, or create individual sessions.

    import { ElevenLabsClient } from "@elevenlabs/elevenlabs-js";

    const elevenlabs = new ElevenLabsClient();
    const engine = await elevenlabs.speechEngine.get("seng_8k3m9xr4hjnfg983brhmhkd98n6");

SpeechEngineResource

Properties

| Property | Type | Description |
| --- | --- | --- |
| engineId | string | The ID of the speech engine. |

attach

Attach to an existing Node.js HTTP server and begin accepting Speech Engine connections at the given path. Use this when you already have an HTTP server (e.g. Express, Fastify, or a plain http.createServer()) and want to add Speech Engine alongside your existing routes.

attach() handles WebSocket upgrades, path routing, and request verification automatically. It returns a SpeechEngineAttachment whose close() method stops accepting connections without affecting the HTTP server.

    const attachment = engine.attach(httpServer, "/ws", {
      debug: true,
      onTranscript(transcript, signal, session) {
        // `stream` stands for an LLM response stream created elsewhere (see sendResponse).
        session.sendResponse(stream);
      },
    });

| Parameter | Type | Description |
| --- | --- | --- |
| httpServer | http.Server | The Node.js HTTP server to attach to. |
| path | string | URL path to handle WebSocket upgrades on. |
| handler | SpeechEngineCallbacks | Callback object (see Callbacks). |

A shortcut is available directly on the client, combining get() and attach() into a single call:

    await elevenlabs.speechEngine.attach("seng_8k3m9xr4hjnfg983brhmhkd98n6", httpServer, "/ws", {
      onTranscript(transcript, signal, session) {
        // `stream` stands for an LLM response stream created elsewhere (see sendResponse).
        session.sendResponse(stream);
      },
    });

verifyRequest

Verify that an incoming request originates from the ElevenLabs Speech Engine API. Checks the X-Elevenlabs-Speech-Engine-Authorization header for a valid JWT signed with the SHA-256 hash of your API key.

Only needed when managing the WebSocket upgrade yourself. When using attach() or SpeechEngineServer, verification is handled automatically.

    const isValid = await engine.verifyRequest(req);

| Parameter | Type | Description |
| --- | --- | --- |
| req | { headers: Record<string, string \| string[] \| undefined> } | Incoming HTTP request object. |

Returns: Promise<boolean>. Resolves to true if the request is valid.
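
If you manage the upgrade yourself, it can help to see how a header typed string | string[] | undefined is reduced to a single value before any checks run. The helper below is a rough sketch only: readAuthHeader is a hypothetical name, not part of the SDK, and it does not validate the JWT (that remains the job of verifyRequest()). It relies on Node.js lowercasing incoming header names.

```javascript
// Hypothetical helper, not part of the SDK: pull the Speech Engine
// authorization header out of a Node.js-style headers object.
const AUTH_HEADER = "x-elevenlabs-speech-engine-authorization";

function readAuthHeader(req) {
  // Node.js lowercases incoming header names, so look up the lowercase key.
  const value = req.headers[AUTH_HEADER];
  if (Array.isArray(value)) {
    // A repeated header arrives as an array; take the first entry.
    return value.length > 0 ? value[0] : undefined;
  }
  return typeof value === "string" && value.length > 0 ? value : undefined;
}
```

A request with no such header yields undefined, which lets you reject the upgrade early before calling verifyRequest().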

createSession

Wrap an accepted WebSocket in a SpeechEngineSession. Use this for custom server integration or manual WebSocket handling.

    const session = engine.createSession(ws, { debug: true });
    session.on("user_transcript", (transcript, signal) => {
      /* ... */
    });

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| ws | WebSocket | | An accepted WebSocket connection. |
| options.debug | boolean | false | Enable debug logging. |

Returns: SpeechEngineSession

SpeechEngineServer

A standalone WebSocket server that accepts Speech Engine connections without requiring an existing HTTP server. Use this when your server’s only purpose is handling Speech Engine connections.

For integration with an existing HTTP server (e.g. Express, Fastify), use engine.attach() instead.

    import { SpeechEngine } from "@elevenlabs/elevenlabs-js";

    const server = new SpeechEngine.Server({
      port: 3001,
      debug: true,
      onTranscript(transcript, signal, session) {
        // `stream` stands for an LLM response stream created elsewhere (see sendResponse).
        session.sendResponse(stream);
      },
    });

    server.start();

Constructor options

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| port | number | 3001 | Port to listen on. |
| apiKey | string | | ElevenLabs API key for verifying connections. Falls back to the ELEVENLABS_API_KEY environment variable. |
| engineId | string | | The speech engine ID. Populated automatically when created via the resource. |
| | SpeechEngineCallbacks | | All callback options (onInit, onTranscript, onClose, onDisconnect, onError, debug). See Callbacks. |

start

Start the standalone WebSocket server on the configured port. Verifies each incoming connection against the ElevenLabs API using the configured API key.

    server.start();

stop

Stop the WebSocket server and close all active connections.

    await server.stop();

handleConnection

Wrap an existing WebSocket in a SpeechEngineSession with the server’s callbacks wired up. Use this when you manage your own WebSocket server and want to wrap individual connections.

    const session = server.handleConnection(ws);

| Parameter | Type | Description |
| --- | --- | --- |
| ws | WebSocket | An accepted WebSocket connection. |

Returns: SpeechEngineSession

SpeechEngineSession

Wraps a single WebSocket connection. Each connection represents one conversation. The session emits events for transcripts and lifecycle changes, and provides methods to send LLM responses back.

When a new transcript arrives, the session aborts the AbortSignal that was passed to the previous transcript handler, interrupting any in-flight LLM call.
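
The interruption behavior can be pictured with the standard AbortController API. This is a minimal sketch of the idea, not the SDK's internals; TurnTracker and beginTurn are hypothetical names used only for illustration.

```javascript
// Illustrative only: one AbortController per transcript turn.
class TurnTracker {
  constructor() {
    this.current = null;
  }

  // Called when a new transcript arrives. Aborts the previous turn's
  // signal (if any) and returns a fresh signal for the new turn.
  beginTurn() {
    if (this.current) {
      this.current.abort();
    }
    this.current = new AbortController();
    return this.current.signal;
  }
}

const tracker = new TurnTracker();
const firstSignal = tracker.beginTurn();
const secondSignal = tracker.beginTurn(); // the user spoke again

console.log(firstSignal.aborted); // true
console.log(secondSignal.aborted); // false
```

Passing the handler's signal on to your LLM client is what lets an in-flight request stop as soon as the signal aborts.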

Properties

| Property | Type | Description |
| --- | --- | --- |
| conversationId | string | The conversation ID assigned by the API. Available after init. |
| isOpen | boolean | Whether the session is still open. |

on

Register a handler for an event. Returns the session for chaining.

    session.on("user_transcript", (transcript, signal) => {
      /* ... */
    });

off

Remove a previously registered handler.

    session.off("user_transcript", listener);

once

Register a handler that fires once then removes itself.

    session.once("init", (conversationId) => {
      /* ... */
    });

sendResponse

Send an LLM response back to the Speech Engine API for text-to-speech synthesis. Must be called inside an onTranscript handler. Calling it outside of a handler emits a warning and returns without sending.

    // String response
    session.sendResponse("Hello, how can I help?");

    // Streamed response (OpenAI, Anthropic, or Gemini).
    // Runs inside an onTranscript handler: `signal` is the handler's AbortSignal,
    // `openai` is an OpenAI client, and `messages` is your conversation input.
    const stream = await openai.responses.create(
      { model: "gpt-4o", input: messages, stream: true },
      { signal }
    );
    session.sendResponse(stream);

| Parameter | Type | Description |
| --- | --- | --- |
| response | string \| AsyncIterable<unknown> | A complete string or an async iterable of text chunks / LLM stream events. |

The SDK auto-detects and extracts text from the following LLM stream formats:

| Provider | Event format |
| --- | --- |
| OpenAI Responses API | { type: "response.output_text.delta", delta: "text" } |
| OpenAI Chat Completions | { choices: [{ delta: { content: "text" } }] } |
| Anthropic Messages API | { type: "content_block_delta", delta: { type: "text_delta", text: "text" } } |
| Google Gemini API | { candidates: [{ content: { parts: [{ text: "text" }] } }] } |
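
To make the four event shapes concrete, the sketch below shows one way text could be pulled out of each of them. It is an illustration written from the table above, not the SDK's actual detection logic; extractText is a hypothetical name, and events that match none of the shapes are assumed to carry no text.

```javascript
// Illustrative sketch, not the SDK's implementation: return the text carried
// by one stream event, or "" if the event has no text.
function extractText(event) {
  if (typeof event === "string") return event; // plain text chunk
  if (event === null || typeof event !== "object") return "";

  // OpenAI Responses API
  if (event.type === "response.output_text.delta") {
    return typeof event.delta === "string" ? event.delta : "";
  }
  // Anthropic Messages API
  if (event.type === "content_block_delta") {
    return event.delta?.type === "text_delta" ? (event.delta.text ?? "") : "";
  }
  // OpenAI Chat Completions
  if (Array.isArray(event.choices)) {
    return event.choices[0]?.delta?.content ?? "";
  }
  // Google Gemini API
  if (Array.isArray(event.candidates)) {
    const parts = event.candidates[0]?.content?.parts ?? [];
    return parts.map((part) => part.text ?? "").join("");
  }
  return "";
}
```

Because sendResponse() already accepts these streams directly, a helper like this is only useful when you want to inspect or log the text yourself.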

close

Close the session and the underlying WebSocket connection.

    session.close();

SpeechEngineAttachment

Returned by engine.attach(). Controls the lifecycle of the WebSocket server without affecting the HTTP server it was attached to.

close

Stop accepting new connections, remove the upgrade listener from the HTTP server, and close the underlying WebSocket server.

    await attachment.close();

Callbacks

The callback object passed to attach() or SpeechEngineServer. All callbacks are optional.

| Callback | Signature | Description |
| --- | --- | --- |
| onInit | (conversationId: string, session: Session) => void | Session initialized with a conversation ID. |
| onTranscript | (transcript: TranscriptMessage[], signal: AbortSignal, session: Session) => void | User speech transcribed. |
| onClose | (session: Session) => void | Clean disconnect from ElevenLabs. |
| onDisconnect | (session: Session) => void | WebSocket dropped unexpectedly. |
| onError | (error: Error, session: Session) => void | Protocol or WebSocket error. |
| debug | boolean | Enable debug logging. |

The onTranscript handler receives an AbortSignal that fires when the user interrupts mid-response.
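
For orientation, here is a minimal sketch of a callbacks object that follows the documented signatures. The echo logic, the assumption that the last transcript entry is the newest message, and the stub session used to exercise the handler are all illustrative; in real use the SDK supplies the session.

```javascript
// A callbacks object shaped like the documented signatures. Every field is optional.
const callbacks = {
  onInit(conversationId, session) {
    console.log(`conversation started: ${conversationId}`);
  },
  onTranscript(transcript, signal, session) {
    // ASSUMPTION: the last entry is the newest message.
    const latest = transcript[transcript.length - 1];
    // Skip the reply if this turn was already interrupted.
    if (signal.aborted || !latest) return;
    session.sendResponse(`You said: ${latest.content}`);
  },
  onError(error, session) {
    console.error("speech engine error:", error.message);
  },
};

// Stand-in for a real session so the handler can run without a connection.
const sent = [];
const stubSession = { sendResponse: (response) => sent.push(response) };

callbacks.onTranscript(
  [{ role: "user", content: "hello" }],
  new AbortController().signal,
  stubSession
);
console.log(sent); // [ 'You said: hello' ]
```

Checking signal.aborted before replying avoids sending a response for a turn the user has already talked over.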

Events

When using session.on() directly instead of callbacks, these are the event names and their handler signatures.

| Event | Handler signature |
| --- | --- |
| user_transcript | (transcript: TranscriptMessage[], signal: AbortSignal) |
| init | (conversationId: string) |
| close | () |
| disconnected | () |
| error | (error: Error) |

Event name constants are available for type-safe usage:

    import { SpeechEngine } from "@elevenlabs/elevenlabs-js";

    session.on(SpeechEngine.USER_TRANSCRIPT, (transcript, signal) => {
      /* ... */
    });
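
If you create sessions manually but prefer the callback style, the two can be bridged. The sketch below is one possible approach under stated assumptions: bindCallbacks is a hypothetical helper, and the pairing of event names to callbacks is inferred from their matching signatures rather than stated by this page.

```javascript
// Hypothetical helper: register callback-style handlers on a session via on().
function bindCallbacks(session, callbacks) {
  // ASSUMPTION: each event corresponds to the callback with the matching signature.
  const pairs = [
    ["init", callbacks.onInit],
    ["user_transcript", callbacks.onTranscript],
    ["close", callbacks.onClose],
    ["disconnected", callbacks.onDisconnect],
    ["error", callbacks.onError],
  ];
  for (const [eventName, handler] of pairs) {
    if (typeof handler !== "function") continue; // all callbacks are optional
    // Event handlers do not receive the session, so append it here.
    session.on(eventName, (...args) => handler(...args, session));
  }
  return session;
}
```

Any object with an on(eventName, handler) method works here, which also makes the helper easy to exercise with a stub in tests.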

TranscriptMessage

A single message in the conversation history. The full transcript is passed to onTranscript on every turn.

| Property | Type | Description |
| --- | --- | --- |
| role | "user" \| "agent" | Who sent the message. |
| content | string | The text content of the message. |
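
Most chat-style LLM APIs expect an "assistant" role rather than "agent", so the transcript usually needs a small mapping step before it is sent to a model. The following is a sketch of that step, assuming an OpenAI-style message format; toChatMessages and its optional systemPrompt parameter are hypothetical and not part of the SDK.

```javascript
// Illustrative mapping from TranscriptMessage[] to chat-style messages.
// "assistant" is the role name OpenAI-style chat APIs use for model output;
// the systemPrompt parameter is a hypothetical addition.
function toChatMessages(transcript, systemPrompt) {
  const messages = transcript.map(({ role, content }) => ({
    role: role === "agent" ? "assistant" : "user",
    content,
  }));
  return systemPrompt
    ? [{ role: "system", content: systemPrompt }, ...messages]
    : messages;
}
```

Since the full transcript arrives on every turn, the result can be rebuilt from scratch each time instead of being accumulated across turns.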

Wire protocol

For reference, these are the JSON messages exchanged over the WebSocket connection. The SDK handles serialization and deserialization automatically.

Incoming (ElevenLabs API to developer server)

| Message type | Fields | Description |
| --- | --- | --- |
| init | conversation_id: string | Session initialized. |
| user_transcript | user_transcript: TranscriptMessage[], event_id: number | User speech transcribed. |
| ping | | Keep-alive. SDK responds with pong. |
| close | | Clean disconnect. |
| error | message: string | Error from the API. |

Outgoing (developer server to ElevenLabs API)

| Message type | Fields | Description |
| --- | --- | --- |
| agent_response | content: string, event_id: number, is_final: boolean | LLM response chunk for TTS synthesis. |
| pong | | Response to ping. |
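
The message flow can be sketched as a small dispatcher. This is an illustration only: it assumes the message type travels in a field named type, which this page does not specify, and handleIncoming, agentResponse, and the state object are hypothetical names. The SDK performs the real handling for you.

```javascript
// Sketch of the documented message flow. ASSUMPTION: the message type
// travels in a "type" field; this page does not name that field.
// Returns the JSON string to send back, or null if no reply is needed.
function handleIncoming(raw, state) {
  const message = JSON.parse(raw);
  switch (message.type) {
    case "init":
      state.conversationId = message.conversation_id;
      return null;
    case "user_transcript":
      state.lastEventId = message.event_id;
      state.transcript = message.user_transcript;
      return null;
    case "ping":
      return JSON.stringify({ type: "pong" }); // keep-alive reply
    case "close":
      state.open = false;
      return null;
    case "error":
      state.lastError = message.message;
      return null;
    default:
      return null; // ignore unknown message types
  }
}

// Build one outgoing chunk for text-to-speech synthesis.
function agentResponse(content, eventId, isFinal) {
  return JSON.stringify({
    type: "agent_response",
    content,
    event_id: eventId,
    is_final: isFinal,
  });
}
```

Keeping the dispatcher free of socket calls makes it easy to test against recorded messages.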