WebSocket

Create real-time, interactive voice conversations with AI agents

This documentation is for developers integrating directly with the ElevenLabs WebSocket API. For convenience, consider using the official SDKs provided by ElevenLabs.

The ElevenAgents WebSocket API enables real-time, interactive voice conversations with AI agents. Over a single WebSocket connection, you send audio input and receive audio responses in real time, creating lifelike conversational experiences.

Endpoint: wss://api.elevenlabs.io/v1/convai/conversation?agent_id={agent_id}

Authentication

Using Agent ID

For public agents, you can directly use the agent_id in the WebSocket URL without additional authentication:

wss://api.elevenlabs.io/v1/convai/conversation?agent_id=<your-agent-id>
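
As a minimal sketch, the URL can be assembled in code rather than by hand. The helper name `buildConversationUrl` is hypothetical and not part of any ElevenLabs SDK; it only URL-encodes the agent ID and appends it to the endpoint shown above.

```typescript
const CONVERSATION_ENDPOINT = "wss://api.elevenlabs.io/v1/convai/conversation";

// Hypothetical helper: builds the public-agent WebSocket URL from an agent ID.
function buildConversationUrl(agentId: string): string {
  if (agentId.trim() === "") {
    throw new Error("agentId must be a non-empty string");
  }
  // encodeURIComponent guards against characters that would break the query string.
  return `${CONVERSATION_ENDPOINT}?agent_id=${encodeURIComponent(agentId)}`;
}
```

In a browser, the result would typically be passed straight to `new WebSocket(...)`.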

Using a signed URL

For private agents or conversations requiring authorization, obtain a signed URL from your server, which securely communicates with the ElevenLabs API using your API key.

Example using cURL

Request:

curl -X GET "https://api.elevenlabs.io/v1/convai/conversation/get-signed-url?agent_id=<your-agent-id>" \
  -H "xi-api-key: <your-api-key>"

Response:

{
  "signed_url": "wss://api.elevenlabs.io/v1/convai/conversation?agent_id=<your-agent-id>&token=<token>"
}

Never expose your ElevenLabs API key on the client side.
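
The same request can be made from server-side code. The following is a sketch under stated assumptions, not an official client: `getSignedUrl` and `FetchLike` are hypothetical names, and the fetch function is passed in as a parameter so the helper can be exercised with a stub. The endpoint, the `xi-api-key` header, and the `signed_url` field are taken from the cURL example above.

```typescript
// Minimal shape of the fetch function this sketch needs (an assumption, chosen
// so that the standard global fetch is assignable to it).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string> }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

const SIGNED_URL_ENDPOINT =
  "https://api.elevenlabs.io/v1/convai/conversation/get-signed-url";

// Hypothetical server-side helper. It needs the API key, so it must never
// run in browser code.
async function getSignedUrl(
  agentId: string,
  apiKey: string,
  fetchImpl: FetchLike
): Promise<string> {
  const url = `${SIGNED_URL_ENDPOINT}?agent_id=${encodeURIComponent(agentId)}`;
  const response = await fetchImpl(url, {
    method: "GET",
    headers: { "xi-api-key": apiKey },
  });
  if (!response.ok) {
    throw new Error(`Signed URL request failed with status ${response.status}`);
  }
  const body = (await response.json()) as { signed_url?: unknown };
  if (typeof body.signed_url !== "string") {
    throw new Error("Response did not contain a signed_url string");
  }
  return body.signed_url;
}
```

On a server, the global `fetch` can be passed as `fetchImpl`, and only the returned signed URL is forwarded to the client.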

WebSocket events

Client to server events

The following events can be sent from the client to the server:

Contextual Updates

Send non-interrupting contextual information to update the conversation state. This allows you to provide additional context without disrupting the ongoing conversation flow.

{
  "type": "contextual_update",
  "text": "User clicked on pricing page"
}

Use cases:

  • Updating user status or preferences
  • Providing environmental context
  • Adding background information
  • Tracking user interface interactions

Key points:

  • Does not interrupt current conversation flow
  • Updates are incorporated as tool calls in conversation history
  • Helps maintain context without breaking the natural dialogue

Contextual updates are processed asynchronously and do not require a direct response from the server.
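
A minimal sketch of sending such an update from client code. `sendContextualUpdate` and `SocketLike` are hypothetical names; the structural type only covers the two members of a browser `WebSocket` that the helper uses, which lets it run against a stub. The message shape is the one shown above.

```typescript
// Structural type covering only what this sketch uses from a WebSocket.
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

const SOCKET_OPEN = 1; // Same numeric value as WebSocket.OPEN

// Hypothetical helper: sends a contextual update and reports whether it was sent.
function sendContextualUpdate(socket: SocketLike, text: string): boolean {
  // Skip the send if the connection is not open or there is nothing to say.
  if (socket.readyState !== SOCKET_OPEN || text.trim() === "") {
    return false;
  }
  socket.send(JSON.stringify({ type: "contextual_update", text }));
  return true;
}
```

Because no server response is expected, the boolean only reflects whether the message was handed to the socket.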

WebSocket API Reference

See the ElevenLabs Agents WebSocket API reference documentation for detailed message structures, parameters, and examples.

Next.js implementation example

This example demonstrates how to implement a WebSocket-based conversational agent client in Next.js using the ElevenLabs WebSocket API.

While this example uses the voice-stream package for microphone input handling, you can implement your own solution for capturing and encoding audio. The focus here is on demonstrating the WebSocket connection and event handling with the ElevenLabs API.

1. Install required dependencies

First, install the necessary packages:

npm install voice-stream

The voice-stream package handles microphone access and audio streaming, automatically encoding the audio in base64 format as required by the ElevenLabs API.

This example uses Tailwind CSS for styling. To add Tailwind to your Next.js project:

npm install -D tailwindcss postcss autoprefixer
npx tailwindcss init -p

Then follow the official Tailwind CSS setup guide for Next.js.

Alternatively, you can replace the className attributes with your own CSS styles.

2. Create WebSocket types

Define the types for WebSocket events:

app/types/websocket.ts
type BaseEvent = {
  type: string;
};

type UserTranscriptEvent = BaseEvent & {
  type: "user_transcript";
  user_transcription_event: {
    user_transcript: string;
  };
};

type AgentResponseEvent = BaseEvent & {
  type: "agent_response";
  agent_response_event: {
    agent_response: string;
  };
};

type AgentResponseCorrectionEvent = BaseEvent & {
  type: "agent_response_correction";
  agent_response_correction_event: {
    original_agent_response: string;
    corrected_agent_response: string;
  };
};

type AudioResponseEvent = BaseEvent & {
  type: "audio";
  audio_event: {
    audio_base_64: string;
    event_id: number;
    alignment: {
      chars: string[];
      char_durations_ms: number[];
      char_start_times_ms: number[];
    };
  };
};

type InterruptionEvent = BaseEvent & {
  type: "interruption";
  interruption_event: {
    reason: string;
  };
};

type PingEvent = BaseEvent & {
  type: "ping";
  ping_event: {
    event_id: number;
    ping_ms?: number;
  };
};

type AgentChatResponsePartEvent = BaseEvent & {
  type: "agent_chat_response_part";
  text_response_part: {
    type: "start" | "delta" | "stop";
    text: string;
    event_id: string;
  };
};

export type ElevenLabsWebSocketEvent =
  | UserTranscriptEvent
  | AgentResponseEvent
  | AgentResponseCorrectionEvent
  | AudioResponseEvent
  | InterruptionEvent
  | PingEvent
  | AgentChatResponsePartEvent;
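
Because every event carries a literal `type`, TypeScript can narrow the union in a `switch`. The following sketch uses reduced copies of two of the types above so that it stands alone; `parseServerEvent`, `buildPong`, and `describeEvent` are hypothetical helper names. The pong shape (`type` plus the ping's `event_id`) mirrors what the hook in the next step sends.

```typescript
type PingEvent = {
  type: "ping";
  ping_event: { event_id: number; ping_ms?: number };
};
type AgentResponseEvent = {
  type: "agent_response";
  agent_response_event: { agent_response: string };
};
type ServerEvent = PingEvent | AgentResponseEvent;

// Parse a raw message; returns null for malformed JSON or unhandled event types.
function parseServerEvent(raw: string): ServerEvent | null {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (error) {
    return null;
  }
  if (typeof data !== "object" || data === null) {
    return null;
  }
  const type = (data as { type?: unknown }).type;
  if (type === "ping" || type === "agent_response") {
    // Assumption: once the type tag matches, the payload has the declared shape.
    return data as ServerEvent;
  }
  return null;
}

// Build the pong reply for a ping event.
function buildPong(event: PingEvent): { type: "pong"; event_id: number } {
  return { type: "pong", event_id: event.ping_event.event_id };
}

// The switch narrows `event` to the matching member of the union in each case.
function describeEvent(event: ServerEvent): string {
  switch (event.type) {
    case "ping":
      return `ping ${event.ping_event.event_id}`;
    case "agent_response":
      return `agent said: ${event.agent_response_event.agent_response}`;
  }
}
```

A production parser would also validate the payload fields instead of trusting the type tag alone.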

3. Create WebSocket hook

Create a custom hook to manage the WebSocket connection:

app/hooks/useAgentConversation.ts
'use client';

import { useCallback, useEffect, useRef, useState } from 'react';
import { useVoiceStream } from 'voice-stream';
import type { ElevenLabsWebSocketEvent } from '../types/websocket';

// Serialize and send a message; silently skip if the socket is not open.
const sendMessage = (websocket: WebSocket, request: object) => {
  if (websocket.readyState !== WebSocket.OPEN) {
    return;
  }
  websocket.send(JSON.stringify(request));
};

export const useAgentConversation = () => {
  // Typed as nullable so the ref can be assigned on connect and cleared on close.
  const websocketRef = useRef<WebSocket | null>(null);
  const [isConnected, setIsConnected] = useState<boolean>(false);

  const { startStreaming, stopStreaming } = useVoiceStream({
    onAudioChunked: (audioData) => {
      if (!websocketRef.current) return;
      sendMessage(websocketRef.current, {
        user_audio_chunk: audioData,
      });
    },
  });

  const startConversation = useCallback(async () => {
    if (isConnected) return;

    // Public agent: pass the agent ID in the URL.
    // For private agents, use a signed URL obtained from your server instead.
    const websocket = new WebSocket(
      "wss://api.elevenlabs.io/v1/convai/conversation?agent_id=<your-agent-id>"
    );

    websocket.onopen = async () => {
      setIsConnected(true);
      sendMessage(websocket, {
        type: "conversation_initiation_client_data",
      });
      await startStreaming();
    };

    websocket.onmessage = async (event) => {
      const data = JSON.parse(event.data) as ElevenLabsWebSocketEvent;

      // Handle ping events to keep connection alive
      if (data.type === "ping") {
        setTimeout(() => {
          sendMessage(websocket, {
            type: "pong",
            event_id: data.ping_event.event_id,
          });
          // ping_ms is optional; reply immediately when it is absent
        }, data.ping_event.ping_ms ?? 0);
      }

      if (data.type === "user_transcript") {
        const { user_transcription_event } = data;
        console.log("User transcript", user_transcription_event.user_transcript);
      }

      if (data.type === "agent_response") {
        const { agent_response_event } = data;
        console.log("Agent response", agent_response_event.agent_response);
      }

      if (data.type === "agent_response_correction") {
        const { agent_response_correction_event } = data;
        console.log("Agent response correction", agent_response_correction_event.corrected_agent_response);
      }

      if (data.type === "interruption") {
        // Handle interruption
      }

      if (data.type === "audio") {
        const { audio_event } = data;
        // Implement your own audio playback system here
        // Note: You'll need to handle audio queuing to prevent overlapping
        // as the WebSocket sends audio events in chunks
      }

      if (data.type === "agent_chat_response_part") {
        const { text_response_part } = data;
        const { type: partType, text, event_id } = text_response_part;
        // Handle streaming text chunks during text-only conversations
        console.log("Chat response part:", partType, text, event_id);
      }
    };

    websocketRef.current = websocket;

    websocket.onclose = async () => {
      websocketRef.current = null;
      setIsConnected(false);
      stopStreaming();
    };
  }, [startStreaming, isConnected, stopStreaming]);

  const stopConversation = useCallback(async () => {
    if (!websocketRef.current) return;
    websocketRef.current.close();
  }, []);

  // Close the socket when the component using this hook unmounts.
  useEffect(() => {
    return () => {
      if (websocketRef.current) {
        websocketRef.current.close();
      }
    };
  }, []);

  return {
    startConversation,
    stopConversation,
    isConnected,
  };
};
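
The `audio` branch above leaves playback open. One common way to avoid overlapping chunks with the Web Audio API is to schedule each chunk to start where the previous one ends. The sketch below covers only that scheduling arithmetic, with no Web Audio calls, so it can run anywhere. `PlaybackScheduler` is a hypothetical name, and decoding `audio_base_64` into a buffer of known duration is assumed to happen elsewhere.

```typescript
class PlaybackScheduler {
  // Time (in seconds, on the audio clock) at which the last scheduled chunk ends.
  private nextStartTime = 0;

  // Returns when a chunk should start, given the current clock time and its duration.
  schedule(currentTime: number, durationSeconds: number): number {
    // If the queue has drained, start now rather than at a time in the past.
    const startAt = Math.max(currentTime, this.nextStartTime);
    this.nextStartTime = startAt + durationSeconds;
    return startAt;
  }

  // Drop the queued timeline, for example when an interruption event arrives.
  reset(): void {
    this.nextStartTime = 0;
  }
}
```

With a real `AudioContext`, `currentTime` would come from `audioContext.currentTime` and the returned value would be passed to `AudioBufferSourceNode.start()`.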

4. Create the conversation component

Create a component to use the WebSocket hook:

app/components/Conversation.tsx
'use client';

import { useCallback } from 'react';
import { useAgentConversation } from '../hooks/useAgentConversation';

export function Conversation() {
  const { startConversation, stopConversation, isConnected } = useAgentConversation();

  const handleStart = useCallback(async () => {
    try {
      await navigator.mediaDevices.getUserMedia({ audio: true });
      await startConversation();
    } catch (error) {
      console.error('Failed to start conversation:', error);
    }
  }, [startConversation]);

  return (
    <div className="flex flex-col items-center gap-4">
      <div className="flex gap-2">
        <button
          onClick={handleStart}
          disabled={isConnected}
          className="px-4 py-2 bg-blue-500 text-white rounded disabled:bg-gray-300"
        >
          Start Conversation
        </button>
        <button
          onClick={stopConversation}
          disabled={!isConnected}
          className="px-4 py-2 bg-red-500 text-white rounded disabled:bg-gray-300"
        >
          Stop Conversation
        </button>
      </div>
      <div className="flex flex-col items-center">
        <p>Status: {isConnected ? 'Connected' : 'Disconnected'}</p>
      </div>
    </div>
  );
}

Next steps

  1. Audio Playback: Implement your own audio playback system using the Web Audio API or a library. Queue incoming audio to prevent overlapping playback, because the WebSocket sends audio events in chunks.
  2. Error Handling: Add retry logic and error recovery mechanisms.
  3. UI Feedback: Add visual indicators for voice activity and connection status.
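
For the retry logic in step 2, one widely used approach is exponential backoff with jitter between reconnect attempts. This is a sketch of the delay calculation only. `reconnectDelayMs` is a hypothetical helper, and the 500 ms base and 15 s cap are illustrative assumptions, not values recommended by ElevenLabs.

```typescript
// Hypothetical helper: delay before reconnect attempt number `attempt` (0-based).
function reconnectDelayMs(
  attempt: number,
  baseMs: number = 500,
  maxMs: number = 15000,
  random: () => number = Math.random
): number {
  // Exponential growth, capped so long outages do not produce huge waits.
  const capped = Math.min(maxMs, baseMs * Math.pow(2, attempt));
  // "Full jitter": pick a delay in [0, capped) so clients do not retry in lockstep.
  return Math.floor(random() * capped);
}
```

A caller would wait for the returned delay in the socket's close handler before opening a new connection, and reset the attempt counter after a successful open.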

Latency management

To ensure smooth conversations, implement these strategies:

  • Adaptive Buffering: Adjust audio buffering based on network conditions.
  • Jitter Buffer: Implement a jitter buffer to smooth out variations in packet arrival times.
  • Ping-Pong Monitoring: Use ping and pong events to measure round-trip time and adjust accordingly.
  • Optimized Chunking: Tweak the audio chunk duration to balance latency and efficiency.

Security best practices

  • Rotate API keys regularly and use environment variables to store them.
  • Implement rate limiting to prevent abuse.
  • Clearly explain why microphone access is needed when prompting users for it.
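
One way to approach the rate-limiting recommendation above, for instance on a server endpoint that hands out signed URLs, is a token bucket per client. This is a sketch rather than a prescribed design: the `TokenBucket` class and its parameters are hypothetical, and the current time is passed in explicitly so the behavior is deterministic.

```typescript
class TokenBucket {
  private readonly capacity: number;
  private readonly refillPerSecond: number;
  private tokens: number;
  private lastRefillMs: number;

  constructor(capacity: number, refillPerSecond: number, nowMs: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // Start full so an initial burst is allowed.
    this.lastRefillMs = nowMs;
  }

  // Returns true and consumes one token if a request is allowed at time nowMs.
  tryConsume(nowMs: number): boolean {
    // Refill in proportion to elapsed time, never exceeding capacity.
    const elapsedSeconds = Math.max(0, nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(
      this.capacity,
      this.tokens + elapsedSeconds * this.refillPerSecond
    );
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) {
      return false;
    }
    this.tokens -= 1;
    return true;
  }
}
```

In practice, a server would keep one bucket per user ID or IP address (for example in a `Map`) and call `tryConsume(Date.now())` before requesting a signed URL.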

Additional resources

  • ElevenLabs Agents Documentation
  • ElevenLabs Agents SDKs