This documentation is for developers integrating directly with the ElevenLabs WebSocket API. For convenience, consider using the official SDKs provided by ElevenLabs.

The ElevenLabs Conversational AI WebSocket API enables real-time, interactive voice conversations with AI agents. Over a single WebSocket connection, you send audio input and receive audio responses in real time, creating lifelike conversational experiences.

Endpoint: wss://{agent_id}


Using Agent ID

For public agents, you can use the agent_id directly in the WebSocket URL without additional authentication.


Using a Signed URL

For private agents or conversations requiring authorization, obtain a signed URL from your server, which securely communicates with the ElevenLabs API using your API key.

Example using cURL


curl -X GET "<your-agent-id>" \
     -H "xi-api-key: <your-api-key>"


{
  "signed_url": "wss://<your-agent-id>&token=<token>"
}

Never expose your ElevenLabs API key on the client side.
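
The server-side half of the signed URL flow can be sketched in Python as follows. This is a minimal sketch under assumptions, not an official client: the function names and the ELEVENLABS_API_KEY environment variable name are hypothetical, and the endpoint is passed in as a parameter (use the same URL you would give to cURL).

```python
import json
import os
import urllib.request


def build_signed_url_request(endpoint: str, api_key: str) -> urllib.request.Request:
    """Build the GET request that asks for a signed URL.

    `endpoint` is the signed-URL endpoint for your agent; it is a
    parameter here, not a constant known to this sketch.
    """
    return urllib.request.Request(
        endpoint,
        headers={"xi-api-key": api_key},
        method="GET",
    )


def extract_signed_url(response_body: str) -> str:
    """Pull the `signed_url` field out of the JSON response body."""
    return json.loads(response_body)["signed_url"]


def fetch_signed_url(endpoint: str) -> str:
    """Run on your server only, so the API key never reaches the client."""
    api_key = os.environ["ELEVENLABS_API_KEY"]  # assumed variable name
    request = build_signed_url_request(endpoint, api_key)
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_signed_url(response.read().decode("utf-8"))
```

Your server would then hand only the returned signed URL to the browser or mobile client, which opens the WebSocket with it.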


Client-to-Server Messages

User Audio Chunk

Send audio data from the user to the server.


{
  "user_audio_chunk": "<base64-encoded-audio-data>"
}


  • Audio Format Requirements:

    • PCM 16-bit mono format
    • Base64 encoded
    • Sample rate of 16,000 Hz
  • Recommended Chunk Duration:

    • Send audio chunks approximately every 250 milliseconds (0.25 seconds)
    • This equates to chunks of about 4,000 samples at a 16,000 Hz sample rate
  • Optimizing Latency and Efficiency:

    • Balance Latency and Efficiency: Sending audio chunks every 250 milliseconds offers a good trade-off between responsiveness and network overhead.
    • Adjust Based on Needs:
      • Lower Latency Requirements: Decrease the chunk duration to send smaller chunks more frequently.
      • Higher Efficiency Requirements: Increase the chunk duration to send larger chunks less frequently.
    • Network Conditions: Adapt the chunk size if you experience network constraints or variability.
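
The chunking arithmetic above (250 ms at 16,000 Hz is 4,000 samples, or 8,000 bytes of 16-bit mono PCM) can be sketched like this. The helper names are hypothetical, and the sketch assumes you already hold raw PCM bytes in the required format.

```python
import base64
import json
from typing import Iterator

SAMPLE_RATE_HZ = 16_000  # from the format requirements above
BYTES_PER_SAMPLE = 2     # 16-bit PCM, mono


def chunk_size_bytes(chunk_ms: int = 250) -> int:
    """Number of bytes in one chunk of the given duration."""
    samples = SAMPLE_RATE_HZ * chunk_ms // 1000
    return samples * BYTES_PER_SAMPLE


def audio_chunk_messages(pcm: bytes, chunk_ms: int = 250) -> Iterator[str]:
    """Split raw PCM into JSON `user_audio_chunk` messages, one per chunk."""
    size = chunk_size_bytes(chunk_ms)
    for start in range(0, len(pcm), size):
        encoded = base64.b64encode(pcm[start:start + size]).decode("ascii")
        yield json.dumps({"user_audio_chunk": encoded})
```

Passing a smaller chunk_ms trades more messages for lower latency; a larger value does the opposite. The final chunk may be shorter than the rest when the input length is not a multiple of the chunk size.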

Pong Message

Respond to server ping messages by sending a pong message, ensuring the event_id matches the one received in the ping message.


{
  "type": "pong",
  "event_id": 12345
}

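A pong reply can be derived mechanically from the ping it answers. The sketch below assumes the ping layout shown under Message Formats in this document (an event_id nested inside ping_event); make_pong is a hypothetical helper name.

```python
import json


def make_pong(ping_message: str) -> str:
    """Build the pong reply for a raw ping message received from the server."""
    message = json.loads(ping_message)
    if message.get("type") != "ping":
        raise ValueError("not a ping message")
    # Echo back the same event_id so the server can match ping and pong.
    event_id = message["ping_event"]["event_id"]
    return json.dumps({"type": "pong", "event_id": event_id})
```

The returned string is what you would send back over the open WebSocket.
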
Server-to-Client Messages


Conversation Initiation Metadata

Provides initial metadata about the conversation.


{
  "type": "conversation_initiation_metadata",
  "conversation_initiation_metadata_event": {
    "conversation_id": "conv_123456789",
    "agent_output_audio_format": "pcm_16000"
  }
}

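A client typically reads this message once, at the start of the conversation, to learn how to play back the agent's audio. The sketch below assumes the format string has the shape "<encoding>_<sample rate>", which matches the pcm_16000 sample above but is not confirmed here for other values; ConversationInfo and parse_initiation_metadata are hypothetical names.

```python
import json
from dataclasses import dataclass


@dataclass
class ConversationInfo:
    conversation_id: str
    encoding: str
    sample_rate_hz: int


def parse_initiation_metadata(raw: str) -> ConversationInfo:
    """Extract the conversation ID and output audio format from the first message."""
    message = json.loads(raw)
    if message.get("type") != "conversation_initiation_metadata":
        raise ValueError("not a conversation_initiation_metadata message")
    event = message["conversation_initiation_metadata_event"]
    # Assumption: the format is "<encoding>_<rate>", e.g. "pcm_16000".
    encoding, _, rate = event["agent_output_audio_format"].rpartition("_")
    return ConversationInfo(event["conversation_id"], encoding, int(rate))
```
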
Other Server-to-Client Messages

  • user_transcript: Transcriptions of the user’s speech
  • agent_response: Agent’s textual response
  • audio: Chunks of the agent’s audio response
  • interruption: Indicates that the agent’s response was interrupted
  • ping: Server pings to measure latency

Message Formats


{
  "type": "user_transcript",
  "user_transcription_event": {
    "user_transcript": "Hello, how are you today?"
  }
}


{
  "type": "agent_response",
  "agent_response_event": {
    "agent_response": "Hello! I'm doing well, thank you for asking. How can I assist you today?"
  }
}


{
  "type": "audio",
  "audio_event": {
    "audio_base_64": "SGVsbG8sIHRoaXMgaXMgYSBzYW1wbGUgYXVkaW8gY2h1bms=",
    "event_id": 67890
  }
}


{
  "type": "interruption",
  "interruption_event": {
    "event_id": 54321
  }
}


{
  "type": "internal_tentative_agent_response",
  "tentative_agent_response_internal_event": {
    "tentative_agent_response": "I'm thinking about how to respond..."
  }
}


{
  "type": "ping",
  "ping_event": {
    "event_id": 13579,
    "ping_ms": 50
  }
}

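Because every server message carries a type field and a matching event object, a client can route messages with a small lookup table. The following is a minimal sketch built only from the sample messages above; parse_server_message and EVENT_KEYS are hypothetical names, and unknown types are passed through with an empty event rather than rejected.

```python
import base64
import json
from typing import Any, Dict, Tuple

# Maps each message type to the key that holds its payload,
# taken from the sample messages in this document.
EVENT_KEYS = {
    "user_transcript": "user_transcription_event",
    "agent_response": "agent_response_event",
    "audio": "audio_event",
    "interruption": "interruption_event",
    "internal_tentative_agent_response": "tentative_agent_response_internal_event",
    "ping": "ping_event",
}


def parse_server_message(raw: str) -> Tuple[str, Dict[str, Any]]:
    """Return (message type, event payload) for a raw server message."""
    message = json.loads(raw)
    message_type = message["type"]
    event = message.get(EVENT_KEYS.get(message_type, ""), {})
    if message_type == "audio":
        # Decode the base64 payload so callers get raw audio bytes.
        event = {**event, "audio_bytes": base64.b64decode(event["audio_base_64"])}
    return message_type, event
```

A receive loop would call this on each incoming text frame and branch on the returned type, for example queueing audio_bytes for playback or clearing the queue on an interruption.
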
Latency Management

To ensure smooth conversations, implement these strategies:

  • Adaptive Buffering: Adjust audio buffering based on network conditions.
  • Jitter Buffer: Implement a jitter buffer to smooth out variations in packet arrival times.
  • Ping-Pong Monitoring: Use ping and pong events to measure round-trip time and adjust accordingly.
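
One way to combine these strategies is to smooth the latency samples and size the playback buffer from the observed jitter. The sketch below is illustrative only: LatencyTracker is a hypothetical class, feeding it the ping_ms value from each ping event is an assumption about how that field is meant to be used, and the smoothing weight and the 4x jitter multiplier are arbitrary starting points rather than recommended values.

```python
from typing import Optional


class LatencyTracker:
    """Smooths latency samples and suggests a playback buffer size."""

    def __init__(self, alpha: float = 0.2) -> None:
        self.alpha = alpha                        # weight given to the newest sample
        self.smoothed_ms: Optional[float] = None  # moving average of latency
        self.jitter_ms = 0.0                      # moving average of deviation

    def update(self, sample_ms: float) -> None:
        """Fold one latency sample into the running averages."""
        if self.smoothed_ms is None:
            self.smoothed_ms = float(sample_ms)
            return
        deviation = abs(sample_ms - self.smoothed_ms)
        self.jitter_ms += self.alpha * (deviation - self.jitter_ms)
        self.smoothed_ms += self.alpha * (sample_ms - self.smoothed_ms)

    def suggested_buffer_ms(self, floor_ms: float = 40.0) -> float:
        """Heuristic: the buffer grows with jitter and never drops below floor_ms."""
        return max(floor_ms, 4.0 * self.jitter_ms)
```

On a stable connection the suggestion stays at the floor; when samples start to vary, the jitter average rises and the suggested buffer grows with it.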

Security Best Practices

  • Rotate API keys regularly and use environment variables to store them.
  • Implement rate limiting to prevent abuse.
  • Clearly explain the intention when prompting users for microphone access.
  • Optimized Chunking: Tweak the audio chunk duration to balance latency and efficiency.
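
Rate limiting can take many forms; a token bucket is one common choice and is shown here as a sketch, not as the approach ElevenLabs prescribes. TokenBucket is a hypothetical class, and the clock is injectable so the behavior can be checked without waiting in real time. You might apply it, for example, to the server route that issues signed URLs.

```python
import time
from typing import Callable


class TokenBucket:
    """Allows bursts up to `capacity`, refilling at `rate_per_s` tokens per second."""

    def __init__(
        self,
        rate_per_s: float,
        capacity: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate_per_s = rate_per_s
        self.capacity = capacity
        self.clock = clock
        self.tokens = capacity   # start with a full bucket
        self.updated = clock()   # time of the last refill

    def allow(self, cost: float = 1.0) -> bool:
        """Return True and spend tokens if the request fits, else False."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate_per_s)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

Keeping one bucket per client (keyed by user ID or IP address, whichever your server trusts) limits abuse without throttling everyone at once.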
