Text to Speech
- Eleven v3 (alpha): Released Eleven v3 (alpha), our most expressive Text to Speech model, as a research preview.
Conversational AI
- Custom voice settings in multi-voice: Added support for configuring individual voice settings per supported voice in multi-voice agents, allowing fine-tuned control over stability, speed, similarity boost, and streaming latency for each voice.
- Silent transfer to human in Twilio: Added backend configuration support for silent (cold) transfer to human in the Twilio native integration, enabling seamless handoff without announcing the transfer to callers.
- Batch calling retry and cancel: Added support for retrying outbound calls to phone numbers that did not respond during a batch call, along with the ability to cancel ongoing batch operations for better campaign management.
- LLM pinning: Added support for versioned LLM models with explicit checkpoint identifiers.
- Custom LLM headers: Added support for passing custom headers to custom LLMs.
- Fixed issue in non-Latin languages: Fixed an issue causing some conversations in non-Latin-alphabet languages to fail.
SDKs
- Python SDK v2.3.0: Released Python SDK v2.3.0
- JavaScript SDK v2.2.0: Released JavaScript SDK v2.2.0
API
View API changes
New Endpoints
Conversational AI
- Batch Calling:
  - Cancel batch call - Cancel a running batch call and set all recipients to cancelled status
  - Retry batch call - Retry a batch call by setting completed recipients back to pending status
- Knowledge Base RAG:
  - Get document RAG indexes - Get information about all RAG indexes of a knowledge base document
  - Delete document RAG index - Delete a specific RAG index for a knowledge base document
  - RAG index overview - Get total size and information of RAG indexes used by knowledge base documents
Workspace
- Update user auto provisioning - Update user auto provisioning settings for the workspace
Updated Endpoints
Conversational AI
- Supported Voices:
  - Agent configuration - Added `optimize_streaming_latency`, `stability`, `speed`, and `similarity_boost` parameters for per-voice TTS customization
- Transfer to Human:
  - Agent configuration - Added `enable_client_message` parameter to control whether a message is played to the client during transfer
- Knowledge Base:
  - Knowledge base documents now use `supported_usages` instead of `prompt_injectable` for better usage mode control
  - RAG index creation now returns an enhanced response model with usage information
- Custom LLM:
  - Agent configuration - Added `request_headers` parameter for custom header configuration
- Widget Configuration:
  - Agent platform settings - Added comprehensive `styles` configuration for widget appearance customization
- LLM:
  - Added support for versioned LLM models with explicit version identifiers
Conversational AI
- Multi-voice support for agents: Enable conversational AI agents to dynamically switch between different voices during conversations for multi-character storytelling, language tutoring, and role-playing scenarios.
- Claude Sonnet 4 support: Added Claude Sonnet 4 as a new LLM option for conversational agents, providing enhanced reasoning capabilities and improved performance.
- Genesys Cloud integration: Introduced AudioHook Protocol integration for seamless connection with Genesys Cloud contact center platform.
- Force delete knowledge base documents: Added `force` parameter to knowledge base document deletion, allowing removal of documents even when used by agents (see the sketch after this list).
- Multimodal widget: Added text input and text-only mode defaults for better user experience with improved widget configuration.
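A minimal sketch of the force deletion described above, using the `requests` library against the knowledge base document path shown later in this changelog (`/v1/convai/knowledge-base/{documentation_id}`). The exact route and response shape may differ, so treat this as an illustration rather than a reference.

```python
import requests

API_KEY = "YOUR_XI_API_KEY"   # your ElevenLabs API key
DOCUMENT_ID = "doc_123"       # hypothetical knowledge base document id

# Force-delete the document even if agents still reference it.
response = requests.delete(
    f"https://api.elevenlabs.io/v1/convai/knowledge-base/{DOCUMENT_ID}",
    headers={"xi-api-key": API_KEY},
    params={"force": "true"},  # without force=true, deletion may be rejected while agents use the document
)
response.raise_for_status()
print("Document deleted")
```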
API
View API changes
Updated Endpoints
Speech to Text
- Create transcript - Added `webhook` parameter for asynchronous processing with webhook delivery
Conversational AI
- Knowledge Base:
  - Delete knowledge base document - Added `force` query parameter to delete documents regardless of agent dependencies
- Widget:
  - Widget configuration - Added text input and text-only mode support for multi-modality
Forced Alignment
- Forced alignment improvements: Fixed a rare failure case in forced alignment processing to improve reliability.
Voices
- Live moderated voices filter: Added `include_live_moderated` query parameter to the shared voices endpoint, allowing you to include or exclude voices that are live moderated.
Conversational AI
- Secret dynamic variables: Added support for specifying dynamic variables as secrets with the `secret__` prefix. Secret dynamic variables can only be used in webhook tool headers and are never sent to an LLM, enhancing security for sensitive data. Learn more.
- Skip turn system tool: Introduced a new system tool called `skip_turn`. When enabled, the agent will skip its turn if the user explicitly indicates they need a moment to think or perform an action (e.g., "just a sec", "give me a minute"). This prevents turn timeout from being triggered during intentional user pauses. See the skip turn tool docs for more information.
- Text input support: Added text input support in websocket connections via the "user_message" event with a text field. Also added "user_activity" event support to indicate typing or other UI activity, improving agent turn-taking when there's interleaved text and audio input (see the sketch after this list).
- RAG chunk limit: Added ability to configure the maximum number of chunks collected during RAG retrieval, giving users more control over context window usage and costs.
- Enhanced widget configuration: Expanded widget customization options to include text input and text only mode.
- LLM usage calculator: Introduced tools to calculate expected LLM token usage and costs for agents, helping with cost estimation and planning.
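A rough sketch of the text input and activity events mentioned above, using the `websockets` package. The websocket URL, authentication, and exact event payloads are assumptions based on the event names in this entry ("user_message" with a text field, "user_activity"); check the Conversational AI websocket reference for the canonical schema.

```python
import asyncio
import json

import websockets  # pip install websockets

AGENT_ID = "your_agent_id"  # hypothetical agent id
# Assumed websocket endpoint; private agents may need a signed URL instead.
URI = f"wss://api.elevenlabs.io/v1/convai/conversation?agent_id={AGENT_ID}"

async def main() -> None:
    async with websockets.connect(URI) as ws:
        # Signal typing/UI activity so the agent holds its turn.
        await ws.send(json.dumps({"type": "user_activity"}))

        # Send a text message instead of audio.
        await ws.send(json.dumps({"type": "user_message", "text": "What's on my schedule today?"}))

        # Read a few server events (transcripts, agent responses, audio, ...).
        for _ in range(5):
            event = json.loads(await ws.recv())
            print(event.get("type"), event)

asyncio.run(main())
```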
Audio Native
- Accessibility improvements: Enhanced accessibility for the AudioNative player with multiple improvements:
- Added aria-labels for all buttons
- Enabled keyboard navigation for all interactive elements
- Made progress bar handle focusable and keyboard-accessible
- Improved focus indicator visibility for better screen reader compatibility
API
View API changes
New Endpoints
- Added 3 new endpoints:
- Get Agent Knowledge Base Size - Returns the number of pages in the agent’s knowledge base.
- Calculate Agent LLM Usage - Calculates expected number of LLM tokens needed for the specified agent.
- Calculate LLM Usage - Returns a list of LLM models and the expected cost for using them based on the provided values.
Updated Endpoints
Voices
- Get Shared Voices - Added `include_live_moderated` query parameter to `GET /v1/shared-voices` to filter voices by live moderation status.
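As a quick illustration of the new filter, a sketch using `requests` (the pagination parameter and response fields are assumptions and simplified):

```python
import requests

response = requests.get(
    "https://api.elevenlabs.io/v1/shared-voices",
    headers={"xi-api-key": "YOUR_XI_API_KEY"},
    params={
        "include_live_moderated": "false",  # exclude live-moderated voices from the results
        "page_size": 10,                    # assumed pagination parameter
    },
)
response.raise_for_status()
for voice in response.json().get("voices", []):
    print(voice.get("voice_id"), voice.get("name"))
```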
Conversational AI
- Agent Configuration:
  - Enhanced system tools with new `skip_turn` tool configuration
  - Improved RAG configuration with `max_retrieved_rag_chunks_count` parameter
- Widget Configuration:
  - Added support for text-only mode
- Batch Calling:
  - Batch call responses now include `phone_provider` field with default value "twilio"
Text to Speech
- Voice Settings:
  - Added `quality` parameter to voice settings for controlling audio generation quality
  - Model response schema updated to include `can_use_quality` field
SDKs
- SDKs V2: Released new v2 SDKs for both Python and JavaScript
Speech to Text
- Speech to text logprobs: The Speech to Text response now includes a `logprob` field for word prediction confidence.
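A sketch of reading the new confidence signal. The exact placement of `logprob` in the transcription response (top-level vs. per-word) and the model name are assumptions here, so adapt to the actual schema.

```python
import requests

# Transcribe a local file, then inspect the logprob confidence field.
with open("meeting.mp3", "rb") as audio:
    response = requests.post(
        "https://api.elevenlabs.io/v1/speech-to-text",
        headers={"xi-api-key": "YOUR_XI_API_KEY"},
        data={"model_id": "scribe_v1"},  # assumed model name
        files={"file": audio},
    )
response.raise_for_status()
result = response.json()

print(result.get("text"))
# Assumed shape: each word entry carries its own logprob.
for word in result.get("words", []):
    print(word.get("text"), word.get("logprob"))
```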
Billing
- Improved API error messages: Enhanced API error messages for subscriptions with failed payments. This provides clearer information if a failed payment has caused a user to reach their quota threshold sooner than expected.
Conversational AI
- Batch calls: Released new batch calling functionality, which allows you to automate groups of outbound calls.
- Increased evaluation criteria limit: The maximum number of evaluation criteria for agent performance evaluation has been increased from 5 to 10.
- Human-readable IDs: Introduced human-readable IDs for key Conversational AI entities (e.g., agents, conversations). This improves usability and makes resources easier to identify and manage through the API and UI.
- Unanswered call tracking: ‘Not Answered’ outbound calls are now reliably detected and visible in the conversation history.
- LLM cost visibility in dashboard: The Conversational AI dashboard now displays the total and per-minute average LLM costs.
- Zero retention mode (ZRM) for agents: Allowed enabling Zero Retention Mode (ZRM) per agent.
- Dynamic variables in headers: Added the option of setting a dynamic variable as a header value for tools.
- Customizable tool timeouts: Added support for setting a different timeout duration per tool.
Workspaces
- Simplified secret updates: Workspace secrets can now be updated more granularly using a `PATCH` request via the API, simplifying the management of individual secret values. For technical details, please see the API changes section below.
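A hedged sketch of the granular secret update. The endpoint is listed below as "Update Convai Workspace Secret", but the path and request body shown here are hypothetical and for illustration only.

```python
import requests

SECRET_ID = "secret_abc123"  # hypothetical secret id
# Hypothetical path for the "Update Convai Workspace Secret" endpoint.
url = f"https://api.elevenlabs.io/v1/convai/secrets/{SECRET_ID}"

response = requests.patch(
    url,
    headers={"xi-api-key": "YOUR_XI_API_KEY"},
    json={"name": "crm_api_key", "value": "new-rotated-value"},  # assumed body fields
)
response.raise_for_status()
```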
API
View API changes
New Endpoints
- Added 6 new endpoints:
- Get Signed Url - Get a signed URL to start a conversation with an agent that requires authorization.
- Simulate Conversation - Run a conversation between an agent and a simulated user.
- Simulate Conversation (Stream) - Run and stream a conversation simulation between an agent and a simulated user.
- Update Convai Workspace Secret - Update an existing secret for the Convai workspace.
- Submit Batch Call Request - Submit a batch call request to schedule calls for multiple recipients.
- Get All Batch Calls for Workspace - Retrieve all batch calls for the current workspace.
Updated Endpoints
Conversational AI
- Agents & Conversations:
  - Endpoint `GET /v1/convai/conversation/get_signed_url` (snake_case path) has been deprecated. Use the new `GET /v1/convai/conversation/get-signed-url` (kebab-case path) instead.
- Phone Numbers:
  - Get Phone Number Details - Response schema for `GET /v1/convai/phone-numbers/{phone_number_id}` updated to distinguish `Twilio` and `SIPTrunk` provider details.
  - Update Phone Number - Response schema for `PATCH /v1/convai/phone-numbers/{phone_number_id}` updated similarly for `Twilio` and `SIPTrunk`.
  - List Phone Numbers - Response schema for `GET /v1/convai/phone-numbers/` list items updated for `Twilio` and `SIPTrunk` providers.
Text To Speech
- Text to Speech Endpoints - Default `model_id` changed from `eleven_monolingual_v1` to `eleven_multilingual_v2` for the following endpoints:
  - `POST /v1/text-to-speech/{voice_id}/stream`
  - `POST /v1/text-to-speech/{voice_id}/stream-with-timestamps`
  - `POST /v1/text-to-speech/{voice_id}`
  - `POST /v1/text-to-speech/{voice_id}/with-timestamps`
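If you depend on the previous behavior, pin `model_id` explicitly in your requests. A minimal sketch with `requests` (voice ID, API key, and text are placeholders):

```python
import requests

VOICE_ID = "YOUR_VOICE_ID"

response = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": "YOUR_XI_API_KEY"},
    json={
        "text": "Hello from the changelog.",
        # Pin the model so the new default (eleven_multilingual_v2) does not
        # silently change your output; omit this field to pick up the new default.
        "model_id": "eleven_monolingual_v1",
    },
)
response.raise_for_status()
with open("output.mp3", "wb") as f:
    f.write(response.content)
```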
Voices
- Get Shared Voices - Added `include_custom_rates` query parameter to `GET /v1/shared-voices`.
- Schema Updates: `LibraryVoiceResponseModel` and `VoiceSharingResponseModel` now include an optional `fiat_rate` field (USD per 1000 credits).
Billing
- Downgraded Plan Pricing Fix: Fixed an issue where customers with downgraded subscriptions were shown their current price instead of the correct future price.
Conversational AI
- Edit Knowledge Base Document Names: You can now edit the names of knowledge base documents. See: Knowledge Base
- Conversation Simulation: Released a new endpoint that allows you to test an agent over text.
Studio
- Export Paragraphs as Zip: Added support for exporting separated paragraphs in a zip file. See: Studio
SDKs
- Released new SDKs:
API
View API changes
New Endpoints
- Update metadata for a speaker - `PATCH /v1/dubbing/resource/{dubbing_id}/speaker/{speaker_id}` - Amend the metadata associated with a speaker, such as their voice. Both voice cloning and using voices from the ElevenLabs library are supported.
- Search similar voices for a speaker - `GET /v1/dubbing/resource/{dubbing_id}/speaker/{speaker_id}/similar-voices` - Fetch the top 10 similar voices to a speaker, including IDs, names, descriptions, and sample audio.
- Simulate a conversation - `POST /v1/convai/agents/{agent_id}/simulate_conversation` - Run a conversation between the agent and a simulated user.
- Simulate a conversation (stream) - `POST /v1/convai/agents/{agent_id}/simulate_conversation/stream` - Stream a simulated conversation between the agent and a simulated user.
- Handle outbound call via SIP trunk - `POST /v1/convai/sip-trunk/outbound-call` - Initiate an outbound call using SIP trunking.
Updated Endpoints
- List conversations - `GET /v1/convai/conversations` - Added `call_start_after_unix` query parameter to filter conversations by start date.
- Update knowledge base document - `PATCH /v1/convai/knowledge-base/{documentation_id}` - Now supports updating the name of a document.
- Text to Speech endpoints - The default model for all TTS endpoints is now `eleven_multilingual_v2` (was `eleven_monolingual_v1`).
Removed Endpoints
- None.
Dubbing
- Disable Voice Cloning: Added an option in the Dubbing Studio UI to disable voice cloning when uploading audio, aligning with the existing `disable_voice_cloning` API parameter.
Billing
- Quota Exceeded Error: Improved error messaging for exceeding character limits. Users attempting to generate audio beyond their quota within a short billing window will now receive a clearer `401 unauthorized: This request exceeds your quota limit of...` error message indicating the limit has been exceeded.
SDKs
- Released new SDKs: Added ElevenLabs Python v1.58.0 and ElevenLabs JS v1.58.0 to fix a breaking change that had been mistakenly shipped
Conversational AI
- Custom Dashboard Charts: The Conversational AI Dashboard can now be extended with custom charts displaying the results of evaluation criteria over time. See the new GET and PATCH endpoints for managing dashboard settings.
- Call History Filtering: Added the ability to filter the call history by start date using the new `call_start_before_unix` parameter in the List Conversations endpoint. Try it here.
- Server Tools: Added the option of making PUT requests in server tools.
- Transfer to human: Added call forwarding functionality to support forwarding to operators, see docs here.
- Language detection: Fixed an issue where the language detection system tool would trigger when a user replied "yes" in a non-English language.
Usage Analytics
- Custom Aggregation: Added an optional `aggregation_interval` parameter to the Get Usage Metrics endpoint to control the interval over which to aggregate character usage (hour, day, week, month, or cumulative).
- New Metric Breakdowns: The Usage Analytics section now supports additional metric breakdowns including `minutes_used`, `request_count`, `ttfb_avg`, and `ttfb_p95`, selectable via the new `metric` parameter in the Get Usage Metrics endpoint. Furthermore, you can now get a breakdown and filter by `request_queue`.
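A sketch of the two new parameters against the Get Usage Metrics endpoint (`GET /v1/usage/character-stats`, listed in the API changes below). The time-range parameter names are assumptions; only `aggregation_interval` and `metric` come from this entry.

```python
import time

import requests

now_ms = int(time.time() * 1000)
thirty_days_ms = 30 * 24 * 60 * 60 * 1000

response = requests.get(
    "https://api.elevenlabs.io/v1/usage/character-stats",
    headers={"xi-api-key": "YOUR_XI_API_KEY"},
    params={
        "start_unix": now_ms - thirty_days_ms,  # assumed name for the range start
        "end_unix": now_ms,                     # assumed name for the range end
        "aggregation_interval": "week",         # hour | day | week | month | cumulative
        "metric": "ttfb_p95",                   # e.g. minutes_used, request_count, ttfb_avg
    },
)
response.raise_for_status()
print(response.json())
```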
API
View API changes
New Endpoints
- Added 2 new endpoints for managing Conversational AI dashboard settings:
- Get Dashboard Settings - Retrieves custom chart configurations for the ConvAI dashboard.
- Update Dashboard Settings - Updates custom chart configurations for the ConvAI dashboard.
Updated Endpoints
Audio Generation (TTS, S2S, SFX, Voice Design)
- Updated endpoints to support new `output_format` option `pcm_48000`:
  - Text to Speech (`POST /v1/text-to-speech/{voice_id}`)
  - Text to Speech with Timestamps (`POST /v1/text-to-speech/{voice_id}/with-timestamps`)
  - Text to Speech Stream (`POST /v1/text-to-speech/{voice_id}/stream`)
  - Text to Speech Stream with Timestamps (`POST /v1/text-to-speech/{voice_id}/stream/with-timestamps`)
  - Speech to Speech (`POST /v1/speech-to-speech/{voice_id}`)
  - Speech to Speech Stream (`POST /v1/speech-to-speech/{voice_id}/stream`)
  - Sound Generation (`POST /v1/sound-generation`)
  - Create Voice Previews (`POST /v1/text-to-voice/create-previews`)
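For example, requesting the new 48 kHz PCM output from Text to Speech. This is a sketch: `output_format` is shown as a query parameter here, which is an assumption about the exact request shape.

```python
import requests

VOICE_ID = "YOUR_VOICE_ID"

response = requests.post(
    f"https://api.elevenlabs.io/v1/text-to-speech/{VOICE_ID}",
    headers={"xi-api-key": "YOUR_XI_API_KEY"},
    params={"output_format": "pcm_48000"},  # new 48 kHz raw PCM option
    json={"text": "48 kHz PCM example", "model_id": "eleven_multilingual_v2"},
)
response.raise_for_status()
with open("output_48k.pcm", "wb") as f:  # raw PCM, no container or header
    f.write(response.content)
```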
Usage Analytics
- Updated usage metrics endpoint:
  - Get Usage Metrics (`GET /v1/usage/character-stats`) - Added optional `aggregation_interval` and `metric` query parameters.
Conversational AI
- Updated conversation listing endpoint:
  - List Conversations (`GET /v1/convai/conversations`) - Added optional `call_start_before_unix` query parameter for filtering by start date.
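A small sketch of the new filter using `requests`; the response field names used below are assumptions and pagination is omitted.

```python
import time

import requests

one_week_ago = int(time.time()) - 7 * 24 * 60 * 60

response = requests.get(
    "https://api.elevenlabs.io/v1/convai/conversations",
    headers={"xi-api-key": "YOUR_XI_API_KEY"},
    params={"call_start_before_unix": one_week_ago},  # only conversations started before this time
)
response.raise_for_status()
for convo in response.json().get("conversations", []):
    print(convo.get("conversation_id"), convo.get("start_time_unix_secs"))
```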
Schema Changes
Conversational AI
- Added detailed LLM usage and pricing information to conversation charging and history models.
- Added `tool_latency_secs` to tool result schemas
- Added `access_info` to `GET /v1/convai/agents/{agent_id}`
Professional Voice Cloning (PVC)
- PVC API: Introduced a comprehensive suite of API endpoints for managing Professional Voice Clones (PVC). You can now programmatically create voices, add/manage/delete audio samples, retrieve audio/waveforms, manage speaker separation, handle verification, and initiate training. For a full list of new endpoints check the API changes summary below or read the PVC API reference here.
Speech to Text
- Enhanced Export Options: Added options to include or exclude timestamps and speaker IDs when exporting Speech to Text results in segmented JSON format via the API.
Conversational AI
- New LLM Models: Added support for new GPT-4.1 models: `gpt-4.1`, `gpt-4.1-mini`, and `gpt-4.1-nano`, see here.
- VAD Score: Added a new client event which sends VAD scores to the client, see reference here.
Workspace
- Member Management: Added a new API endpoint to allow administrators to delete workspace members here
API
View API changes
New Endpoints
- Added 16 new endpoints:
- Delete Member - Allows deleting workspace members.
- Create PVC Voice - Creates a new PVC voice.
- Edit PVC Voice - Edits PVC voice metadata.
- Add Samples To PVC Voice - Adds audio samples to a PVC voice.
- Update PVC Voice Sample - Updates a PVC voice sample (noise removal, speaker selection, trimming).
- Delete PVC Voice Sample - Deletes a sample from a PVC voice.
- Retrieve Voice Sample Audio - Retrieves audio for a PVC voice sample.
- Retrieve Voice Sample Visual Waveform - Retrieves the visual waveform for a PVC voice sample.
- Retrieve Speaker Separation Status - Gets the status of speaker separation for a sample.
- Start Speaker Separation - Initiates speaker separation for a sample.
- Retrieve Separated Speaker Audio - Retrieves audio for a specific separated speaker.
- Get PVC Voice Captcha - Gets the captcha for PVC voice verification.
- Verify PVC Voice Captcha - Submits captcha verification for a PVC voice.
- Run PVC Training - Starts the training process for a PVC voice.
- Request Manual Verification - Requests manual verification for a PVC voice.
Updated Endpoints
Speech to Text
- Updated endpoint with changes:
  - Create Forced Alignment Task - Added `enabled_spooled_file` parameter to allow streaming large files (`POST /v1/forced-alignment`).
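A sketch of a forced alignment request using the new parameter. The `file` and `text` form fields are assumptions based on the endpoint's purpose (aligning audio with text); only the path and `enabled_spooled_file` come from this entry.

```python
import requests

with open("chapter_01.txt", "r", encoding="utf-8") as f:
    transcript = f.read()

with open("chapter_01.mp3", "rb") as audio:
    response = requests.post(
        "https://api.elevenlabs.io/v1/forced-alignment",
        headers={"xi-api-key": "YOUR_XI_API_KEY"},
        files={"file": audio},               # assumed field name for the audio file
        data={
            "text": transcript,              # assumed field name for the transcript text
            "enabled_spooled_file": "true",  # stream large files instead of buffering them in memory
        },
    )
response.raise_for_status()
print(response.json())  # expected: timing information for aligning text to audio (e.g., for subtitles)
```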
Schema Changes
Conversational AI
- `GET conversation details`: Added `has_audio`, `has_user_audio`, `has_response_audio` boolean fields here
Dubbing
- `GET dubbing resource`: Added `status` field to each render here
Voices
- New PVC flow: Added new flow for Professional Voice Clone creation, try it out here
Conversational AI
- Agent-agent transfer: Added support for agent-to-agent transfers via a new system tool, enabling more complex conversational flows. See the Agent Transfer tool documentation for details.
- Enhanced tool debugging: Improved how tool execution details are displayed in the conversation history for easier debugging.
- Language detection fix: Resolved an issue regarding the forced calling of the language detection tool.
Dubbing
- Render endpoint: Introduced a new endpoint to regenerate audio or video renders for specific languages within a dubbing project. This automatically handles missing transcriptions or translations. See the Render Dub endpoint.
- Increased size limit: Raised the maximum allowed file size for dubbing projects to 1 GiB.
API
View API changes
New Endpoints
- Added render dub endpoint - Regenerate dubs for a specific language.
Updated Endpoints
Pronunciation Dictionaries
- Updated the response for the `GET /v1/pronunciation-dictionaries/{pronunciation_dictionary_id}/` endpoint and related components to include the `permission_on_resource` field.
Speech to Text
- Updated Speech to Text endpoint (`POST /v1/speech-to-text`):
  - Added `cloud_storage_url` parameter to allow transcription directly from public S3 or GCS URLs (up to 2GB).
  - Made the `file` parameter optional; exactly one of `file` or `cloud_storage_url` must now be provided.
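A sketch of transcribing directly from cloud storage with the new parameter (no local `file` upload). The model name and bucket URL are placeholders/assumptions.

```python
import requests

response = requests.post(
    "https://api.elevenlabs.io/v1/speech-to-text",
    headers={"xi-api-key": "YOUR_XI_API_KEY"},
    data={
        "model_id": "scribe_v1",  # assumed model name
        # Public S3/GCS object up to 2 GB; exactly one of file or cloud_storage_url is allowed.
        "cloud_storage_url": "https://storage.googleapis.com/my-bucket/interview.mp3",
    },
)
response.raise_for_status()
print(response.json().get("text"))
```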
Speech to Speech
- Added optional `file_format` parameter (`pcm_s16le_16` or `other`) for lower latency with PCM input to `POST /v1/speech-to-speech/{voice_id}`
Conversational AI
- Updated components to support agent-agent transfer tool
Voices
- Updated the `samples` field of `GET /v1/voices/{voice_id}` to include optional `trim_start` and `trim_end` parameters.
AudioNative
- Updated `GET /v1/audio-native/{project_id}/settings` to include `status` field (`processing` or `ready`).
Speech to text
- `scribe_v1_experimental`: Launched a new experimental preview of the Scribe v1 model with improvements including better performance on audio files with multiple languages, reduced hallucinations when audio is interleaved with silence, and improved audio tags. The new model is available via the API under the model name `scribe_v1_experimental`.
Text to speech
- A-law format support: Added a-law format with 8kHz sample rate to enable integration with European telephony systems.
- Fixed quota issues: Fixed a database bug that caused some requests to be mistakenly rejected as exceeding their quota.
Conversational AI
- Document type filtering: Added support for filtering knowledge base documents by their type (file, URL, or text).
- Non-audio agents: Added support for conversational agents that don’t output audio but still send response transcripts and can use tools. Non-audio agents can be enabled by removing the audio client event.
- Improved agent templates: Updated all agent templates with enhanced configurations and prompts. See more about how to improve system prompts here.
- Fixed stuck exports: Fixed an issue that caused exports to be stuck for extended periods.
Studio
- Fixed volume normalization: Fixed issue with streaming project snapshots when volume normalization is enabled.
New API endpoints
- Forced alignment: Added new forced alignment endpoint for aligning audio with text, perfect for subtitle generation.
- Batch calling: Added batch calling endpoint for scheduling calls to multiple recipients
API
View API changes
New Endpoints
- Added Forced alignment endpoint for aligning audio with text
- Added dedicated endpoints for knowledge base document types:
Updated Endpoints
Text to Speech
- Added a-law format (8kHz) to all audio endpoints:
Voices
- Get voices - Added `collection_id` parameter for filtering voices by collection
Knowledge Base
- Get knowledge base - Added `types` parameter for filtering documents by type
- General endpoint for creating knowledge base documents marked as deprecated in favor of specialized endpoints
User Subscription
- Get user subscription - Added `professional_voice_slots_used` property to track the number of professional voices used in a workspace
Conversational AI
- Added `silence_end_call_timeout` parameter to set maximum wait time before terminating a call
- Removed `/v1/convai/agents/{agent_id}/add-secret` endpoint (now handled by workspace secrets endpoints)