Text to Speech

  • Eleven v3 (alpha): Released Eleven v3 (alpha), our most expressive Text to Speech model, as a research preview.

Conversational AI

  • Custom voice settings in multi-voice: Added support for configuring individual voice settings per supported voice in multi-voice agents, allowing fine-tuned control over stability, speed, similarity boost, and streaming latency for each voice.
  • Silent transfer to human in Twilio: Added backend configuration support for silent (cold) transfer to human in the Twilio native integration, enabling seamless handoff without announcing the transfer to callers.
  • Batch calling retry and cancel: Added support for retrying outbound calls to phone numbers that did not respond during a batch call, along with the ability to cancel ongoing batch operations for better campaign management.
  • LLM pinning: Added support for versioned LLM models with explicit checkpoint identifiers.
  • Custom LLM headers: Added support for passing custom headers to custom LLMs.
  • Fixed issue in non-Latin languages: Fixed an issue that caused some conversations in non-Latin-alphabet languages to fail.

SDKs

API

New Endpoints

Conversational AI

  • Batch Calling:

    • Cancel batch call - Cancel a running batch call and set all recipients to cancelled status
    • Retry batch call - Retry a batch call by setting completed recipients back to pending status
  • Knowledge Base RAG:

Workspace

Updated Endpoints

Conversational AI

  • Supported Voices:

    • Agent configuration - Added optimize_streaming_latency, stability, speed, and similarity_boost parameters for per-voice TTS customization
  • Transfer to Human:

    • Agent configuration - Added enable_client_message parameter to control whether a message is played to the client during transfer
  • Knowledge Base:

    • Knowledge base documents now use supported_usages instead of prompt_injectable for better usage mode control
    • RAG index creation now returns enhanced response model with usage information
  • Custom LLM:

  • Widget Configuration:

  • LLM:

Conversational AI

API

Updated Endpoints

Speech to Text

  • Create transcript - Added webhook parameter for asynchronous processing with webhook delivery

Conversational AI

Forced Alignment

  • Forced alignment improvements: Fixed a rare failure case in forced alignment processing to improve reliability.

Voices

  • Live moderated voices filter: Added include_live_moderated query parameter to the shared voices endpoint, allowing you to include or exclude voices that are live moderated.

Conversational AI

  • Secret dynamic variables: Added support for specifying dynamic variables as secrets with the secret__ prefix. Secret dynamic variables can only be used in webhook tool headers and are never sent to an LLM, enhancing security for sensitive data. Learn more.
  • Skip turn system tool: Introduced a new system tool called skip_turn. When enabled, the agent will skip its turn if the user explicitly indicates they need a moment to think or perform an action (e.g., “just a sec”, “give me a minute”). This prevents turn timeout from being triggered during intentional user pauses. See the skip turn tool docs for more information.
  • Text input support: Added text input support in websocket connections via “user_message” event with text field. Also added “user_activity” event support to indicate typing or other UI activity, improving agent turn-taking when there’s interleaved text and audio input.
  • RAG chunk limit: Added ability to configure the maximum number of chunks collected during RAG retrieval, giving users more control over context window usage and costs.
  • Enhanced widget configuration: Expanded widget customization options to include text input and text only mode.
  • LLM usage calculator: Introduced tools to calculate expected LLM token usage and costs for agents, helping with cost estimation and planning.
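
As a rough illustration of the secret__ rule described above, the sketch below splits a set of dynamic variables into those that may reach the LLM and those reserved for webhook tool headers. The helper name and the example variable names are hypothetical, and the platform applies this rule on its own side; the code only mirrors the documented behavior for clarity.

```python
SECRET_PREFIX = "secret__"  # prefix named in the changelog entry


def split_dynamic_variables(variables: dict) -> tuple:
    """Partition variables into (llm_visible, secret_only).

    Hypothetical helper: mirrors the documented rule that secret__ variables
    are never sent to an LLM and can only be used in webhook tool headers.
    """
    llm_visible = {k: v for k, v in variables.items() if not k.startswith(SECRET_PREFIX)}
    secret_only = {k: v for k, v in variables.items() if k.startswith(SECRET_PREFIX)}
    return llm_visible, secret_only


# Example values are placeholders, not real credentials.
visible, secrets = split_dynamic_variables(
    {"user_name": "Ada", "secret__crm_token": "example-token"}
)
```

Here visible would hold only user_name, while secret__crm_token stays in the second dictionary.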

Audio Native

  • Accessibility improvements: Enhanced accessibility for the AudioNative player with multiple improvements:
    • Added aria-labels for all buttons
    • Enabled keyboard navigation for all interactive elements
    • Made progress bar handle focusable and keyboard-accessible
    • Improved focus indicator visibility for better screen reader compatibility

API

New Endpoints

Updated Endpoints

Voices

  • Get Shared Voices - Added include_live_moderated query parameter to GET /v1/shared-voices to filter voices by live moderation status.
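
A minimal sketch of how the new query parameter might be attached to the shared voices request. The host name and the lowercase "true"/"false" serialization are assumptions, and the helper name is hypothetical; no request is sent.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.elevenlabs.io"  # assumed public API host


def shared_voices_url(include_live_moderated: bool) -> str:
    """Build the GET /v1/shared-voices URL with the include_live_moderated filter."""
    # Assumption: booleans are serialized as lowercase "true"/"false".
    query = urlencode({"include_live_moderated": str(include_live_moderated).lower()})
    return f"{BASE_URL}/v1/shared-voices?{query}"


url = shared_voices_url(False)
```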

Conversational AI

  • Agent Configuration:

    • Enhanced system tools with new skip_turn tool configuration
    • Improved RAG configuration with max_retrieved_rag_chunks_count parameter
  • Widget Configuration:

    • Added support for text-only mode
  • Batch Calling:

    • Batch call responses now include phone_provider field with default value “twilio”

Text to Speech

  • Voice Settings:
    • Added quality parameter to voice settings for controlling audio generation quality
    • Model response schema updated to include can_use_quality field

SDKs

Speech to Text

  • Speech to text logprobs: The Speech to Text response now includes a logprob field for word prediction confidence.
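
One possible use of the logprob field is flagging words the model was unsure about. The sketch below assumes each word in the response is a dictionary with "text" and "logprob" keys and that logprob is a natural-log probability; the response shape, the helper name, and the 0.5 threshold are all assumptions made for illustration.

```python
import math


def low_confidence_words(words: list, min_probability: float = 0.5) -> list:
    """Return the text of words whose probability falls below the threshold.

    Assumes logprob is a natural-log probability, so exp(logprob) lies in (0, 1].
    """
    return [w["text"] for w in words if math.exp(w["logprob"]) < min_probability]


# Hand-written sample data, not real API output.
sample = [
    {"text": "hello", "logprob": -0.05},
    {"text": "wurld", "logprob": -1.6},
]
flagged = low_confidence_words(sample)
```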

Billing

  • Improved API error messages: Enhanced API error messages for subscriptions with failed payments. This provides clearer information if a failed payment has caused a user to reach their quota threshold sooner than expected.

Conversational AI

  • Batch calls: Released new batch calling functionality, which allows you to automate groups of outbound calls.
  • Increased evaluation criteria limit: The maximum number of evaluation criteria for agent performance evaluation has been increased from 5 to 10.
  • Human-readable IDs: Introduced human-readable IDs for key Conversational AI entities (e.g., agents, conversations). This improves usability and makes resources easier to identify and manage through the API and UI.
  • Unanswered call tracking: ‘Not Answered’ outbound calls are now reliably detected and visible in the conversation history.
  • LLM cost visibility in dashboard: The Conversational AI dashboard now displays the total and per-minute average LLM costs.
  • Zero retention mode (ZRM) for agents: Zero Retention Mode (ZRM) can now be enabled per agent.
  • Dynamic variables in headers: Added the option to set a dynamic variable as a header value for tools.
  • Customisable tool timeouts: Added support for setting a different timeout duration for each tool.

Workspaces

  • Simplified secret updates: Workspace secrets can now be updated more granularly using a PATCH request via the API, simplifying the management of individual secret values. For technical details, please see the API changes section below.

API

New Endpoints

Updated Endpoints

Conversational AI

  • Agents & Conversations:
    • Endpoint GET /v1/convai/conversation/get_signed_url (snake_case path) has been deprecated. Use the new GET /v1/convai/conversation/get-signed-url (kebab-case path) instead.
  • Phone Numbers:
    • Get Phone Number Details - Response schema for GET /v1/convai/phone-numbers/{phone_number_id} updated to distinguish between Twilio and SIPTrunk provider details.
    • Update Phone Number - Response schema for PATCH /v1/convai/phone-numbers/{phone_number_id} updated similarly for Twilio and SIPTrunk.
    • List Phone Numbers - Response schema for GET /v1/convai/phone-numbers/ list items updated for Twilio and SIPTrunk providers.
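
For the signed URL deprecation noted above, existing client code may only need a path swap. The sketch below rewrites a URL that uses the deprecated snake_case path and leaves every other URL untouched; the helper name is hypothetical and the agent_id value in the example is a placeholder.

```python
from urllib.parse import urlsplit, urlunsplit

DEPRECATED_PATH = "/v1/convai/conversation/get_signed_url"
CURRENT_PATH = "/v1/convai/conversation/get-signed-url"


def migrate_signed_url(url: str) -> str:
    """Swap the deprecated snake_case path for the kebab-case one, keeping the query string."""
    parts = urlsplit(url)
    if parts.path == DEPRECATED_PATH:
        parts = parts._replace(path=CURRENT_PATH)
    return urlunsplit(parts)


migrated = migrate_signed_url(
    "https://api.elevenlabs.io/v1/convai/conversation/get_signed_url?agent_id=abc"
)
```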

Text To Speech

  • Text to Speech Endpoints - Default model_id changed from eleven_monolingual_v1 to eleven_multilingual_v2 for the following endpoints:
    • POST /v1/text-to-speech/{voice_id}/stream
    • POST /v1/text-to-speech/{voice_id}/stream-with-timestamps
    • POST /v1/text-to-speech/{voice_id}
    • POST /v1/text-to-speech/{voice_id}/with-timestamps
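
Callers that relied on the old default may want to set model_id explicitly rather than depend on the server default. The following is a minimal sketch that only builds the path and JSON body; the helper name is hypothetical, "VOICE_ID" is a placeholder, and the "text" body field is an assumption not stated in this changelog.

```python
from typing import Optional


def build_tts_request(voice_id: str, text: str, model_id: Optional[str] = None) -> tuple:
    """Return (path, JSON body) for POST /v1/text-to-speech/{voice_id}.

    When model_id is None the field is omitted and the server default applies
    (eleven_multilingual_v2 after this change).
    """
    body = {"text": text}  # "text" field name is an assumption
    if model_id is not None:
        body["model_id"] = model_id
    return f"/v1/text-to-speech/{voice_id}", body


# Pin the previous default explicitly to keep the earlier behavior.
path, body = build_tts_request("VOICE_ID", "Hello", model_id="eleven_monolingual_v1")
```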

Voices

  • Get Shared Voices - Added include_custom_rates query parameter to GET /v1/shared-voices.
  • Schema Updates:
    • LibraryVoiceResponseModel and VoiceSharingResponseModel now include an optional fiat_rate field (USD per 1000 credits).

Billing

  • Downgraded Plan Pricing Fix: Fixed an issue where customers with downgraded subscriptions were shown their current price instead of the correct future price.

Conversational AI

  • Edit Knowledge Base Document Names: You can now edit the names of knowledge base documents.
    See: Knowledge Base
  • Conversation Simulation: Released a new endpoint that allows you to test an agent over text.

Studio

  • Export Paragraphs as Zip: Added support for exporting separated paragraphs in a zip file.
    See: Studio

SDKs

API

New Endpoints

  • Update metadata for a speaker
    PATCH /v1/dubbing/resource/{dubbing_id}/speaker/{speaker_id}
    Amend the metadata associated with a speaker, such as their voice. Both voice cloning and using voices from the ElevenLabs library are supported.

  • Search similar voices for a speaker
    GET /v1/dubbing/resource/{dubbing_id}/speaker/{speaker_id}/similar-voices
    Fetch the top 10 similar voices to a speaker, including IDs, names, descriptions, and sample audio.

  • Simulate a conversation
    POST /v1/convai/agents/{agent_id}/simulate_conversation
    Run a conversation between the agent and a simulated user.

  • Simulate a conversation (stream)
    POST /v1/convai/agents/{agent_id}/simulate_conversation/stream
    Stream a simulated conversation between the agent and a simulated user.

  • Handle outbound call via SIP trunk
    POST /v1/convai/sip-trunk/outbound-call
    Initiate an outbound call using SIP trunking.

Updated Endpoints

  • List conversations
    GET /v1/convai/conversations
    Added call_start_after_unix query parameter to filter conversations by start date.

  • Update knowledge base document
    PATCH /v1/convai/knowledge-base/{documentation_id}
    Now supports updating the name of a document.

  • Text to Speech endpoints
    The default model for all TTS endpoints is now eleven_multilingual_v2 (was eleven_monolingual_v1).

Removed Endpoints

  • None.

Dubbing

  • Disable Voice Cloning: Added an option in the Dubbing Studio UI to disable voice cloning when uploading audio, aligning with the existing disable_voice_cloning API parameter.

Billing

  • Quota Exceeded Error: Improved error messaging for exceeding character limits. Users attempting to generate audio beyond their quota within a short billing window will now receive a clearer 401 unauthorized: This request exceeds your quota limit of... error message indicating the limit has been exceeded.

SDKs

Conversational AI

  • Custom Dashboard Charts: The Conversational AI Dashboard can now be extended with custom charts displaying the results of evaluation criteria over time. See the new GET and PATCH endpoints for managing dashboard settings.
  • Call History Filtering: Added the ability to filter the call history by start date using the new call_start_before_unix parameter in the List Conversations endpoint.
  • Server Tools: Added the option to make PUT requests in server tools.
  • Transfer to human: Added call forwarding functionality to support forwarding calls to human operators.
  • Language detection: Fixed an issue where the language detection system tool would trigger when a user replied "yes" in a non-English language.

Usage Analytics

  • Custom Aggregation: Added an optional aggregation_interval parameter to the Get Usage Metrics endpoint to control the interval over which to aggregate character usage (hour, day, week, month, or cumulative).
  • New Metric Breakdowns: The Usage Analytics section now supports additional metric breakdowns including minutes_used, request_count, ttfb_avg, and ttfb_p95, selectable via the new metric parameter in the Get Usage Metrics endpoint. Furthermore, you can now get a breakdown and filter by request_queue.

API

New Endpoints

  • Added 2 new endpoints for managing Conversational AI dashboard settings:

Updated Endpoints

Audio Generation (TTS, S2S, SFX, Voice Design)

Usage Analytics

  • Updated usage metrics endpoint:
    • Get Usage Metrics (GET /v1/usage/character-stats) - Added optional aggregation_interval and metric query parameters.
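
The sketch below shows one way the two new query parameters might be combined into a request path. The interval values come from the list given earlier in this entry; the start_unix and end_unix parameters, the millisecond timestamps in the example, and the helper name are assumptions not stated in this changelog.

```python
from urllib.parse import urlencode

# Interval values listed in the changelog entry for aggregation_interval.
ALLOWED_INTERVALS = {"hour", "day", "week", "month", "cumulative"}


def usage_metrics_path(start_unix: int, end_unix: int,
                       aggregation_interval: str, metric: str) -> str:
    """Build a GET /v1/usage/character-stats path with the new optional parameters."""
    if aggregation_interval not in ALLOWED_INTERVALS:
        raise ValueError(f"unsupported aggregation_interval: {aggregation_interval!r}")
    params = {
        "start_unix": start_unix,  # assumed parameter name
        "end_unix": end_unix,      # assumed parameter name
        "aggregation_interval": aggregation_interval,
        "metric": metric,          # e.g. minutes_used, request_count, ttfb_avg, ttfb_p95
    }
    return "/v1/usage/character-stats?" + urlencode(params)


weekly_p95 = usage_metrics_path(1700000000000, 1700086400000, "week", "ttfb_p95")
```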

Conversational AI

  • Updated conversation listing endpoint:
    • List Conversations (GET /v1/convai/conversations) - Added optional call_start_before_unix query parameter for filtering by start date.
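
Together with the call_start_after_unix parameter listed elsewhere in this changelog, call_start_before_unix can bound a time window. This is a sketch only: it assumes both values are Unix timestamps in seconds, does not address whether the bounds are inclusive, and uses a hypothetical helper name.

```python
from datetime import datetime, timezone
from urllib.parse import urlencode


def conversations_between(start: datetime, end: datetime) -> str:
    """Build a List Conversations path for calls started between start and end."""
    if start > end:
        raise ValueError("start must not be later than end")
    params = {
        "call_start_after_unix": int(start.timestamp()),   # assumed to be seconds
        "call_start_before_unix": int(end.timestamp()),    # assumed to be seconds
    }
    return "/v1/convai/conversations?" + urlencode(params)


january = conversations_between(
    datetime(2025, 1, 1, tzinfo=timezone.utc),
    datetime(2025, 2, 1, tzinfo=timezone.utc),
)
```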

Schema Changes

Conversational AI

Professional Voice Cloning (PVC)

  • PVC API: Introduced a comprehensive suite of API endpoints for managing Professional Voice Clones (PVC). You can now programmatically create voices, add/manage/delete audio samples, retrieve audio/waveforms, manage speaker separation, handle verification, and initiate training. For a full list of new endpoints, check the API changes summary below or read the PVC API reference.

Speech to Text

  • Enhanced Export Options: Added options to include or exclude timestamps and speaker IDs when exporting Speech to Text results in segmented JSON format via the API.

Conversational AI

  • New LLM Models: Added support for the new GPT-4.1 models: gpt-4.1, gpt-4.1-mini, and gpt-4.1-nano.
  • VAD Score: Added a new client event that sends VAD scores to the client.

Workspace

  • Member Management: Added a new API endpoint that allows administrators to delete workspace members.

API

New Endpoints

Updated Endpoints

Speech to Text

  • Updated endpoint with changes:

Schema Changes

Conversational AI

  • GET conversation details: Added has_audio, has_user_audio, and has_response_audio boolean fields.

Dubbing

  • GET dubbing resource: Added status field to each render.

Voices

  • New PVC flow: Added a new flow for Professional Voice Clone creation.

Conversational AI

  • Agent-agent transfer: Added support for agent-to-agent transfers via a new system tool, enabling more complex conversational flows. See the Agent Transfer tool documentation for details.
  • Enhanced tool debugging: Improved how tool execution details are displayed in the conversation history for easier debugging.
  • Language detection fix: Resolved an issue regarding the forced calling of the language detection tool.

Dubbing

  • Render endpoint: Introduced a new endpoint to regenerate audio or video renders for specific languages within a dubbing project. This automatically handles missing transcriptions or translations. See the Render Dub endpoint.
  • Increased size limit: Raised the maximum allowed file size for dubbing projects to 1 GiB.

API

New Endpoints

Updated Endpoints

Pronunciation Dictionaries

Speech to Text

  • Updated Speech to Text endpoint (POST /v1/speech-to-text):
    • Added cloud_storage_url parameter to allow transcription directly from public S3 or GCS URLs (up to 2GB).
    • Made the file parameter optional; exactly one of file or cloud_storage_url must now be provided.
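
The exactly-one rule above can be checked on the client before sending anything. This is a minimal sketch with a hypothetical helper name; it only selects which source field to send, and in a real request the file would be uploaded as form data rather than passed as a path string. The bucket URL in the example is a placeholder.

```python
from typing import Optional


def stt_source_field(file_path: Optional[str] = None,
                     cloud_storage_url: Optional[str] = None) -> dict:
    """Return the single source field to send, enforcing the exactly-one rule."""
    # True when both are missing or both are present; either case is invalid.
    if (file_path is None) == (cloud_storage_url is None):
        raise ValueError("Provide exactly one of file or cloud_storage_url")
    if cloud_storage_url is not None:
        return {"cloud_storage_url": cloud_storage_url}
    return {"file": file_path}


source = stt_source_field(cloud_storage_url="https://example-bucket.s3.amazonaws.com/call.mp3")
```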

Speech to Speech

Conversational AI

Voices

AudioNative

Speech to text

  • scribe_v1_experimental: Launched a new experimental preview of the Scribe v1 model. Improvements include better performance on audio files with multiple languages, fewer hallucinations when audio is interleaved with silence, and improved audio tags. The new model is available via the API under the model name scribe_v1_experimental.

Text to speech

  • A-law format support: Added a-law format with 8kHz sample rate to enable integration with European telephony systems.
  • Fixed quota issues: Fixed a database bug that caused some requests to be mistakenly rejected as exceeding their quota.

Conversational AI

  • Document type filtering: Added support for filtering knowledge base documents by their type (file, URL, or text).
  • Non-audio agents: Added support for conversational agents that don’t output audio but still send response transcripts and can use tools. Non-audio agents can be enabled by removing the audio client event.
  • Improved agent templates: Updated all agent templates with enhanced configurations and prompts. See more about how to improve system prompts here.
  • Fixed stuck exports: Fixed an issue that caused exports to be stuck for extended periods.

Studio

  • Fixed volume normalization: Fixed issue with streaming project snapshots when volume normalization is enabled.

New API endpoints

  • Forced alignment: Added new forced alignment endpoint for aligning audio with text, perfect for subtitle generation.
  • Batch calling: Added batch calling endpoint for scheduling calls to multiple recipients.

API

New Endpoints

Updated Endpoints

Text to Speech

Voices

  • Get voices - Added collection_id parameter for filtering voices by collection

Knowledge Base

  • Get knowledge base - Added types parameter for filtering documents by type
  • General endpoint for creating knowledge base documents marked as deprecated in favor of specialized endpoints

User Subscription

  • Get user subscription - Added professional_voice_slots_used property to track number of professional voices used in a workspace

Conversational AI

  • Added silence_end_call_timeout parameter to set maximum wait time before terminating a call
  • Removed /v1/convai/agents/{agent_id}/add-secret endpoint (now handled by workspace secrets endpoints)