Voices

  • New PVC flow: Added a new flow for Professional Voice Clone (PVC) creation.

Conversational AI

  • Agent-to-agent transfer: Added support for agent-to-agent transfers via a new system tool, enabling more complex conversational flows. See the Agent Transfer tool documentation for details.
  • Enhanced tool debugging: Improved how tool execution details are displayed in the conversation history for easier debugging.
  • Language detection fix: Resolved an issue regarding the forced calling of the language detection tool.

Dubbing

  • Render endpoint: Introduced a new endpoint to regenerate audio or video renders for specific languages within a dubbing project. This automatically handles missing transcriptions or translations. See the Render Dub endpoint.
  • Increased size limit: Raised the maximum allowed file size for dubbing projects to 1 GiB.

API

New Endpoints

Updated Endpoints

Pronunciation Dictionaries

Speech to Text

  • Updated Speech to Text endpoint (POST /v1/speech-to-text):
    • Added cloud_storage_url parameter to allow transcription directly from public S3 or GCS URLs (up to 2GB).
    • Made the file parameter optional; exactly one of file or cloud_storage_url must now be provided.
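
The exactly-one rule above can be checked on the client before a request is sent. The following is a minimal sketch, not SDK code: `build_stt_source` is a hypothetical helper name, and it only validates the two parameters named in this entry without calling the API.

```python
from typing import Optional


def build_stt_source(file_path: Optional[str] = None,
                     cloud_storage_url: Optional[str] = None) -> dict:
    """Return the source field for POST /v1/speech-to-text.

    Hypothetical helper: enforces that exactly one of `file` or
    `cloud_storage_url` is supplied, as this changelog entry requires.
    """
    # True when both are missing or both are present; either case is invalid.
    if (file_path is None) == (cloud_storage_url is None):
        raise ValueError("Provide exactly one of file or cloud_storage_url")
    if cloud_storage_url is not None:
        return {"cloud_storage_url": cloud_storage_url}
    return {"file": file_path}
```

Validating locally gives an immediate error instead of a rejected request; the server remains the authority on what it accepts.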

Speech to Speech

Conversational AI

Voices

AudioNative

Speech to text

  • scribe_v1_experimental: Launched a new experimental preview of the Scribe v1 model. Improvements include better performance on audio files with multiple languages, fewer hallucinations when audio is interleaved with silence, and improved audio tags. The new model is available via the API under the model name scribe_v1_experimental.

Text to speech

  • A-law format support: Added a-law format with 8kHz sample rate to enable integration with European telephony systems.
  • Fixed quota issues: Fixed a database bug that caused some requests to be mistakenly rejected as exceeding their quota.

Conversational AI

  • Document type filtering: Added support for filtering knowledge base documents by their type (file, URL, or text).
  • Non-audio agents: Added support for conversational agents that don’t output audio but still send response transcripts and can use tools. Non-audio agents can be enabled by removing the audio client event.
  • Improved agent templates: Updated all agent templates with enhanced configurations and prompts. See more about how to improve system prompts here.
  • Fixed stuck exports: Fixed an issue that caused exports to be stuck for extended periods.

Studio

  • Fixed volume normalization: Fixed issue with streaming project snapshots when volume normalization is enabled.

New API endpoints

  • Forced alignment: Added new forced alignment endpoint for aligning audio with text, perfect for subtitle generation.
  • Batch calling: Added batch calling endpoint for scheduling calls to multiple recipients.

API

New Endpoints

Updated Endpoints

Text to Speech

Voices

  • Get voices - Added collection_id parameter for filtering voices by collection

Knowledge Base

  • Get knowledge base - Added types parameter for filtering documents by type
  • Deprecated the general endpoint for creating knowledge base documents in favor of specialized endpoints

User Subscription

  • Get user subscription - Added professional_voice_slots_used property to track number of professional voices used in a workspace

Conversational AI

  • Added silence_end_call_timeout parameter to set maximum wait time before terminating a call
  • Removed /v1/convai/agents/{agent_id}/add-secret endpoint (now handled by workspace secrets endpoints)

Text to speech

  • Opus format support: Added support for Opus format with 48kHz sample rate across multiple bitrates (32-192 kbps).
  • Improved websocket error handling: Updated TTS websocket API to return more accurate error codes (1011 for internal errors instead of 1008) for better error identification and SLA monitoring.
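
The change from 1008 to 1011 matters mostly to clients that bucket close codes for monitoring. A minimal sketch of such bucketing follows, assuming the standard RFC 6455 meanings of the codes (1000 normal closure, 1008 policy violation, 1011 internal error). The function name and category labels are illustrative, not part of the TTS websocket API.

```python
def classify_tts_close(code: int) -> str:
    """Map a WebSocket close code to a coarse category for monitoring.

    The codes are standard RFC 6455 values; the labels are illustrative.
    """
    if code == 1011:
        return "server_error"   # internal error: counts against the service
    if code == 1008:
        return "client_error"   # policy violation: treated as request-side here
    if code == 1000:
        return "normal"         # clean shutdown
    return "other"
```

With this split, only the `server_error` bucket would feed an SLA calculation.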

Conversational AI

  • Twilio outbound: Added ability to natively run outbound calls.
  • Post-call webhook override: Added ability to override post-call webhook settings at the agent level, providing more flexible configurations.
  • Large knowledge base document viewing: Enhanced the knowledge base interface to allow viewing the entire content of large RAG documents.
  • Added call SID dynamic variable: Added system__call_sid as a system dynamic variable to allow referencing the call ID in prompts and tools.

Studio

  • Actor Mode: Added Actor Mode in Studio, allowing you to use your own voice recordings to direct the way speech should sound in Studio projects.
  • Improved keyboard shortcuts: Updated the keyboard shortcuts for viewing settings and editor shortcuts to avoid conflicts, and simplified the shortcuts for locking paragraphs.

Dubbing

  • Dubbing duplication: Made dubbing duplication feature available to all users.
  • Manual mode foreground generation: Added ability to generate foreground audio when using manual mode with a file and CSV.

Voices

  • Enhanced voice collections: Improved voice collections with visual upgrades, language-based filtering, navigation breadcrumbs, collection images, and mouse dragging for carousel navigation.
  • Locale filtering: Added locale parameter to shared voices endpoint for more precise voice filtering.

API

Updated Endpoints

Text to Speech

Audio Format

Conversational AI

Voices

  • Updated Voice endpoints:

Dubbing

  • Updated Dubbing endpoint:
    • Dub a video or audio file - Renamed beta feature use_replacement_voices_from_library parameter to disable_voice_cloning for clarity
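
Callers that still send the old parameter name can rename it before building the request. The sketch below is a hypothetical client-side helper; it assumes the value keeps its meaning across the rename, which the entry implies ("for clarity") but does not state outright.

```python
def migrate_dubbing_params(params: dict) -> dict:
    """Return a copy of params with the old beta key renamed.

    Assumption: the boolean keeps the same meaning under the new name.
    """
    old, new = "use_replacement_voices_from_library", "disable_voice_cloning"
    migrated = dict(params)  # leave the caller's dict untouched
    if old in migrated:
        value = migrated.pop(old)
        migrated.setdefault(new, value)  # an explicitly set new key wins
    return migrated
```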

Voices

Conversational AI

  • Native outbound calling: Added native outbound calling for Twilio-configured numbers, eliminating the need for complex setup configurations. Outbound calls are now visible in the Call History page.
  • Automatic language detection: Added new system tool for automatic language detection that enables agents to switch languages based on both explicit user requests (“Let’s talk in Spanish”) and implicit language in user audio.
  • Pronunciation dictionary improvements: Fixed phoneme tags in pronunciation dictionaries to work correctly with conversational AI.
  • Large RAG document viewing: Added ability to view the entire content of large RAG documents in the knowledge base.
  • Customizable widget controls: Updated UI to include an optional mute microphone button and made widget icons customizable via slots.

Sound Effects

  • Fractional duration support: Fixed an issue where users couldn’t enter fractional values (like 0.5 seconds) for sound effect generation duration.

Speech to Text

  • Repetition handling: Improved detection and handling of repetitions in speech-to-text processing.

Studio

  • Reader publishing fixes: Added support for mp3_44100_192 output format (high quality) so users below Publisher tier can export audio to Reader.

Mobile

  • Core app signup: Added signup endpoints for the new Core mobile app.

API

New Endpoints

Updated Endpoints

Conversational AI

  • Updated Conversational AI endpoints:
    • Create agent - Added mic_muting_enabled property for UI control and workspace_overrides property for workspace-specific configurations
    • Update agent - Added workspace_overrides property for customizing agent behavior per workspace
    • Get agent - Added workspace_overrides property to the response
    • Get widget - Added mic_muting_enabled property for controlling microphone muting in the widget UI
    • Get conversation - Added RAG information to view knowledge base content used during conversations
    • Create phone number - Replaced generic structure with specific Twilio phone number and SIP trunk options
    • Compute RAG index - Removed force_reindex query parameter for more controlled indexing
    • List knowledge base documents - Changed response structure to support different document types
    • Get knowledge base document - Modified to return different response models based on document type

Text to Speech

Speech to Text

  • Updated Speech to Text endpoint:
    • Convert speech to text - Removed biased_keywords property from form data and improved internal repetition detection algorithm

Voice Management

Studio

Pronunciation Dictionary

  • Updated Pronunciation Dictionary endpoints:

Conversational AI

  • Default LLM update: Changed the default agent LLM from Gemini 1.5 Flash to Gemini 2.0 Flash for improved performance.
  • Fixed incorrect conversation abandons: Improved detection of conversation continuations, preventing premature abandons when users repeat themselves.
  • Twilio information in history: Added Twilio call details to conversation history for better tracking.
  • Knowledge base redesign: Redesigned the knowledge base interface.
  • System dynamic variables: Added system dynamic variables so that time, conversation ID, caller ID and other system values can be used in prompts and tools.
  • Twilio client initialisation: Added an agent-level override for the conversation initiation client data Twilio webhook.
  • RAG chunks in history: Added the chunks retrieved by RAG to the call transcripts in the history view.

Speech to Text

  • Reduced pricing: Reduced the pricing of our Scribe model.
  • Improved VAD detection: Enhanced Voice Activity Detection with better pause detection at segment boundaries and improved handling of silent segments.
  • Enhanced diarization: Improved speaker clustering with a better ECAPA model, symmetric connectivity matrix, and more selective speaker embedding generation.
  • Fixed ASR bugs: Resolved issues with VAD rounding, silence and clustering that affected transcription accuracy.

Studio

  • Disable publishing UI: Added ability to disable the publishing interface for specific workspace members to support enterprise workflows.
  • Snapshot API improvement: Modified endpoints for project and chapter snapshots to return an empty list instead of throwing errors when snapshots can’t be downloaded.
  • Disabled auto-moderation: Turned off automatic moderation based on Text to Speech generations in Studio.

Workspaces

  • Fixed API key editing: Resolved an issue where editing workspace API keys would reset character limits to zero, causing the keys to stop working.
  • Optimized free subscriptions: Fixed an issue with refreshing free subscription character limits.

API

New Endpoints

Updated Endpoints

Dubbing

Project Management

  • Updated Project endpoints:
    • Add project - Made metadata, project_name, description nullable
    • Create podcast - Made title, description, author nullable
    • Get project - Made last_modified_at, created_at, project_name nullable
    • Add chapter - Made chapter_id, word_count, statistics nullable
    • Update chapter - Made content and blocks properties nullable

Conversational AI

  • Updated Conversational AI endpoints:
    • Update agent - Made conversation_config, platform_settings nullable and added workspace_overrides property
    • Create agent - Made agent_name, prompt, widget_config nullable and added workspace_overrides property
    • Add to knowledge base - Made document_name nullable
    • Get conversation - Added twilio_call_data model and made transcript, metadata nullable

Text to Speech

Voice Management

  • Updated Voice endpoints:

Speech to Text

Other Updates

Conversational AI

  • HIPAA compliance: Conversational AI is now HIPAA compliant on appropriate plans when a BAA is signed, zero-retention mode is enabled, and appropriate LLMs are used. For access, please contact sales.
  • Cascade LLM: Added dynamic dispatch during the LLM step to other LLMs if your default LLM fails. This results in higher latency but prevents the turn from failing.
  • Better error messages: Added better error messages for websocket failures.
  • Audio toggling: Added ability to select only user or agent audio in the conversation playback.
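
The Cascade LLM behavior described above follows a common fallback pattern: try the default model first, then move down a list. The sketch below illustrates that general pattern only. It is not the platform's implementation, and `cascade` and the callables passed to it are hypothetical.

```python
from typing import Callable, Sequence


def cascade(prompt: str, llms: Sequence[Callable[[str], str]]) -> str:
    """Try each LLM callable in order and return the first successful reply."""
    last_error = None
    for call_llm in llms:
        try:
            return call_llm(prompt)
        except Exception as exc:  # a real system would catch narrower errors
            last_error = exc      # remember the failure and try the next model
    raise RuntimeError("all LLMs failed") from last_error
```

Each failed attempt spends time before the next model is tried, which is where the extra latency noted in the entry comes from.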

Scribe

  • HIPAA compliance: Added a zero-retention mode to Scribe for HIPAA compliance.
  • Diarization: Increased time length of audio files that can be transcribed with diarization from 8 minutes to 2 hours.
  • Cheaper pricing: Lowered Scribe’s pricing to as low as $0.22 per hour for the Business tier.
  • Memory usage: Shipped improvements to Scribe’s memory usage.
  • Fixed timestamps: Fixed an issue that was causing incorrect timestamps to be returned.
  • Biased keywords: Added biased keywords to improve Scribe’s performance.

Text to Speech

  • Pronunciation dictionaries: Fixed pronunciation dictionary rule application for replacements that contain symbols.

Dubbing

  • Studio support: Added support for creating dubs with dubbing_studio enabled, allowing for more advanced dubbing workflows beyond one-off dubs.

Voices

  • Verification: Fixed an issue where users on probation could not verify their voice clone.

API

New Endpoints

Updated Endpoints

Studio Projects

Voice Management

  • Updated Voice endpoints with several property changes:

Conversational AI

  • Updated Conversational AI agent endpoints:
    • Update agent - Modified conversation_config, agent, platform_settings, and widget properties
    • Create agent - Modified conversation_config, agent, prompt, platform_settings, widget properties and added shareable_page_show_terms
    • Get agent - Modified conversation_config, agent, platform_settings, and widget properties
    • Get widget - Modified widget_config property and added shareable_page_show_terms

Knowledge Base

Other Updates

Removed Endpoints

  • Temporarily removed Conversational AI tools endpoints:

    • Get tool
    • List tools
    • Update tool
    • Create tool
    • Delete tool

Dubbing

  • Scribe for speech recognition: Dubbing Studio now uses Scribe by default for speech recognition to improve accuracy.

Speech to Text

  • Fixes: Shipped several fixes improving the stability of Speech to Text.

Conversational AI

  • Speed control: Added speed control to an agent’s settings in Conversational AI.
  • Post call webhook: Added the option of sending post-call webhooks after conversations are completed.
  • Improved error messages: Added better error messages to the Conversational AI websocket.
  • Claude 3.7 Sonnet: Added Claude 3.7 Sonnet as a new LLM option in Conversational AI.

API

New Endpoints

Updated Endpoints

  • Added prompt_injectable property to knowledge base endpoints
  • Added name property to Knowledge Base document creation and retrieval endpoints
  • Added speed property to agent creation
  • Removed secrets property from agent endpoints (now handled by dedicated secrets endpoints)
  • Added secret deletion endpoint for removing secrets
  • Removed secrets property from settings endpoints

Speech to Text

  • ElevenLabs launched a new state-of-the-art Speech to Text API available in 99 languages.

Text to Speech

  • Speed control: Added speed control to the Text to Speech API.

Studio

  • Auto-assigned projects: Increased token limits for auto-assigned projects from 1 month’s to 3 months’ worth of tokens, addressing user feedback about working on longer projects.
  • Language detection: Added automatic language detection when generating audio for the first time, with suggestions to switch to Eleven Turbo v2.5 for languages not supported by Multilingual v2 (Hungarian, Norwegian, Vietnamese).
  • Project export: Enhanced project exporting in ElevenReader with better metadata tracking.

Dubbing

  • Clip overlap prevention: Added automatic trimming of overlapping clips in dubbing jobs to ensure clean audio tracks for each speaker and language.

Voice Management

  • Instant Voice Cloning: Improved preview generation for Instant Voice Cloning v2, making previews available immediately.

Conversational AI

  • Agent ownership: Added display of agent creators in the agent list, improving visibility and management of shared agents.

Web app

  • Dark mode: Added dark mode to the web app.

API

Conversational AI

  • Tool calling fix: Fixed an issue where tool calling was not working with agents using gpt-4o mini. This was due to a breaking change in the OpenAI API.
  • Tool calling improvements: Added support for tool calling with dynamic variables inside objects and arrays.
  • Dynamic variables: Fixed an issue where dynamic variables of a conversation were not being displayed correctly.
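
Supporting dynamic variables inside objects and arrays amounts to walking a nested structure and substituting values wherever a string contains a placeholder. The sketch below shows that idea in isolation. The `{{name}}` placeholder syntax and the function name are assumptions made for illustration, and this is not the platform's implementation.

```python
import re
from typing import Any

# Assumed placeholder syntax for this sketch: {{variable_name}}
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def fill_dynamic_variables(value: Any, variables: dict) -> Any:
    """Recursively substitute placeholders in strings, dicts and lists."""
    if isinstance(value, str):
        # Unknown variables are left as-is rather than replaced with blanks.
        return _PLACEHOLDER.sub(
            lambda m: str(variables.get(m.group(1), m.group(0))), value)
    if isinstance(value, dict):
        return {k: fill_dynamic_variables(v, variables) for k, v in value.items()}
    if isinstance(value, list):
        return [fill_dynamic_variables(v, variables) for v in value]
    return value  # numbers, booleans and None pass through unchanged
```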

Voice Isolator

  • Fixed: Resolved an issue that temporarily prevented the voice isolator from working correctly.

Workspace

  • Billing: Improved billing visibility by differentiating rollover, cycle, gifted, and usage-based credits.
  • Usage Analytics: Improved usage analytics load times and readability.
  • Fine grained fiat billing: Added support for customizable pricing based on several factors.

API

  • Added phone_numbers property to Agent responses
  • Added usage metrics to subscription_extras in User endpoint:
    • unused_characters_rolled_over_from_previous_period
    • overused_characters_rolled_over_from_previous_period
    • usage statistics
  • Added enable_conversation_initiation_client_data_from_webhook to Agent creation
  • Updated Agent endpoints with consolidated settings for:
    • platform_settings
    • overrides
    • safety
  • Deprecated with_settings parameter in Voice retrieval endpoint

Conversational AI

Studio

  • GenFM: Updated the create podcast endpoint to accept multiple input sources.
  • GenFM: Fixed an issue where GenFM was creating empty podcasts.

Enterprise

  • New workspace group endpoints: Added new endpoints to manage workspace groups.

API

Studio (formerly Projects)

All /v1/projects/* endpoints have been deprecated in favor of the new /v1/studio/projects/* endpoints. The following endpoints are now deprecated:

  • All operations on /v1/projects/
  • All operations related to chapters, snapshots, and content under /v1/projects/*

Conversational AI

  • POST /v1/convai/add-tool - Use POST /v1/convai/tools instead
  • DELETE /v1/convai/agents/{agent_id} - Response type is no longer an object
  • GET /v1/convai/tools - Response type changed from array to object with a tools property
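
Clients that parse the GET /v1/convai/tools response may want to tolerate both shapes during a migration. The sketch below is a hypothetical helper that operates on already-decoded JSON. Only the `tools` property name comes from the entry above.

```python
def extract_tools(payload) -> list:
    """Return the tool list from either response shape.

    Old shape: a bare JSON array. New shape: an object with a `tools` property.
    """
    if isinstance(payload, list):
        return payload
    if isinstance(payload, dict) and "tools" in payload:
        return payload["tools"]
    raise TypeError("unexpected response shape for GET /v1/convai/tools")
```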

Conversational AI Updates

  • GET /v1/convai/agents/{agent_id} - Updated conversation configuration and agent properties
  • PATCH /v1/convai/agents/{agent_id} - Added use_tool_ids parameter for tool management
  • POST /v1/convai/agents/create - Added tool integration via use_tool_ids

Knowledge Base & Tools

  • GET /v1/convai/agents/{agent_id}/knowledge-base/{documentation_id} - Added name and access_level properties
  • GET /v1/convai/knowledge-base/{documentation_id} - Added name and access_level properties
  • GET /v1/convai/tools/{tool_id} - Added dependent_agents property
  • PATCH /v1/convai/tools/{tool_id} - Added dependent_agents property

GenFM

  • POST /v1/projects/podcast/create - Added support for multiple input sources

Studio (formerly Projects)

New endpoints replacing the deprecated /v1/projects/* endpoints

  • GET /v1/studio/projects: List all projects
  • POST /v1/studio/projects: Create a project
  • GET /v1/studio/projects/{project_id}: Get project details
  • DELETE /v1/studio/projects/{project_id}: Delete a project
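
For the routes listed above, moving off the deprecated endpoints is a prefix change. The sketch below is a hypothetical path rewriter built on that assumption. Routes not listed here (chapters, snapshots, podcast creation) may not map one-to-one and should be checked against the API reference.

```python
OLD_PREFIX = "/v1/projects"
NEW_PREFIX = "/v1/studio/projects"


def migrate_project_path(path: str) -> str:
    """Rewrite a deprecated /v1/projects path to its /v1/studio/projects form."""
    # Match the prefix only on a path-segment boundary.
    if path == OLD_PREFIX or path.startswith(OLD_PREFIX + "/"):
        return NEW_PREFIX + path[len(OLD_PREFIX):]
    return path  # unrelated or already-migrated paths are returned unchanged
```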

Knowledge Base Management

  • GET /v1/convai/knowledge-base: List all knowledge base documents
  • DELETE /v1/convai/knowledge-base/{documentation_id}: Delete a knowledge base document
  • GET /v1/convai/knowledge-base/{documentation_id}/dependent-agents: List agents using this knowledge base document

Workspace Groups - New enterprise features for team management

  • GET /v1/workspace/groups/search: Search workspace groups
  • POST /v1/workspace/groups/{group_id}/members: Add members to a group
  • POST /v1/workspace/groups/{group_id}/members/remove: Remove members from a group

Tools

  • POST /v1/convai/tools: Create new tools for agents

Socials

  • ElevenLabs Developers: Follow our new developers account on X @ElevenLabsDevs