Text to Voice
- Voice Design: Launched new Text to Voice Design with Eleven v3 for creating custom voices from text descriptions.
Speech to Text
- Enhanced Diarization: Added the `diarization_threshold` parameter to the Speech to Text endpoint. Adjust the threshold between 0.1 and 0.4 to fine-tune the balance between speaker accuracy and total speaker count.
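
A minimal sketch of how a client might validate the threshold before sending a transcription request. The helper name `build_diarization_options` and the `diarize` flag are assumptions made for illustration, as is treating both ends of the 0.1 to 0.4 range as inclusive; check the API reference for the exact request fields.

```python
def build_diarization_options(threshold: float) -> dict:
    """Return request fields for a diarized transcription (hypothetical helper)."""
    # The changelog gives 0.1 to 0.4 as the adjustable range; treating both
    # bounds as inclusive is an assumption.
    if not 0.1 <= threshold <= 0.4:
        raise ValueError(
            f"diarization_threshold must be between 0.1 and 0.4, got {threshold}"
        )
    return {
        "diarize": True,  # assumed name of the flag that enables diarization
        "diarization_threshold": threshold,
    }


options = build_diarization_options(0.25)
```

Rejecting out-of-range values on the client avoids a round trip that would presumably fail server-side validation.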
Professional Voice Cloning
- Background Noise Removal: Added `remove_background_noise` to clean up voice samples using audio isolation models, producing better-quality training data.
Studio
- Video Support Detection: Added the `has_video` property to chapter responses to indicate whether a chapter contains video content.
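
As an illustration, a client could use the new property to pick out chapters that contain video. The sample data below is made up, and every field other than `has_video` (here `chapter_id` and `name`) is an assumption about the response shape.

```python
def chapters_with_video(chapters: list[dict]) -> list[dict]:
    """Return only the chapters whose has_video property is true."""
    # Treat a missing has_video key as False so that responses without
    # the property are handled without raising KeyError.
    return [chapter for chapter in chapters if chapter.get("has_video", False)]


# Hypothetical chapter responses, for illustration only.
sample_chapters = [
    {"chapter_id": "ch_1", "name": "Intro", "has_video": True},
    {"chapter_id": "ch_2", "name": "Narration", "has_video": False},
    {"chapter_id": "ch_3", "name": "Legacy chapter"},
]

video_chapters = chapters_with_video(sample_chapters)
```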
Workspaces
- Service Account Groups: Service accounts can now be added to workspace groups for better permission management and access control.
- Workspace Authentication: Added support for workspace authentication connections, enabling secure webhook tool integrations with external services.
SDKs
- Python SDK: Released v2.6.0 with latest API support and bug fixes.
- JavaScript SDK: Released v2.5.0 with latest API support and bug fixes.
- React Conversational AI SDK: Added WebRTC support in v0.2.0.
API
New Endpoints
- Added 2 new endpoints:
  - Design a Voice - Create voice previews from text descriptions
  - Create Voice From Preview - Convert voice previews to permanent voices
Updated Endpoints
Speech to Text
- Convert speech to text - Added `diarization_threshold` parameter for fine-tuning speaker separation
Voice Management
- Get voice sample audio - Added `remove_background_noise` query parameter and moved from request body to query parameters
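
Because `remove_background_noise` is now passed in the query string, a client would append it to the request URL instead of placing it in the body. The sketch below uses only the Python standard library; the base URL is a placeholder, not the real endpoint path, and serializing the boolean as lowercase `true`/`false` is an assumption.

```python
from urllib.parse import urlencode


def sample_audio_url(base_url: str, remove_background_noise: bool) -> str:
    """Append remove_background_noise to a URL as a query parameter."""
    # Lowercase "true"/"false" is an assumed serialization of the boolean.
    query = urlencode(
        {"remove_background_noise": str(remove_background_noise).lower()}
    )
    return f"{base_url}?{query}"


# Placeholder URL for illustration; substitute the actual endpoint path.
url = sample_audio_url("https://api.example.com/samples/abc/audio", True)
```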