Professional Voice Cloning (PVC)
- PVC API: Introduced a comprehensive suite of API endpoints for managing Professional Voice Clones (PVC). You can now programmatically create voices, add, manage, and delete audio samples, retrieve audio and waveforms, manage speaker separation, handle verification, and initiate training. For a full list of new endpoints, check the API changes summary below or read the PVC API reference here.
Speech to Text
- Enhanced Export Options: Added options to include or exclude timestamps and speaker IDs when exporting Speech to Text results in segmented JSON format via the API.
Conversational AI
- New LLM Models: Added support for new GPT-4.1 models: gpt-4.1, gpt-4.1-mini, and gpt-4.1-nano here
- VAD Score: Added a new client event which sends VAD scores to the client, see reference here
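As a rough illustration of consuming the new VAD score client event, the sketch below parses an incoming JSON message and pulls out the score. The payload shape (a type of vad_score with a nested vad_score_event.vad_score value), the helper names, and the 0.5 threshold are all assumptions made for this example; the client events reference is the place to confirm the actual format.

```python
import json
from typing import Optional


def extract_vad_score(raw_message: str) -> Optional[float]:
    """Return the VAD score from a client event, or None for other events.

    Assumed payload shape (not confirmed by this changelog):
    {"type": "vad_score", "vad_score_event": {"vad_score": 0.87}}
    """
    event = json.loads(raw_message)
    if event.get("type") != "vad_score":
        return None
    score = event.get("vad_score_event", {}).get("vad_score")
    return float(score) if score is not None else None


def is_speech(score: float, threshold: float = 0.5) -> bool:
    """Treat scores at or above the (hypothetical) threshold as speech."""
    return score >= threshold
```

A client could call extract_vad_score on every incoming message and ignore the None results, so that unrelated event types pass through untouched.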
Workspace
- Member Management: Added a new API endpoint to allow administrators to delete workspace members here
API
View API changes
New Endpoints
- Added 16 new endpoints:
- Delete Member - Allows deleting workspace members.
- Create PVC Voice - Creates a new PVC voice.
- Edit PVC Voice - Edits PVC voice metadata.
- Add Samples To PVC Voice - Adds audio samples to a PVC voice.
- Update PVC Voice Sample - Updates a PVC voice sample (noise removal, speaker selection, trimming).
- Delete PVC Voice Sample - Deletes a sample from a PVC voice.
- Retrieve Voice Sample Audio - Retrieves audio for a PVC voice sample.
- Retrieve Voice Sample Visual Waveform - Retrieves the visual waveform for a PVC voice sample.
- Retrieve Speaker Separation Status - Gets the status of speaker separation for a sample.
- Start Speaker Separation - Initiates speaker separation for a sample.
- Retrieve Separated Speaker Audio - Retrieves audio for a specific separated speaker.
- Get PVC Voice Captcha - Gets the captcha for PVC voice verification.
- Verify PVC Voice Captcha - Submits captcha verification for a PVC voice.
- Run PVC Training - Starts the training process for a PVC voice.
- Request Manual Verification - Requests manual verification for a PVC voice.
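One way the PVC endpoints listed above might be chained is sketched below. The endpoint names come from the list; the ordering, the call adapter, and the "ok" result flag are hypothetical and only illustrate how a script could drive the flow from voice creation to training. No request paths are shown because this changelog does not list them.

```python
from typing import Callable, Dict, List

# Endpoint names are taken from the changelog; the order is an assumption.
PVC_WORKFLOW: List[str] = [
    "Create PVC Voice",
    "Add Samples To PVC Voice",
    "Start Speaker Separation",
    "Get PVC Voice Captcha",
    "Verify PVC Voice Captcha",
    "Run PVC Training",
]


def run_pvc_workflow(call: Callable[[str], Dict]) -> List[str]:
    """Invoke each step in order and return the steps that succeeded.

    `call` is a hypothetical adapter that maps an endpoint name to a real
    HTTP request and returns a dict with an "ok" flag. Execution stops at
    the first step that does not report success.
    """
    completed: List[str] = []
    for step in PVC_WORKFLOW:
        result = call(step)
        if not result.get("ok", False):
            break
        completed.append(step)
    return completed


if __name__ == "__main__":
    # Stub adapter so the sketch runs without network access.
    print(run_pvc_workflow(lambda step: {"ok": True}))
```

Keeping the HTTP details behind the adapter means the ordering logic can be tested offline with a stub, as in the final lines.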
Updated Endpoints
Speech to Text
- Updated endpoint with changes:
- Create Forced Alignment Task - Added enabled_spooled_file parameter to allow streaming large files (POST /v1/forced-alignment).
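A minimal sketch of preparing a forced alignment request with the new parameter follows. Only the path POST /v1/forced-alignment and the parameter name enabled_spooled_file come from this changelog. The base URL, the xi-api-key header, the text form field, and the "true"/"false" string encoding are assumptions. The function only builds the request pieces and sends nothing.

```python
from typing import Dict, Tuple

API_BASE = "https://api.elevenlabs.io"  # assumed base URL


def build_forced_alignment_request(
    api_key: str, transcript: str, spooled: bool = True
) -> Tuple[str, Dict[str, str], Dict[str, str]]:
    """Return (url, headers, form_fields) for POST /v1/forced-alignment.

    The audio file itself would be attached as a multipart part by
    whichever HTTP client sends the request.
    """
    url = f"{API_BASE}/v1/forced-alignment"
    headers = {"xi-api-key": api_key}  # assumed auth header name
    form_fields = {
        "text": transcript,  # assumed field name for the transcript
        "enabled_spooled_file": "true" if spooled else "false",
    }
    return url, headers, form_fields
```

The returned tuple can be handed to an HTTP client of your choice together with the audio file.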
Schema Changes
Conversational AI
GET conversation details: Added has_audio, has_user_audio, has_response_audio boolean fields here
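The new boolean fields could be used to decide which audio to request before making a download call. The sketch below assumes the conversation details response has already been parsed into a dict. The function name, the track labels, and the treatment of missing fields as False are illustrative assumptions.

```python
from typing import Dict, List


def available_audio_tracks(conversation: Dict) -> List[str]:
    """List which audio tracks a conversation details response reports.

    Field names come from the changelog; missing fields are treated as
    False, which is an assumption rather than documented behavior.
    """
    flags = {
        "full": "has_audio",
        "user": "has_user_audio",
        "response": "has_response_audio",
    }
    return [label for label, field in flags.items() if conversation.get(field, False)]
```

For example, a response with has_audio and has_response_audio set to true and has_user_audio set to false yields ["full", "response"].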
Dubbing
GET dubbing resource: Added status field to each render here