Changelog

Professional Voice Cloning (PVC)

PVC API: Introduced a comprehensive suite of API endpoints for managing Professional Voice Clones (PVC). You can now programmatically create voices, add, manage, and delete audio samples, retrieve audio and waveforms, manage speaker separation, handle verification, and initiate training. For a full list of the new endpoints, see the API changes summary below or the PVC API reference.

Speech to Text

Enhanced Export Options: Added options to include or exclude timestamps and speaker IDs when exporting Speech to Text results in segmented JSON format via the API.
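As a rough illustration of the export options described above, the sketch below builds a single export-format entry for a segmented JSON export. The key names `format`, `include_timestamps`, and `include_speakers`, and the helper name `segmented_json_export`, are assumptions made for illustration and are not stated in this changelog; check the Speech to Text API reference for the actual request schema.

```python
import json


def segmented_json_export(include_timestamps: bool = True,
                          include_speakers: bool = True) -> dict:
    """Build one export-format entry for a segmented JSON export.

    All key names here are assumed for illustration, not taken
    from the changelog.
    """
    return {
        "format": "segmented_json",
        "include_timestamps": include_timestamps,
        "include_speakers": include_speakers,
    }


# Example: keep timestamps but leave speaker IDs out of the export.
export = segmented_json_export(include_speakers=False)
print(json.dumps([export]))
```

Defaulting both flags to True keeps the sketch close to an "include everything unless told otherwise" reading of the feature; the real defaults may differ.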

Conversational AI

New LLM Models: Added support for the new GPT-4.1 models: gpt-4.1, gpt-4.1-mini, and gpt-4.1-nano.
VAD Score: Added a new client event that sends VAD (voice activity detection) scores to the client.
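A client might handle the new VAD score event roughly as follows. This is a minimal sketch: the event shape used here (a `type` of `vad_score` with the value nested under `vad_score_event.vad_score`) and the helper name `extract_vad_score` are assumptions for illustration, since this changelog does not give the payload format.

```python
import json
from typing import Optional


def extract_vad_score(raw_message: str) -> Optional[float]:
    """Return the VAD score from a client event, or None for other events.

    Assumed event shape (not confirmed by the changelog):
    {"type": "vad_score", "vad_score_event": {"vad_score": 0.87}}
    """
    event = json.loads(raw_message)
    if event.get("type") != "vad_score":
        return None
    payload = event.get("vad_score_event") or {}
    score = payload.get("vad_score")
    return float(score) if score is not None else None


# Hypothetical message as it might arrive over the conversation socket.
sample = '{"type": "vad_score", "vad_score_event": {"vad_score": 0.87}}'
print(extract_vad_score(sample))
```

Returning None for unrelated events lets the same handler sit in a general message loop without raising on other event types.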

Workspace

Member Management: Added a new API endpoint that allows administrators to delete workspace members.

API

New Endpoints

Added 16 new endpoints:
- Delete Member - Allows deleting workspace members.
- Create PVC Voice - Creates a new PVC voice.
- Edit PVC Voice - Edits PVC voice metadata.
- Add Samples To PVC Voice - Adds audio samples to a PVC voice.
- Update PVC Voice Sample - Updates a PVC voice sample (noise removal, speaker selection, trimming).
- Delete PVC Voice Sample - Deletes a sample from a PVC voice.
- Retrieve Voice Sample Audio - Retrieves audio for a PVC voice sample.
- Retrieve Voice Sample Visual Waveform - Retrieves the visual waveform for a PVC voice sample.
- Retrieve Speaker Separation Status - Gets the status of speaker separation for a sample.
- Start Speaker Separation - Initiates speaker separation for a sample.
- Retrieve Separated Speaker Audio - Retrieves audio for a specific separated speaker.
- Get PVC Voice Captcha - Gets the captcha for PVC voice verification.
- Verify PVC Voice Captcha - Submits captcha verification for a PVC voice.
- Run PVC Training - Starts the training process for a PVC voice.
- Request Manual Verification - Requests manual verification for a PVC voice.

Updated Endpoints

Speech to Text

Updated endpoint with changes:
- Create Forced Alignment Task - Added enabled_spooled_file parameter to allow streaming large files (POST /v1/forced-alignment).
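The updated request could be assembled along these lines. Only the path `POST /v1/forced-alignment` and the `enabled_spooled_file` parameter come from this changelog; the base URL, the `xi-api-key` header, the `text` form field, and the helper name `forced_alignment_request` are assumptions for illustration. The sketch describes the request without sending it.

```python
API_BASE = "https://api.elevenlabs.io"  # assumed base URL


def forced_alignment_request(api_key: str, text: str,
                             enabled_spooled_file: bool = False) -> dict:
    """Describe a POST /v1/forced-alignment call without sending it.

    The header name and the `text` field are assumptions; only the
    path and `enabled_spooled_file` appear in the changelog.
    """
    return {
        "method": "POST",
        "url": f"{API_BASE}/v1/forced-alignment",
        "headers": {"xi-api-key": api_key},
        # Form fields travel as strings, so the flag is encoded as text.
        "data": {
            "text": text,
            "enabled_spooled_file": "true" if enabled_spooled_file else "false",
        },
    }


req = forced_alignment_request("YOUR_API_KEY", "Hello world",
                               enabled_spooled_file=True)
print(req["url"], req["data"]["enabled_spooled_file"])
```

In a real call, the audio file would be attached as a multipart upload by whichever HTTP client you use, alongside these form fields.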

Schema Changes

Conversational AI

GET conversation details: Added has_audio, has_user_audio, and has_response_audio boolean fields.
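One way to consume the new boolean fields is sketched below. The three field names come from the entry above; the payload is a hypothetical example, the helper name `audio_flags` is invented for illustration, and treating a missing field as False is an assumption about older responses.

```python
AUDIO_FIELDS = ("has_audio", "has_user_audio", "has_response_audio")


def audio_flags(conversation: dict) -> dict:
    """Read the three audio flags from a conversation details payload.

    Missing fields are treated as False (an assumption).
    """
    return {name: bool(conversation.get(name, False)) for name in AUDIO_FIELDS}


# Hypothetical payload; real responses contain many more fields.
details = {"has_audio": True, "has_user_audio": True, "has_response_audio": False}
flags = audio_flags(details)

# Only bother requesting audio when the conversation reports having any.
if flags["has_audio"]:
    print("audio available:", flags)
```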

Dubbing

GET dubbing resource: Added a status field to each render.