Scribe v2 Realtime

Scribe v2 Realtime, our fastest and most accurate live speech recognition model, has officially launched. It delivers state-of-the-art accuracy in over 92 languages with an ultra-low latency of 150 ms.

Agents Platform

  • Widget end-call feedback: Widgets now support customizable end-of-call feedback collection with optional rating and comment fields, enabling you to gather user feedback directly within your agent conversations.
  • Soft timeout configuration: Added comprehensive soft timeout settings to control how agents handle pauses in conversation, with configurable prompts and behavior across turn and conversation levels.
  • Advanced conversation filtering: The conversations endpoint now supports filtering by call duration range, evaluation parameters, data collection parameters, and specific tool names.
  • Enhanced conversation feedback: Updated conversation feedback system with structured feedback types including ratings and comments for better user experience tracking.

Studio

  • Extended project metadata: Projects now include comprehensive asset tracking with video thumbnails, external audio references, and enhanced snapshot information including audio duration.

Billing

  • Enhanced invoice details: Invoice responses now include detailed discount information in a new discounts array, which replaces the deprecated discount_percent_off and discount_amount_off fields. New subtotal_cents and tax_cents fields break the invoice down into its pre-tax subtotal and tax amount.

Music API

  • Stem separation latency improvements: The stem separation endpoint has been updated to better handle audio files and provide more predictable latency for processing requests.

SDK Releases

JavaScript SDK

  • v2.23.0 - Updated Speech to Text (Scribe) endpoint to scribe_realtime and exported additional TypeScript types for improved developer experience.
  • v2.22.0 - Updated with latest API schema changes including conversation filtering, soft timeout configuration, and widget feedback features.

Python SDK

  • v2.22.1 - Updated Speech to Text (Scribe) endpoint to scribe_realtime and fixed circular import issues for improved stability.
  • v2.22.0 - Updated with latest API schema changes including conversation filtering, soft timeout configuration, and widget feedback features.

Agents Packages

Android SDK

  • v0.5.0 - Added setVolume and getVolume functions for programmatic audio control, with example implementation including a volume seek bar in the sample app.

API

Updated Endpoints

Agents Platform

Conversation Management

  • Get conversations

    • Added call_duration_min_secs query parameter (integer) to filter conversations by minimum call duration
    • Added call_duration_max_secs query parameter (integer) to filter conversations by maximum call duration
    • Added evaluation_params query parameter (array of strings) for filtering by evaluation criteria
    • Added data_collection_params query parameter (array of strings) for filtering by data collection parameters
    • Added tool_names query parameter (array of strings) to filter conversations that used specific tools
  • Provide conversation feedback

    • Now uses ConversationFeedbackRequestModel with structured feedback types
    • Added type field to specify feedback category
    • Added rating field for numeric feedback
    • Added comment field for text feedback
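
As a rough illustration of the new Get conversations filters listed above, the following sketch builds a query string from them using only the Python standard library. The base URL is a placeholder, the helper name build_conversations_url is hypothetical, and encoding array parameters as repeated keys is an assumption rather than something this changelog confirms; check the API reference before relying on either.

```python
from urllib.parse import urlencode

# Placeholder URL for illustration only; not the documented endpoint path.
BASE_URL = "https://api.example.com/v1/convai/conversations"


def build_conversations_url(
    min_secs=None,
    max_secs=None,
    evaluation_params=None,
    data_collection_params=None,
    tool_names=None,
):
    """Build a Get conversations URL from the new filter parameters."""
    if min_secs is not None and max_secs is not None and min_secs > max_secs:
        raise ValueError("call_duration_min_secs must not exceed call_duration_max_secs")

    candidates = {
        "call_duration_min_secs": min_secs,
        "call_duration_max_secs": max_secs,
        "evaluation_params": evaluation_params,
        "data_collection_params": data_collection_params,
        "tool_names": tool_names,
    }
    # Drop filters that were not set so they are not sent at all.
    query = {k: v for k, v in candidates.items() if v not in (None, [])}
    if not query:
        return BASE_URL
    # doseq=True repeats the key once per list item (assumed array encoding).
    return f"{BASE_URL}?{urlencode(query, doseq=True)}"


url = build_conversations_url(min_secs=30, max_secs=600, tool_names=["search", "transfer"])
print(url)
```

Validating the duration range on the client side is a design choice of this sketch; the changelog does not say how the server treats a minimum greater than the maximum.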

Agent Configuration

  • Get agent widget
    • Added end_feedback field with WidgetEndFeedbackConfig for configuring end-of-call feedback collection
    • Supports optional rating and comment fields in feedback forms
  • Get agent & Update agent
    • Response schema updated to include new platform configuration options
    • Request schema enhanced with additional settings fields

Turn and Conversation Configuration

  • Create agent, Simulate conversation, Simulate conversation stream

    • Added soft_timeout_config field for controlling pause behavior during conversations
    • Added turn configuration overrides via TurnConfigOverride and TurnConfigOverrideConfig
    • Added initial_wait_time to TurnConfig for controlling initial response timing
    • Configuration available at conversation and workflow override levels

Testing

  • Run agent tests

    • Request schema enhanced for workflow expression support
    • Response schema updated with additional test result fields
  • Get test invocation

    • Response includes enhanced test result data
    • Updated schema for workflow expression results

Projects and Studio

  • Get project

    • Added assets field (required) containing videos and external audio references
    • Introduced ProjectVideoResponseModel for video asset metadata
    • Introduced ProjectExternalAudioResponseModel for external audio tracking
    • Introduced ProjectVideoThumbnailSheetResponseModel for video thumbnails
  • Get project snapshot

    • Response restructured with new ProjectSnapshotExtendedResponseModel
    • Added audio_duration_secs field (required) to snapshot data
    • Removed character_alignments from ProjectSnapshotResponseModel
    • Introduced ProjectSnapshotsResponseModel for snapshot collections
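
Client code that reads snapshots may need to tolerate the schema changes above. The sketch below assumes the snapshot has already been parsed into a plain dict; the helper name snapshot_summary and the returned keys are hypothetical. It reads the now-required audio_duration_secs and treats character_alignments as optional, since that field was removed from ProjectSnapshotResponseModel.

```python
def snapshot_summary(snapshot: dict) -> dict:
    """Summarize a parsed snapshot without depending on removed fields."""
    # audio_duration_secs is listed as required, so a missing key is an error.
    duration = float(snapshot["audio_duration_secs"])
    # character_alignments may be absent after this release; do not index it directly.
    alignments = snapshot.get("character_alignments")
    return {
        "audio_duration_secs": duration,
        "has_character_alignments": alignments is not None,
    }


print(snapshot_summary({"audio_duration_secs": 12.5}))
```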

Billing

  • Get subscription
    • Added discounts array (required) with DiscountResponseModel entries
    • Deprecated discount_percent_off field (still available but marked for removal)
    • Deprecated discount_amount_off field (still available but marked for removal)
    • Added subtotal_cents field (integer, nullable) for pre-tax invoice totals
    • Added tax_cents field (integer, nullable) for tax amounts
    • Updated invoice examples to include new discount and total fields
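
One way to migrate off the deprecated discount fields is to prefer the new discounts array and fall back to the old fields only when the array is missing or empty. The sketch below assumes an invoice already parsed into a dict. The helper names, the shape of the fallback entries, and the idea that subtotal plus tax is a meaningful sum are all assumptions made for illustration; the fields inside DiscountResponseModel are not described in this changelog, so entries are passed through untouched.

```python
def invoice_discounts(invoice: dict) -> list:
    """Return discount entries, preferring the new discounts array."""
    discounts = invoice.get("discounts")
    if discounts:
        # Pass entries through unchanged; their inner fields are not assumed here.
        return list(discounts)

    # Fallback for responses that only carry the deprecated fields.
    # The "kind"/"value" entry shape is hypothetical.
    legacy = []
    if invoice.get("discount_percent_off") is not None:
        legacy.append({"kind": "percent_off", "value": invoice["discount_percent_off"]})
    if invoice.get("discount_amount_off") is not None:
        legacy.append({"kind": "amount_off", "value": invoice["discount_amount_off"]})
    return legacy


def subtotal_plus_tax_cents(invoice: dict):
    """Sum subtotal_cents and tax_cents, or return None if either is null."""
    subtotal = invoice.get("subtotal_cents")
    tax = invoice.get("tax_cents")
    if subtotal is None or tax is None:
        return None
    return subtotal + tax


old_style = {"discounts": [], "discount_percent_off": 20, "discount_amount_off": None,
             "subtotal_cents": 1000, "tax_cents": 80}
print(invoice_discounts(old_style), subtotal_plus_tax_cents(old_style))
```

Because subtotal_cents and tax_cents are nullable, the sum helper returns None instead of guessing a value when either one is absent.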

Language Presets

  • Language preset models now include soft_timeout_translation field for localized soft timeout messages

API Integration Triggers

  • Registered new API collection: convai_api_integration_trigger_connections

Deprecations

  • The invoice fields discount_percent_off and discount_amount_off are deprecated; use the discounts array instead