
SIP Upgrades for ElevenLabs Agents
Launch reliable, secure AI agents while maintaining your existing phone numbers, routing logic, and infrastructure.
Our ultra-low latency streaming Speech to Text model optimized for agentic use cases is now live in Agents Platform.
This week, we introduced Scribe v2 Realtime, our ultra-low latency streaming Speech to Text model, optimized for agentic use cases that depend on speed, accuracy, and conversational precision.
Scribe v2 Realtime transcribes speech in under 150ms with state-of-the-art accuracy, enabling agents to respond as naturally as humans do in conversation.
Most Speech to Text systems perform well in clean test environments but struggle under real-world conditions: noisy backgrounds, diverse accents, and identifiers such as names, emails, and IDs.
Scribe v2 Realtime was trained to handle exactly these challenges.
In internal benchmarks across hundreds of challenging English conversation samples featuring poor audio quality, diverse accents, and filler words, Scribe v2 Realtime captured user intent more accurately than any competing real-time ASR model.
Below are a couple of real-world examples, recorded in different environments, that we used to test Scribe v2 Realtime's transcription accuracy.
We are a global company, and a large share of our agents are deployed in Spanish, Portuguese, Hindi, and many other languages, so it was critical that Scribe v2 Realtime maintain state-of-the-art performance across regions.
On the FLEURS multilingual benchmark, which measures accuracy across 30 languages, Scribe v2 Realtime achieved the lowest Word Error Rate (WER) of any low-latency ASR model.
This allows enterprises to launch multilingual agents that respond instantly and accurately, without compromising on speed or precision.
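For readers unfamiliar with the metric, Word Error Rate is conventionally defined as the number of word-level substitutions, deletions, and insertions needed to turn the model's transcript into the reference transcript, divided by the number of words in the reference. The sketch below illustrates that standard definition in Python. It is not the evaluation code behind the FLEURS results above; the function name `word_error_rate` and the simple normalization (lowercasing and whitespace splitting) are assumptions made for illustration, and real benchmarks typically apply more careful text normalization.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference length.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace. Published benchmarks usually normalize more thoroughly.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # Word-level Levenshtein distance, keeping one row of the table at a time.
    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion
                    curr[j - 1] + 1,                  # insertion
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word ("a") and one substituted word ("two" -> "too")
    # against a five-word reference.
    print(word_error_rate("book a table for two", "book table for too"))
```

A lower value means fewer transcription errors. Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.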
Scribe v2 Realtime is fully integrated into ElevenLabs Agents and can be enabled under the Advanced configuration section.


