
Introducing Experiments in ElevenAgents
The most data-driven way to improve real-world agent performance.
We've rolled out multi-region serving for our Text to Speech API. Requests now automatically route to the nearest backend (US, Netherlands, or Singapore), delivering faster time to first byte (TTFB) with no code changes required.
When you call api.elevenlabs.io, our infrastructure routes the request to the optimal backend for your location.
You can verify your serving region via the x-region header in the API response.
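A minimal sketch of that check in Python, assuming you already have the response headers from whichever HTTP client you use. The helper name serving_region and the sample header value are hypothetical; only the x-region header name comes from the text above.

```python
from typing import Mapping, Optional


def serving_region(headers: Mapping[str, str]) -> Optional[str]:
    """Return the x-region header value, or None if the header is absent.

    HTTP header names are case-insensitive, so compare in lowercase.
    """
    for name, value in headers.items():
        if name.lower() == "x-region":
            return value
    return None


if __name__ == "__main__":
    # Hypothetical headers for illustration; real values may look different.
    sample = {"Content-Type": "audio/mpeg", "X-Region": "example-region"}
    print(serving_region(sample))
```

With urllib.request, the headers attribute of the response returned by urlopen exposes items() and can be passed to this helper directly.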
With upgraded GPUs and an optimized inference stack, Flash v2.5 achieves a 50ms model time to first byte. Network routing improvements on top of that lead to large reductions in perceived latency.
We measured TTFB improvements across 11 global locations.
For most international developers, this represents a 20-40% reduction in perceived latency.
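To check the effect from your own location, you can time how long a request takes to return its first byte. The sketch below is one way to do that, not an official benchmark method: measure_ttfb and percent_reduction are hypothetical helper names, and the numbers in the demo are made up for illustration.

```python
import io
import time
from typing import BinaryIO, Callable


def measure_ttfb(open_stream: Callable[[], BinaryIO]) -> float:
    """Return seconds from starting the request until the first body byte is read.

    open_stream is any zero-argument callable that issues the request and
    returns a readable binary stream usable as a context manager.
    """
    start = time.perf_counter()
    with open_stream() as stream:
        stream.read(1)  # blocks until the first byte is available
    return time.perf_counter() - start


def percent_reduction(before_ms: float, after_ms: float) -> float:
    """Return the relative latency reduction as a percentage of the old value."""
    return 100.0 * (before_ms - after_ms) / before_ms


if __name__ == "__main__":
    # In-memory stand-in for a network response, so the sketch runs offline.
    elapsed = measure_ttfb(lambda: io.BytesIO(b"audio-bytes"))
    print(f"TTFB: {elapsed * 1000:.3f} ms")
    # Made-up figures: going from 500 ms to 350 ms is a 30% reduction.
    print(percent_reduction(500, 350))
```

For a real measurement, open_stream could wrap urllib.request.urlopen on a prepared request. Because timing starts before the call, the result includes connection setup as well as model time.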
For voice agents and real-time applications, 150ms less latency means more natural conversations, better responsiveness, and a consistent experience for users regardless of geography. Combined with Flash v2.5's inference speed, this is the fastest agentic Text to Speech available.
No migration needed. If you're calling api.elevenlabs.io, global routing is already active.
To opt out of global routing and always use US servers, use the api.us.elevenlabs.io base URL for your API requests.
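One way to make that choice configurable is a small helper that picks the base URL. This is a sketch under assumptions: the two hostnames come from the text above, while the https scheme, the function names, the pin_to_us flag, and the example path are illustrative.

```python
# Hostnames as named in this post; the https scheme is assumed.
GLOBAL_BASE_URL = "https://api.elevenlabs.io"
US_BASE_URL = "https://api.us.elevenlabs.io"


def base_url(pin_to_us: bool = False) -> str:
    """Return the US-only base URL when pinned, otherwise the globally routed one."""
    return US_BASE_URL if pin_to_us else GLOBAL_BASE_URL


def endpoint(path: str, pin_to_us: bool = False) -> str:
    """Join a request path onto the selected base URL with exactly one slash."""
    return base_url(pin_to_us) + "/" + path.lstrip("/")


if __name__ == "__main__":
    # "/v1/example" is a placeholder path, not a documented endpoint.
    print(endpoint("/v1/example"))
    print(endpoint("/v1/example", pin_to_us=True))
```

Keeping the base URL in one place means switching between global routing and US-only serving is a single configuration change.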
See our latency optimization guide for additional best practices. Enterprise customers requiring regional data residency can contact sales.
