Advancing real-time conversational performance
We’re introducing ElevenLabs-hosted LLMs in our Agents Platform, enabling faster, more capable, and more efficient voice agents.
By hosting open-source models directly within our infrastructure, we deliver ultra-low latency and lower reasoning costs. Customers can now deploy voice agents without relying on additional providers.
Built for reasoning and responsiveness
With GLM 4.5 Air, ElevenLabs Agents achieve top-tier reasoning accuracy and tool-calling performance at roughly one-third the cost of alternatives.
For lighter reasoning tasks, Qwen3-30B-A3B delivers a sub-150 ms time to first sentence, enabling fluid, natural dialogue.
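To make the time-to-first-sentence metric concrete, here is a minimal sketch of how it could be measured against any streamed LLM response. It assumes the metric means the time from sending the request until the first complete sentence has arrived. The function name `time_to_first_sentence`, the `clock` parameter, and the punctuation-based sentence heuristic are all illustrative assumptions, not part of the ElevenLabs API and not necessarily how the figure above was measured.

```python
import re
import time
from typing import Callable, Iterable, Optional, Tuple

# Naive sentence boundary (an assumption for this sketch): ., !, or ?
# followed by whitespace or the end of the text received so far.
# It does not handle decimals ("3.5") or abbreviations ("e.g.").
_SENTENCE_END = re.compile(r"[.!?](?:\s|$)")


def time_to_first_sentence(
    tokens: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[str], float]:
    """Consume a token stream until the first sentence boundary.

    Returns (first_sentence, elapsed_seconds). If the stream ends without
    a boundary, first_sentence is None and elapsed covers the whole stream.
    """
    start = clock()
    buffer = ""
    for token in tokens:
        buffer += token
        match = _SENTENCE_END.search(buffer)
        if match:
            # Stop as soon as a full sentence is available; in a voice
            # agent this is the point where speech synthesis could begin.
            return buffer[: match.end()].strip(), clock() - start
    return None, clock() - start
```

In practice `tokens` would be the iterator returned by a streaming LLM client, and the call would be started at the moment the request is sent so that network and queueing time are included.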
Comparing ElevenLabs-hosted LLMs with state-of-the-art proprietary models
The benefits of co-located architecture
Our hosted LLMs operate alongside proprietary Speech to Text, Text to Speech, and turn-taking models within a single environment. This unified architecture reduces latency, improves reliability, and enhances data security.
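One way to see why co-location helps is a simple latency budget: each stage of a voice turn adds its own compute time plus the network hop needed to reach it. The sketch below is a toy model under that assumption. The `Stage` type, the `response_latency_ms` helper, and every number in it are hypothetical placeholders chosen for illustration, not measurements of ElevenLabs or any other provider.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Stage:
    name: str
    compute_ms: float  # time spent inside the model
    hop_ms: float      # network round trip needed to reach the model


def response_latency_ms(stages: Iterable[Stage]) -> float:
    """Total latency of a sequential pipeline: compute plus hops."""
    return sum(stage.compute_ms + stage.hop_ms for stage in stages)


# Hypothetical placeholder values, NOT measurements.
# Same models in both cases; only the hop to the LLM differs.
EXTERNAL_LLM = [
    Stage("speech_to_text", compute_ms=100, hop_ms=0),
    Stage("llm", compute_ms=200, hop_ms=80),  # call leaves the environment
    Stage("text_to_speech", compute_ms=100, hop_ms=0),
]
CO_LOCATED_LLM = [
    Stage("speech_to_text", compute_ms=100, hop_ms=0),
    Stage("llm", compute_ms=200, hop_ms=5),   # call stays in one environment
    Stage("text_to_speech", compute_ms=100, hop_ms=0),
]

# Difference between the two budgets is exactly the saved hop time.
SAVED_MS = response_latency_ms(EXTERNAL_LLM) - response_latency_ms(CO_LOCATED_LLM)
```

With these placeholder values the compute time is identical in both pipelines, so the entire difference comes from the shorter hop to the LLM. A real pipeline streams between stages, so actual savings depend on how much the stages overlap.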
Try it today
ElevenLabs-hosted LLMs are now available in the Agents Platform.
Learn more about our LLM offering in our docs.