
Our AI agents can now process speech and text inputs simultaneously, leading to more natural, efficient, and resilient user interactions.
Today, ElevenLabs is excited to announce a significant enhancement to our Conversational AI platform: the introduction of true text and voice multimodality. Our AI agents can now understand and process both spoken language and typed text inputs concurrently. This capability is designed to create more natural, flexible, and effective interactions for a wide range of use cases.
While voice offers a powerful and intuitive means of communication, voice-only AI agents can encounter challenges in certain situations. We have observed common failure modes in business deployments, such as:
By enabling agents to process both text and voice, we empower users to choose the input method best suited to the information they need to convey. This hybrid approach allows for smoother, more robust conversations. Users can speak naturally and then, when precision is paramount or typing is more convenient, seamlessly switch to text input within the same interaction.
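To make the hybrid approach concrete, here is a minimal sketch of how a single interaction might hold both spoken and typed turns in one ordered history. It is purely illustrative and does not use the ElevenLabs SDK or API: the `UserInput` and `Conversation` classes, their method names, and the sample email address are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Literal

# Hypothetical modality tag; not an ElevenLabs type.
Modality = Literal["voice", "text"]


@dataclass
class UserInput:
    modality: Modality  # how the user supplied this turn
    content: str        # transcript (voice) or typed characters (text)


@dataclass
class Conversation:
    """Hypothetical container for one multimodal interaction."""
    turns: List[UserInput] = field(default_factory=list)

    def add_voice(self, transcript: str) -> None:
        # Spoken input arrives as a speech-to-text transcript.
        self.turns.append(UserInput("voice", transcript.strip()))

    def add_text(self, typed: str) -> None:
        # Typed input is stored verbatim so exact strings survive.
        self.turns.append(UserInput("text", typed))

    def history(self) -> List[str]:
        # The agent sees one ordered history, regardless of modality.
        return [f"[{t.modality}] {t.content}" for t in self.turns]


conversation = Conversation()
conversation.add_voice("I'd like to update the email on my account. ")
conversation.add_text("jane.doe+billing@example.com")  # precision matters here
conversation.add_voice("That's the one, thanks.")

for line in conversation.history():
    print(line)
```

In this sketch the user speaks, types the one value where exact characters matter, and then resumes speaking, all within the same conversation object.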
The introduction of text and voice multimodality offers several key advantages:
Our multimodal Conversational AI includes the following functionalities:
This new multimodal functionality is natively supported across our platform:
Multimodal interactions benefit from all the existing innovations within our Conversational AI platform:
To begin using text and voice multimodality with your ElevenLabs Conversational AI agents:
We believe that text+voice multimodality will significantly enhance the capabilities and user experience of Conversational AI. We look forward to seeing how our users leverage this powerful new feature.