Inworld Joins Forces with ElevenLabs to Bring Dynamic Voices to AI NPCs

Inworld adds sought-after voice capabilities to its world-leading AI Character Engine.

SAN FRANCISCO, Aug 8, 2023 - Inworld AI, the leading character engine for AI NPCs, today announced a new partnership with ElevenLabs, the industry leader in voice AI. Realistic voices add an emotional dimension to AI NPCs and allow them to adapt their tone, intonation, and style of speech based on the context of the user interaction.

The integration is featured in a mod of Grand Theft Auto V in which the NPCs are powered by Inworld. In the mod, developed by Bloc, players take on the role of a police officer who, along with an NPC partner, is tasked with a unique series of missions in which every NPC can hold unscripted, realistic conversations in real time.

The Inworld Character Engine goes beyond LLMs (large language models), incorporating configurable safety, knowledge, memory, narrative controls, multimodality and now, realistic voices. With just a few clicks, Inworld customers can add ElevenLabs voices to their AI NPCs - either by choosing one of ElevenLabs’ pre-made custom voices, cloning their own voices using a one-minute sample, or by synthesizing entirely new voices from scratch.

“We are thrilled to incorporate ElevenLabs’ real-time speech technology, which strengthens our already comprehensive off-the-shelf system for generative AI NPC creation,” said Kylan Gibbs, Chief Product Officer of Inworld. “By responding to community demand for enhanced voice capabilities, we get one step closer to making characters more believable and lifelike. We’re equipping developers with the tools to go beyond dialogue trees and scripted interactions.”

“By combining our leading AI speech software with Inworld’s platform, we are pushing the boundaries of immersive gaming experiences and adding an extra layer of possibility to gaming worlds,” said Mati Staniszewski, CEO of ElevenLabs. “Our multi-purpose tool brings top-quality spoken audio to AI characters, incorporating human-like intonation and inflection while adapting to contextual cues. We are very excited about this development and can’t wait to see how it is used by the wider developer community.”

Both companies are known for bringing innovation to their respective areas of expertise in AI. Inworld has combined multiple AI and ML (machine learning) models to create characters that can hold unscripted, open-ended conversations, experience emotions, respond to triggers, remember shared lore or brand knowledge, and pursue their own goals. The characters can easily integrate with any avatar or character visuals and all major game engines.

ElevenLabs is a world leader in AI voice technology research. Its multi-purpose speech synthesis tool renders human intonation and emotions hyper-realistically, and it can adjust delivery based on the context, providing lifelike spoken audio in any voice and style. Its technology has already been embraced by a number of market verticals and industries, ranging from content creation and video game development to publishing and accessibility.

“Our goal at Inworld is to power the most believable, lifelike AI characters for games and immersive media,” concluded Kylan Gibbs. “A player should be able to ask an NPC anything, and that NPC needs to respond in a way that maintains the integrity of the game narrative and character design. We are looking forward to continuing to deliver infinite worlds and experiences with realistic voices that augment the sophistication of our offering.”

For more information please visit: ElevenLabs and Inworld.

About ElevenLabs

Established in 2022, ElevenLabs is a voice technology research company developing world-leading text-to-speech software for publishers, creators and developers. Their mission is to make content universally accessible.

ElevenLabs’ research teams have pioneered text-to-speech capabilities combining two novel approaches to speech synthesis to achieve ultra-realistic delivery: context awareness and high compression.

The ElevenLabs model is able to understand the relations between words and to adjust delivery based on context (‘contextual’ text-to-speech). Because there are no hardcoded voice features in the model, it can robustly predict thousands of voice characteristics while generating AI voices.

About Inworld AI

Inworld AI is the leading Character Engine for powering AI-driven characters in gaming, entertainment, and interactive experiences. Recent AI NPC experiences include Team Miaozi (NetEase Games), Niantic, ILM Immersive, LG UPlus, and Alpine Electronics.

Founded in 2021 by experts that have pioneered conversational AI platforms and generative models at API.AI (acquired by Google and renamed Dialogflow), Google, and DeepMind, Inworld uses advanced AI to build generative characters whose personalities, thoughts, memories, and behaviors are designed to mimic the deeply social nature of human interaction.

Inworld is backed by Lightspeed Venture Partners, Section 32, Intel Capital, Founders Fund, the Disney Accelerator, Microsoft’s M12 fund, BITKRAFT Ventures, First Spark Ventures co-founded by Eric Schmidt, The Venture Reality Fund, Kleiner Perkins, CRV, Stanford University, Meta, Micron Ventures, LG Technology Ventures, Samsung Next, NTT Docomo Ventures, and SK Telecom Venture Capital. Inworld’s advisors include award-winning writer and futurist Neal Stephenson, AI visionary and vice president at Google (formerly at Unity) Danny Lange, and immersive entertainment pioneer and former senior vice president of research and development at Walt Disney Imagineering Jon Snoddy.

For more information, visit
