Xaia uses both speech to text and text to speech to improve patient care
Xaia is a clinical assistant built to streamline workflows, automate documentation, and support patients in real time throughout their care journey.
For medical use, Xaia requires precise transcription and context awareness. Its initial Speech-to-Text (STT) model introduced serious risks: it often generated phrases that were never spoken, and even inserted false commentary during silence. In a clinical setting, where accuracy is critical, this was unacceptable. The system also missed key non-verbal cues such as laughter, crying, or coughing, limiting the context clinicians rely on.
Switching to ElevenLabs’ Scribe fixed these issues. Our model sharply reduced hallucinations and produced highly accurate transcripts that include important contextual sounds. This gave Xaia a fuller understanding of each patient interaction.
Now deployed in hospitals, Xaia uses ElevenLabs for both responsive Speech-to-Text (STT) and clear, natural Text-to-Speech (TTS). This integration directly benefits patients through reliable mental health support interactions, while clinicians gain access to more comprehensive and trustworthy patient data. The time savings for clinicians have been significant: around 50% for psychiatrists, for example.
This improved data quality comes from ElevenLabs’ accuracy, specifically the absence of hallucinations, combined with context detection. It saves providers significant time in preparing for and documenting patient encounters.
“Switching to ElevenLabs resulted in a dramatic improvement, removing hallucinations and giving us the full context of patient interactions. We have now deployed Xaia in hospitals to provide patients with reliable mental health support.”
— Omer Liran, MD, MHSH, Co-Founder & CTO, Xaia.health