
Dust adds multilingual voice to AI-driven enterprise workflows using ElevenLabs

Expanding access and productivity with voice-first AI


Dust, the operating system for AI-native enterprises, now includes multilingual voice input and output - powered by ElevenLabs. Designed to integrate AI models into everyday work, Dust needed voice capabilities that could operate across languages, devices, and contexts with low latency and high realism.

This wasn’t exploratory. Voice became a product priority after repeated customer requests. The result: a system that supports hands-free agent interaction during commutes, multilingual collaboration across global teams, and professional audio outputs for asynchronous workflows.

Why voice matters in the enterprise

Dust identified four critical requirements for voice in a work context:

  • Natural quality that holds up to scrutiny: Voice output must sound professional and human - suitable for sharing in client emails, podcasts, or product demos.
  • Multilingual by default: Teams operate across global offices and languages. Switching between French, English, and German within a single session shouldn’t be an edge case.
  • Low latency: For both input and output, response speed must match the pace of thought and conversation.
  • Enterprise-grade data handling: No data retention, region-based routing, and compliance with SOC 2 and GDPR were non-negotiable.

Why Dust chose ElevenLabs

After evaluating providers including OpenAI, Google, Deepgram, and AssemblyAI, Dust selected ElevenLabs for its superior quality and deployment readiness:

  • Text to Speech voices delivered consistently high realism with broad emotional range - critical for Dust’s Speech Generator and Sound Studio tools.
  • Speech to Text supported 99 transcription languages, with strong cross-language fidelity.
  • Zero Data Retention and multi-region routing ensured enterprise compliance out of the box.
  • Production-grade SDKs and APIs enabled rapid integration and consistent performance across platforms.

How Dust integrated voice

Dust built voice support across two core workflows:

1. Voice input: speaking to agents

Using ElevenLabs' scribe_v1 model, users can now talk to agents via microphone. The system automatically detects the spoken language, transcribes it, and routes the request accordingly, even inferring agent names from natural speech.

Voice input is available on mobile, aligning with moments when typing is least convenient.
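
For teams building something similar, the transcription step might look like the following sketch with the ElevenLabs Python SDK. The API key and file name are placeholders, and the agent-name inference and routing described above are Dust's own application logic, not part of the SDK call.

```python
# pip install elevenlabs
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="YOUR_API_KEY")  # placeholder key

# Transcribe a recorded voice message with the scribe_v1 model.
# Scribe detects the spoken language automatically when no
# language_code is passed.
with open("voice_note.m4a", "rb") as audio:
    transcript = client.speech_to_text.convert(
        file=audio,
        model_id="scribe_v1",
    )

print(transcript.language_code)  # e.g. "fr"
print(transcript.text)           # the raw transcription

# Inferring the target agent from the transcript and routing the
# request is handled in Dust's application layer, not shown here.
```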

2. Voice output: audio generated by agents

Through Speech Generator, Dust agents can create audio content using ElevenLabs’ eleven_multilingual_v2 and eleven_v3 models. Output includes podcasts, briefings, and narrative audio artifacts - used for both internal consumption and external sharing.
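
As an illustrative sketch rather than Dust's actual integration code, a generation call with the ElevenLabs Python SDK looks roughly like this; the voice ID, text, and API key are placeholders:

```python
# pip install elevenlabs
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="YOUR_API_KEY")  # placeholder key

# Generate narrated audio with the multilingual model.
# convert() streams the audio back in chunks, joined here
# into a single MP3 file.
audio_stream = client.text_to_speech.convert(
    voice_id="JBFqnCBsd6RMkjVDRZzb",  # placeholder voice ID
    model_id="eleven_multilingual_v2",
    text="Voici le résumé de la réunion d'aujourd'hui.",
    output_format="mp3_44100_128",
)

with open("briefing.mp3", "wb") as f:
    for chunk in audio_stream:
        f.write(chunk)
```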

Sound Studio, powered by Text to Sound Effects, adds non-verbal audio layers for training and content use cases.
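
The sound-effects call follows the same pattern; a hedged sketch with the same SDK, where the prompt, duration, and API key are illustrative:

```python
# pip install elevenlabs
from elevenlabs.client import ElevenLabs

client = ElevenLabs(api_key="YOUR_API_KEY")  # placeholder key

# Generate a short non-verbal audio layer from a text prompt.
effect = client.text_to_sound_effects.convert(
    text="Soft office ambience with distant keyboard typing",
    duration_seconds=5.0,  # illustrative duration
)

with open("ambience.mp3", "wb") as f:
    for chunk in effect:
        f.write(chunk)
```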

What Dust learned

  • Regional routing matters: Enabling EU/US region selection reduced latency and eased compliance conversations.
  • Curation beats abundance: A curated set of 12 voices reduces decision fatigue while covering all core needs.
  • Quality > speed: Despite faster models being available, users consistently chose higher-fidelity voices for production content.

What this enables

  • Mobile-first productivity: Capture thoughts and collaborate on the move.
  • Multilingual collaboration: Speak naturally in your own language - agents handle the rest.
  • Accessible, async workflows: Turn research into audio, lower input barriers, and support diverse working styles.

What’s next

Dust is exploring real-time conversational voice agents, deeper audio understanding beyond transcription, and support for long-form inputs like meetings and presentations. By integrating ElevenLabs, Dust makes voice a seamless part of enterprise AI.
