Sound effects | ElevenLabs Documentation

Overview

The ElevenLabs sound effects API turns text descriptions into high-quality audio effects with precise control over timing, style, and complexity. The model understands both natural language and audio terminology, enabling you to:

Generate cinematic sound design for films & trailers
Create custom sound effects for games & interactive media
Produce Foley and ambient sounds for video content

Usage

Sound effects are generated from a text description and two optional parameters:

Duration: Set a specific length for the generated audio (in seconds)
- Default: Automatically determined based on the prompt
- Range: 0.1 to 22 seconds
- Cost: 40 credits per second when duration is specified
Prompt influence: Control how strictly the model follows the prompt
- High: More literal interpretation of the prompt
- Low: More creative interpretation with added variations
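
As a minimal sketch of how these two parameters might be assembled into a request, the Python below validates a duration against the 0.1 to 22 second range and estimates the credit cost at 40 credits per second. The field names `text`, `duration_seconds`, and `prompt_influence`, the 0 to 1 scale for prompt influence, and both helper functions are assumptions made for illustration, not taken from this page; check the API reference for the actual request schema.

```python
from typing import Optional

MIN_DURATION_S = 0.1     # lower bound stated above
MAX_DURATION_S = 22.0    # upper bound stated above
CREDITS_PER_SECOND = 40  # applies only when a duration is specified


def build_sound_effect_request(
    prompt: str,
    duration_seconds: Optional[float] = None,
    prompt_influence: Optional[float] = None,
) -> dict:
    """Build a request body. Field names are assumed, not confirmed by this page."""
    if not prompt.strip():
        raise ValueError("prompt must not be empty")
    body = {"text": prompt}
    if duration_seconds is not None:
        if not MIN_DURATION_S <= duration_seconds <= MAX_DURATION_S:
            raise ValueError("duration must be between 0.1 and 22 seconds")
        body["duration_seconds"] = duration_seconds
    if prompt_influence is not None:
        # Assumed scale: 0.0 (more creative) to 1.0 (more literal).
        if not 0.0 <= prompt_influence <= 1.0:
            raise ValueError("prompt_influence must be between 0 and 1")
        body["prompt_influence"] = prompt_influence
    return body


def estimate_credits(duration_seconds: float) -> float:
    """Credit cost for a generation with an explicit duration."""
    return duration_seconds * CREDITS_PER_SECOND
```

Under these assumptions, a 5-second effect with an explicit duration would cost 5 × 40 = 200 credits. Omitting both optional arguments yields a body containing only the prompt, leaving the duration to be determined automatically.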

Developer quickstart

Learn how to integrate sound effects into your application.

Product guide

Step-by-step guide for using sound effects in ElevenLabs.

Prompting guide

Simple effects

For basic sound effects, use clear, concise descriptions:

“Glass shattering on concrete”
“Heavy wooden door creaking open”
“Thunder rumbling in the distance”

Complex sequences

For multi-part sound effects, describe the sequence of events:

“Footsteps on gravel, then a metallic door opens”
“Wind whistling through trees, followed by leaves rustling”
“Sword being drawn, then clashing with another blade”
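
If the events of a sequence are held as a list in application code, a small helper can join them into a single prompt in the style of the examples above. This is only an illustrative sketch; `sequence_prompt` is a hypothetical helper, and ", then" is just one of the connectives shown above.

```python
from typing import List


def sequence_prompt(events: List[str]) -> str:
    """Join ordered events into one prompt of the form 'A, then B, then C'."""
    cleaned = [event.strip() for event in events if event.strip()]
    if not cleaned:
        raise ValueError("at least one event is required")
    return ", then ".join(cleaned)


prompt = sequence_prompt(["Footsteps on gravel", "a metallic door opens"])
# prompt == "Footsteps on gravel, then a metallic door opens"
```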

Musical elements

The API also supports generation of musical components:

“90s hip-hop drum loop, 90 BPM”
“Vintage brass stabs in F minor”
“Atmospheric synth pad with subtle modulation”

Audio terminology

Common terms that can enhance your prompts:

Impact: Collision or contact sounds between objects, from subtle taps to dramatic crashes
Whoosh: Movement through air effects, ranging from fast and ghostly to slow-spinning or rhythmic
Ambience: Background environmental sounds that establish atmosphere and space
One-shot: Single, non-repeating sound
Loop: Repeating audio segment
Stem: Isolated audio component
Braam: Big, brassy cinematic hit that signals epic or dramatic moments, common in trailers
Glitch: Sounds of malfunction, jittering, or erratic movement, useful for transitions and sci-fi
Drone: Continuous, textured sound that creates atmosphere and suspense

FAQ

What's the maximum duration for generated effects?

The maximum duration is 22 seconds per generation. For longer sequences, generate multiple effects and combine them.
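
One way to plan a longer sequence is to split the target length into equal segments that each fit within the 22-second limit. The sketch below shows only that arithmetic; `split_duration` is a hypothetical helper, equal-length segments are a design choice rather than a requirement, and joining the generated audio is left to whatever audio tool you use.

```python
import math
from typing import List

MIN_DURATION_S = 0.1   # minimum duration per generation
MAX_DURATION_S = 22.0  # maximum duration per generation


def split_duration(total_seconds: float, max_segment: float = MAX_DURATION_S) -> List[float]:
    """Split a total length into the fewest equal segments that fit the limit."""
    if total_seconds < MIN_DURATION_S:
        raise ValueError("total duration must be at least 0.1 seconds")
    count = math.ceil(total_seconds / max_segment)
    return [total_seconds / count] * count


segments = split_duration(60)
# segments == [20.0, 20.0, 20.0]
```

A 60-second sequence therefore needs three generations of 20 seconds each, since two 22-second segments would cover only 44 seconds.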

Can I generate music with this API?

Yes, you can generate musical elements like drum loops, bass lines, and melodic samples. However, for full music production, consider combining multiple generated elements.

How do I ensure consistent quality?

Use detailed prompts, appropriate duration settings, and high prompt influence for more predictable results. For complex sounds, generate components separately and combine them.

What audio formats are supported?

Generated audio is provided in MP3 format with professional-grade quality (44.1kHz, 128-192kbps).
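
For a rough sense of file size at these bitrates, the sketch below multiplies bitrate by duration. It assumes a constant bitrate and ignores container headers and metadata, so treat the results as estimates; `estimate_mp3_bytes` is a hypothetical helper.

```python
def estimate_mp3_bytes(duration_seconds: float, bitrate_kbps: int) -> int:
    """Approximate MP3 size in bytes, assuming a constant bitrate."""
    bits = duration_seconds * bitrate_kbps * 1000  # 1 kbps = 1000 bits per second
    return int(bits / 8)


low = estimate_mp3_bytes(22, 128)   # 352000 bytes, about 352 kB
high = estimate_mp3_bytes(22, 192)  # 528000 bytes, about 528 kB
```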