It is often said that audio is more important than visuals. Most people can accept bad visuals, but most can’t stand bad audio. Audio also evokes emotions and sets moods for your audience; it can be subtle, or it can be bombastic. The sounds and music you use in your production can completely change the emotional context and meaning of what you are trying to convey.

However, finding that perfect sound can be quite difficult. ElevenLabs makes this much easier: our sound effects generator lets you generate almost any sound imaginable from a text prompt, which streamlines the process tremendously. It is a great tool for independent filmmakers and indie game developers, and it is just as useful for big productions, sound designers, and producers, because it can generate such a vast array of sounds.

We will go through some of them in this documentation, but keep in mind that this only scratches the surface. While the feature might seem simple at first glance, the AI’s understanding of natural language, combined with the range of sound effects it can generate, opens up a huge number of possibilities.

The general layout for sound effects is fairly straightforward: a window where you input a prompt, some settings, and a generate button. When you first open the web page, a few suggested prompts appear below the text box; these show what a prompt might look like and are easy to try out.

Each time you press generate, the AI will create full variations of the prompt you’ve given. The cost for using the sound effects generator is based on the length of the generated audio. If you let the AI decide the audio length itself, the cost is 200 characters per generation. If you set the duration yourself, the cost is 40 characters per second.
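To make the pricing concrete, here is a minimal sketch of the two rates described above. The function name is made up for illustration, and it assumes fractional seconds are billed proportionally, which this page does not confirm.

```python
from typing import Optional

AUTO_DURATION_COST = 200  # characters per generation when the AI picks the length
COST_PER_SECOND = 40      # characters per second when you set the duration yourself


def generation_cost(duration_seconds: Optional[float] = None) -> float:
    """Return the character cost of one generation.

    Pass None to model the automatic-length setting.
    Assumption: fractional seconds are billed proportionally.
    """
    if duration_seconds is None:
        return AUTO_DURATION_COST
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return COST_PER_SECOND * duration_seconds


print(generation_cost())    # automatic length -> 200
print(generation_cost(11))  # 11-second clip   -> 440
print(generation_cost(1))   # 1-second one-shot -> 40
```

Under these rates, a manually set duration of 5 seconds costs the same as an automatic generation; anything shorter is cheaper, and anything longer costs more.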

Prompting

A prompt is a piece of text or instruction that communicates to the AI model what kind of response or output is expected. The prompt serves as a starting point or context for the AI to understand the user’s intent and generate relevant and coherent output accordingly.

In this section, we will go through how to construct a good prompt as well as what a prompt entails. We will then categorize these prompts into simple prompts and complex prompts. In general, simple prompts instruct the AI to generate one sound, while complex prompts guide the AI to generate a series of sounds.

The AI understands both natural language, which will be explored further in complex prompts, and a lot of music terminology. Sound Effects currently works best when prompts are written in English.

Simple Prompts

Simple prompts are just that: straightforward prompts where we try to get the AI to generate a single sound effect. This could be, for example, “person walking on grass” or “glass breaking.” These prompts generate a single type of sound effect, with a few variations within the same generation or in subsequent generations.

However, there are ways to improve these prompts by adding a little more detail. Even if they are simple prompts, they can yield better output by enhancing the prompt itself. For example, something that sometimes works is adding details like “high-quality, professionally recorded footsteps on grass, sound effects foley.” It may require some experimentation to find a good balance between being descriptive and keeping it brief enough for the AI to understand the prompt.

Opening a creaking door

Chopping wood

These types of prompts generate a single type of sound, but they might produce multiple variations of that sound within the same audio file. The AI is quite prone to doing that even without additional prompting, especially for short sounds like chopping wood, which is also a repeated, continuous action.
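If you build prompts in a script, the idea of adding detail to a simple prompt can be sketched as a small helper that wraps a core sound with quality descriptors and trailing tags. The function and its parameter names are hypothetical; it only assembles text and does not call any ElevenLabs API.

```python
from typing import Sequence


def build_prompt(sound: str,
                 qualities: Sequence[str] = (),
                 tags: Sequence[str] = ()) -> str:
    """Assemble a prompt: quality descriptors, then the sound, then tags."""
    # Descriptors go in front of the sound, e.g. "high-quality, ... footsteps".
    description = " ".join(filter(None, [", ".join(qualities), sound.strip()]))
    # Tags such as "sound effects foley" are appended after a comma.
    return ", ".join([description, *tags])


print(build_prompt("footsteps on grass"))
print(build_prompt(
    "footsteps on grass",
    qualities=["high-quality", "professionally recorded"],
    tags=["sound effects foley"],
))
```

The second call reproduces the enhanced prompt quoted above. Keeping the core sound separate from the descriptors makes it easier to experiment with how much detail helps.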

Complex Prompts

When referring to complex prompts, we don’t mean the length of the prompt or the adjectives and adverbs used in it. Although those can add complexity, by complex prompts we mean prompts that describe multiple sound effects, or a sequence of sound effects happening in a specific order, which the AI then has to replicate.

A man walks through a hallway and then falls down some stairs

Let’s take the prompt above as an example. The AI needs to understand both what a man walking through the hallway sounds like and what a man falling down some stairs sounds like. It needs to understand the sequence in which these two actions are supposed to occur based on how you wrote it and then combine these sounds to make both of them sound coherent and correct. This is what we mean by a complex prompt because it involves both an understanding of sound and an understanding of the natural language explaining what you want.

The AI can handle this, and the result for the example prompt above should usually be accurate.

However, this is much harder for the AI to get right, because it has to interpret the sequence as well as produce each sound. For the best results, we recommend generating individual sound effects and then combining them in an audio editor of your choice, much like you would in a real production, where you have individual sound effects that are then combined.

A woman is singing in a church. Then someone coughs.
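If you prefer a script to an audio editor, the "generate separately, then combine" approach can be sketched with Python's standard `wave` module. This assumes you have saved each generation as an uncompressed PCM WAV file and that all clips share the same channel count, sample width, and sample rate; the file names in the usage comment are hypothetical.

```python
import wave
from typing import Sequence


def concatenate_wavs(paths: Sequence[str], out_path: str,
                     gap_seconds: float = 0.0) -> None:
    """Join WAV clips end to end, with optional silence between them."""
    if not paths:
        raise ValueError("need at least one input clip")

    fmt = None
    chunks = []
    for path in paths:
        with wave.open(path, "rb") as clip:
            clip_fmt = (clip.getnchannels(), clip.getsampwidth(),
                        clip.getframerate())
            if fmt is None:
                fmt = clip_fmt
            elif clip_fmt != fmt:
                raise ValueError(f"{path} has a different format: {clip_fmt}")
            chunks.append(clip.readframes(clip.getnframes()))

    channels, sample_width, frame_rate = fmt
    # All-zero bytes are silence for signed PCM (16-bit and wider);
    # 8-bit WAV is unsigned, so this gap would not be silent there.
    gap_frames = round(gap_seconds * frame_rate)
    silence = bytes(gap_frames * channels * sample_width)

    with wave.open(out_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(sample_width)
        out.setframerate(frame_rate)
        out.writeframes(silence.join(chunks))


# Example (hypothetical file names):
# concatenate_wavs(["hallway_footsteps.wav", "stairs_fall.wav"],
#                  "scene.wav", gap_seconds=0.2)
```

This only places clips one after another. Overlapping or crossfading sounds, as you would for a church ambience with a cough on top, still calls for an audio editor or a mixing library.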

Settings

Once you’ve written your prompt and know what you want to generate, you can adjust the settings: how long the generated audio should be and how strongly the prompt should influence the output.

There are just two settings:

Duration: Determine how long your generations should be. Depending on what you set this as, you can get quite different results. For example, if I write “kick drum” and set the length to 11 seconds, I might get a full drum loop with a kick drum in it, but that might not be what I want. On the other hand, if I set the length to 1 second, I might just get a one-shot with a single instance of a kick drum.

Prompt Influence: Slide the scale to make your generation perfectly adhere to your prompt or allow for a little creativity. This setting ranges from giving the AI more creativity in how it interprets the prompt to telling the AI to be more strict in following the exact prompt that you’ve given.
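As a rough illustration, the two settings can be modelled as a small data structure alongside the prompt. Everything here is an assumption for the sake of the sketch: the class name, the field names, the 0 to 1 scale for prompt influence, and the default of 0.5 are not taken from this page, so check the API reference for the actual parameter names and limits.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass(frozen=True)
class SoundEffectRequest:
    """Hypothetical container for a prompt and the two settings."""
    prompt: str
    duration_seconds: Optional[float] = None  # None = let the AI decide
    prompt_influence: float = 0.5             # assumed scale: 0 (creative) to 1 (strict)

    def __post_init__(self) -> None:
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.duration_seconds is not None and self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")
        if not 0.0 <= self.prompt_influence <= 1.0:
            raise ValueError("prompt_influence must be between 0 and 1")


# Same prompt, different durations: likely a one-shot versus a loop.
one_shot = SoundEffectRequest("kick drum", duration_seconds=1, prompt_influence=0.9)
loop = SoundEffectRequest("kick drum", duration_seconds=11)

print(asdict(one_shot))
print(asdict(loop))
```

Keeping the settings next to the prompt makes it easy to record which combination produced a given result while you experiment.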

Sound Effects

Now that we are dealing with prompts, it is important to learn some terminology when it comes to audio to get the most out of the feature. You will have to prompt the AI with words and sentences in a way that it understands, and in this case, it understands both natural language and audio terminology.

Many of the words that people working with audio use every day are completely foreign to everyone else and might not mean anything to them. Below is a short and far from comprehensive list of words that you might want to test and that might be helpful to know.

Foley: The process of recreating and recording everyday sound effects like footsteps, movement, and object sounds in sync with the visuals of a film, TV show, or video game to enhance the audio quality and realism.

Whoosh: An effect that underscores movement, like a fist flying or a camera move. It’s versatile and can be fast, ghostly, slow-spinning, rhythmic, noisy, or tense.

Impact: The sound of an object making contact with another object or structure, like a book falling, a car crashing, or a mug shattering.

Braam: A big, brassy, cinematic hit that signals that something epic and grand is about to happen, commonly used in movie trailers. The example prompts below spell it “braaam.”

Glitch: The sound of a malfunction, jittering, scratching, skipping, or moving erratically, used for transitions, logo reveals, or sci-fi soundscapes.

Drone: A continuous, textured sound that adds atmosphere and suspense, often used to underscore exploration or horror scenes.

Onomatopoeias like “oink,” “meow,” “roar,” and “chirp” are words that imitate natural sounds, and they can also be useful in prompts.

Examples

high-quality, wav, sound designed whoosh and braaam impact

In a case like this, it can sometimes be better to set the length instead of letting the AI decide. I know that I want a drawn-out “braaam” for this sound, so it will not be a very short sound. I will show you the results I get using both automatic and manual settings.

Duration set to 11 seconds:

Duration set to automatic:

high-quality, wav, sound designed whoosh

high-quality, wav, sound designed whoosh, aggressive

high-quality, wav, sound designed whoosh, aggressive, futuristic, electronic
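The whoosh prompts above grow by adding descriptors to a fixed base. A minimal sketch of that workflow is below; the helper name is made up, and it adds one descriptor per step, which yields one more intermediate prompt than the three listed above. Generating each step lets you hear what every added word contributes.

```python
from typing import Iterator, Sequence


def prompt_ladder(base: str, modifiers: Sequence[str]) -> Iterator[str]:
    """Yield the base prompt, then the prompt with one more modifier each step."""
    prompt = base
    yield prompt
    for modifier in modifiers:
        prompt = f"{prompt}, {modifier}"
        yield prompt


for prompt in prompt_ladder(
    "high-quality, wav, sound designed whoosh",
    ["aggressive", "futuristic", "electronic"],
):
    print(prompt)
```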

Beyond Sound Effects

Even if the name of the feature is “Sound Effects,” don’t let that fool you. This is the perfect tool for sound designers, Foley artists, game developers, as well as producers and composers.

If you’re a hip-hop producer looking for samples, whether new or more old school, and are tired of digging in crates or reusing the same overused samples that everyone else uses, this is the perfect tool for you. If you are an EDM producer looking for one-shots or other samples, it’s perfect for you as well.

You can generate everything from individual one-shots to drum loops, instrumental loops, and unique new samples from big band sections and brass stabs—pretty much anything you can imagine.

I will go through a little of how to prompt for this, starting with some useful terms, but expect a lot of trial and error before you get what you want.

Stem: An individual track from a multitrack recording, such as isolated vocals, drums, or guitar.

BPM: Beats per minute, indicating the tempo of a piece of music.

Key: The scale in which a piece of music is set, such as C major or A minor.

Loop: A repeating section of sound material, commonly used in electronic music.

Sample: A portion of sound, typically a recording, used in musical compositions.

One-shot: A single, non-repeating sound or sample, often used in percussion.

These terms are, of course, just scratching the surface, as there are concepts such as synth pads, basslines, chord progressions, arpeggios, and many other musical terms that can be good to learn. However, the above can be a good starting point for generating musical material.
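BPM is also useful for choosing a duration when you want a loop that ends on a bar line. The arithmetic below is a sketch that assumes 4/4 time by default (four beats per bar); whether the generated audio follows the requested tempo exactly is something you will need to check by ear.

```python
def loop_duration_seconds(bpm: float, bars: int, beats_per_bar: int = 4) -> float:
    """Length in seconds of `bars` bars at `bpm`, assuming a fixed tempo."""
    if bpm <= 0 or bars <= 0 or beats_per_bar <= 0:
        raise ValueError("bpm, bars, and beats_per_bar must be positive")
    # One beat lasts 60 / bpm seconds.
    return bars * beats_per_bar * 60.0 / bpm


for bars in (1, 2, 4):
    print(bars, "bar(s) at 88 BPM:", round(loop_duration_seconds(88, bars), 2), "s")
# 1 bar  -> 2.73 s
# 2 bars -> 5.45 s
# 4 bars -> 10.91 s
```

At 88 BPM, for example, a four-bar loop is just under 11 seconds, so a duration close to that gives the loop room to complete.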

Examples

You can create individual one-shot drum sounds.

90s hip-hop beat, drum loop sample

Old-school funky bassline sample, stem, 88 BPM in F# minor

Old-school funky brass section from an old vinyl sample, stem, 88 BPM in F# minor

Old-school funky brass stabs from an old vinyl sample, stem, 88 BPM in F# minor

I do not remember the exact prompts for these, but they sounded good, so I wanted to include them. They showcase some other genres.

Then, of course, you can take different types of samples and combine them into a full track. Anyone who has worked as a producer, especially in old-school hip-hop, where sampling and editing old tracks is the essence of the genre, will find this a treasure trove of new samples to use.

A professional producer could create something amazing with these types of samples, and we are very excited to see what might be developed. Here is a quick demo to show what just a few generations and a couple of minutes of work can achieve.

This is the final product.