It is said that audio is more important than visuals. Most people can accept bad visuals, but most can’t stand bad audio. Audio also evokes emotions and sets moods for your audience; it can be subtle, or it can be bombastic. The type of sounds and music that you use in your production can completely change the emotional context and meaning of what you are trying to convey.

However, it is sometimes quite difficult to find that perfect sound. With ElevenLabs, this has gotten a whole lot easier: our sound effects generator allows you to generate any sound imaginable by inputting a prompt, streamlining the process tremendously. Of course, this is not only a great tool for independent filmmakers or indie games. It is also a fantastic resource for big productions, sound designers, and producers, because you can generate such a vast array of sounds.

We will go through some of them here in this documentation. Keep in mind that this is just scratching the surface. While the feature might seem simple at first glance, the understanding that the AI has of natural language, together with the type of sound effects it can generate, gives way to infinite possibilities.

The general layout for sound effects is fairly straightforward. You have a window where you will input a prompt, you have some settings, and you have a generate button. When you first open the web page, you will find a few suggestions below the text box that showcase what prompts might look like and that you can easily try out.

Each time you press generate, the AI will generate four variations of the prompt that you’ve given. The cost for using the sound effects generator is based on the length of the generated audio. If you let the AI decide the audio length for itself, the cost is 200 characters per generation. If you set the duration yourself, the cost is 40 characters per second.
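To make the pricing above concrete, here is a minimal sketch of the arithmetic. The function name is hypothetical, and the sketch assumes that a manually set duration is billed proportionally with no rounding, which this guide does not specify.

```python
from typing import Optional

AUTO_DURATION_COST = 200  # characters per generation when the AI picks the length
COST_PER_SECOND = 40      # characters per second when you set the duration yourself


def generation_cost(duration_seconds: Optional[float] = None) -> float:
    """Return the character cost of one generation (hypothetical helper).

    Pass None to model the automatic-length setting.
    Assumption: manual durations are billed proportionally, with no rounding.
    """
    if duration_seconds is None:
        return AUTO_DURATION_COST
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")
    return COST_PER_SECOND * duration_seconds


print(generation_cost())    # automatic length
print(generation_cost(1))   # a 1-second one-shot
print(generation_cost(11))  # an 11-second clip
```

Under these rates, the two options cost the same at 5 seconds (200 / 40), so a manually set duration shorter than that is cheaper than letting the AI decide, and a longer one is more expensive.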

Prompting

A prompt is a piece of text or instruction that communicates to the AI model what kind of response or output is expected from it. The prompt serves as a starting point or context for the AI to understand the user’s intent and generate relevant and coherent output accordingly.

In this section, we will go through what a prompt is and how to construct a good one. We will then split prompts into simple prompts and complex prompts. In general, simple prompts ask the AI to generate one sound, while complex prompts instruct the AI to generate a series of sounds.

The AI understands natural language, which we will go into a little more under complex prompts, and it also understands a lot of music terminology.

Simple Prompts

Simple prompts are just that: they are simple, one-sided prompts where we try to get the AI to generate a single sound effect. This could be, for example, “person walking on grass” or “glass breaking.” These types of prompts will generate a single type of sound effect with a few variations either in the same generation or in subsequent generations. All in all, they are fairly simple.

There are a few ways to improve these prompts, however, mainly by adding a little more detail. Even simple prompts can give better output when the prompt itself is improved. For example, something that sometimes works is adding details like “high-quality, professionally recorded footsteps on grass, sound effects foley.” It can require some experimentation to find a good balance between being descriptive and keeping the prompt brief enough for the AI to understand it.

Opening a creaking door

Chopping wood

These types of prompts generate a single type of sound, but they might generate multiple variations of that sound within the same audio file. The AI seems quite prone to doing that even without extra prompting, especially for short sounds like chopping wood, which is also a continuous action.

Complex Prompts

When talking about complex prompts, we don’t mean the length of the prompt or the adjectives and adverbs used in it. Although those can increase the complexity of the prompt, when we say complex prompts, we mean prompts that describe multiple sound effects, or a sequence of sound effects happening in a specific order, which the AI then has to replicate.

A man walks through a hallway and then falls down some stairs

Let’s take the prompt above as an example. The AI needs to understand both what a man walking through the hallway sounds like, as well as what a man falling down some stairs sounds like. It needs to understand the sequence in which these two things are supposed to happen based on how you wrote it, and then combine these sounds to make both of them sound coherent and correct. This is what we mean when we say a complex prompt because it involves both an understanding of sound and an understanding of natural language explaining what you want.

The AI can do this. For the example prompt above, the result should hopefully be accurate.

However, in general, this is much harder for the AI to get right, because it has to produce several sounds and place them in the correct order. For the best results, we would recommend generating individual sound effects and then combining them in an audio editor of your choice, like you would in a real production where individual sound effects are layered and sequenced.

A woman is singing in a church. Then someone coughs.
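If you would rather script the recommendation above than open an audio editor, the following is a minimal sketch that joins separately generated clips end to end using Python's standard `wave` module. The function name and file names are hypothetical. It assumes the clips have already been saved or converted to uncompressed 16-bit (or wider) PCM WAV files that share the same channel count and sample rate, which is an assumption and not something this guide states about the generator's output.

```python
import wave


def concatenate_wavs(clip_paths, out_path, gap_seconds=0.0):
    """Join WAV clips end to end, with optional silence between them.

    Hypothetical helper: all clips must share channels, sample width and rate.
    """
    if not clip_paths:
        raise ValueError("need at least one clip")

    fmt = None   # (channels, sample width in bytes, frame rate) of the first clip
    chunks = []  # raw PCM frames of each clip, in playback order
    for path in clip_paths:
        with wave.open(path, "rb") as clip:
            clip_fmt = (clip.getnchannels(), clip.getsampwidth(), clip.getframerate())
            if fmt is None:
                fmt = clip_fmt
            elif clip_fmt != fmt:
                raise ValueError(f"{path} has a different format: {clip_fmt} != {fmt}")
            chunks.append(clip.readframes(clip.getnframes()))

    channels, sample_width, frame_rate = fmt
    # Zero bytes are silence for 16-bit and wider PCM. 8-bit WAV is unsigned,
    # so this sketch assumes 16-bit or wider samples.
    silence = bytes(int(gap_seconds * frame_rate) * channels * sample_width)

    with wave.open(out_path, "wb") as out:
        out.setnchannels(channels)
        out.setsampwidth(sample_width)
        out.setframerate(frame_rate)
        # The silence is inserted between clips only, not before or after.
        out.writeframes(silence.join(chunks))


# Example call (file names are placeholders):
# concatenate_wavs(["hallway_footsteps.wav", "fall_down_stairs.wav"], "scene.wav", gap_seconds=0.2)
```

This only places sounds one after another. Overlapping or layering sounds, fades, and level matching are still easier in a proper audio editor.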

Settings

Once you’ve set your prompt and you know what you want to generate, you can jump into the settings. Set how long you want the generated audio to be and how influential the prompt should be to the output.

There are just two settings:

Duration: Determine how long your generations should be. Depending on what you set this as, you can get quite different results. For example, if I write “kick drum” and set the length to 11 seconds, I might get a full drum loop with a kick drum in it, but that might not be what I want. On the other hand, if I set the length to 1 second, I might just get a one-shot with a single instance of a kick drum.

Prompt Influence: Slide the scale to make your generation perfectly adhere to your prompt or allow for a little creativity. This setting ranges from giving the AI more creativity in how it interprets the prompt to telling the AI to be more strict in following the exact prompt that you’ve given.
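When experimenting, it can help to keep a prompt and its two settings together so that you can reproduce a result later. The following is a minimal sketch of such a record. The class and field names are hypothetical, and the 0.0 to 1.0 scale for prompt influence is an assumption made for illustration; this guide only describes it as a slider.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SoundEffectSettings:
    """One prompt plus the two settings described above (hypothetical record)."""

    prompt: str
    duration_seconds: Optional[float] = None  # None means "let the AI decide"
    prompt_influence: float = 0.5             # assumed scale: 0.0 = more creative, 1.0 = strict

    def __post_init__(self):
        if not self.prompt.strip():
            raise ValueError("prompt must not be empty")
        if self.duration_seconds is not None and self.duration_seconds <= 0:
            raise ValueError("duration_seconds must be positive")
        if not 0.0 <= self.prompt_influence <= 1.0:
            raise ValueError("prompt_influence must be between 0.0 and 1.0")


# The same prompt with two durations, as in the kick drum example above.
one_shot = SoundEffectSettings("kick drum", duration_seconds=1)
drum_loop = SoundEffectSettings("kick drum", duration_seconds=11)
print(one_shot)
print(drum_loop)
```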

Sound Effects

Now that we are dealing with prompts, it is important to learn some terminology when it comes to audio to get the most out of the feature. You will have to prompt the AI with words and sentences in a way that it understands, and in this case, it understands both natural language and audio terminology.

There are a lot of words that people working with audio know very well and use in their daily vocabulary. However, for most other people, those words are completely foreign and might not mean anything. I will give a short and far from comprehensive list of some of the words you might want to test and that might be helpful to know.

Foley: The process of recreating and recording everyday sound effects like footsteps, movement, and object sounds in sync with the visuals of a film, TV show, or video game to enhance the audio quality and realism.

Whoosh: An effect that underscores movement, like a fist flying or a camera move. It’s versatile and can be fast, ghostly, slow-spinning, rhythmic, noisy, or tense.

Impact: The sound of an object making contact with another object or structure, like a book falling, a car crashing, or a mug shattering.

Braam: A big, brassy, cinematic hit that conveys something epic and grand is about to happen, commonly used in movie trailers.

Glitch: The sound of a malfunction, jittering, scratching, skipping, or moving erratically, used for transitions, logo reveals, or sci-fi soundscapes.

Drone: A continuous, textured sound that adds atmosphere and suspense, often used to underscore exploration or horror scenes.

Onomatopoeias like “oink”, “meow”, “roar”, and “chirp” are also important sound effects that imitate natural sounds.

Examples

high-quality, wav, sound designed whoosh and braaam impact

In a case like this, it can sometimes be better to set the length instead of letting the AI decide. I know that I want a drawn out “braaam” for this sound, so it will not be a very short sound. I will show you the results I get using both automatic and manual settings.

Duration set to 11 seconds:

Duration set to automatic:

high-quality, wav, sound designed whoosh

high-quality, wav, sound designed whoosh, aggressive

high-quality, wav, sound designed whoosh, aggressive, futuristic, electronic
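The whoosh prompts above follow a pattern: start from a base prompt and append descriptive words, separated by commas. Here is a minimal sketch for generating such a series so that you can try each step and hear what every added word changes. The function name is hypothetical, and it adds one modifier at a time, which is slightly more fine-grained than the three prompts listed above.

```python
def prompt_variants(base, modifiers):
    """Return the base prompt, then versions that each add one more modifier."""
    variants = [base]
    for modifier in modifiers:
        # Each new variant extends the previous one with a comma-separated word.
        variants.append(f"{variants[-1]}, {modifier}")
    return variants


for prompt in prompt_variants(
    "high-quality, wav, sound designed whoosh",
    ["aggressive", "futuristic", "electronic"],
):
    print(prompt)
```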

Beyond Sound Effects

Even though the feature is called “Sound Effects,” don’t let that fool you. This is the perfect tool for sound designers, Foley artists, game developers, as well as producers and composers.

If you’re a hip-hop producer looking for samples, new or more old school, and are tired of digging in crates or reusing the same overused samples that everyone else uses, this is the perfect tool for you. If you are an EDM producer looking for one-shots or other samples, it’s perfect for you as well.

You can generate everything from individual one-shots to drum loops, instrumental loops, and unique new samples such as big band sections and brass stabs: pretty much anything you can imagine.

I will go through a little bit of how to prompt this, but it is a lot of trial and error to get what you want.

Stem: An individual track from a multitrack recording, such as isolated vocals, drums, or guitar.

BPM: Beats per minute, indicating the tempo of a piece of music.

Key: The scale in which a piece of music is set, such as C major or A minor.

Loop: A repeating section of sound material, commonly used in electronic music.

Sample: A portion of sound, typically a recording, used in musical compositions.

One-shot: A single, non-repeating sound or sample, often used in percussion.

These terms are, of course, just scratching the surface, as there are things such as synth pads, basslines, chord progressions, arpeggios, and many, many other musical terms that can be good to learn. However, the above can be good to get started with generating musical material.
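Since BPM fixes how long a bar lasts, it can be useful to work out how many seconds a loop of a given number of bars takes before setting the duration. Here is a minimal sketch of that calculation. The function name is hypothetical, and it assumes a steady tempo and, by default, four beats per bar. Whether the generated audio follows the requested tempo exactly is not guaranteed, so treat the result as a starting point.

```python
def loop_duration_seconds(bpm, bars, beats_per_bar=4):
    """Length in seconds of a loop with the given tempo and number of bars."""
    if bpm <= 0 or bars <= 0 or beats_per_bar <= 0:
        raise ValueError("bpm, bars and beats_per_bar must be positive")
    # One beat lasts 60 / bpm seconds, and a loop has bars * beats_per_bar beats.
    return bars * beats_per_bar * 60.0 / bpm


for bars in (1, 2, 4):
    print(f"{bars} bar(s) at 88 BPM: {loop_duration_seconds(88, bars):.2f} s")
```

For example, four bars of 4/4 at 88 BPM come out at roughly 10.9 seconds, so a duration of about 11 seconds leaves room for a four-bar loop at that tempo.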

Examples

You can generate an individual one-shot drum sound.

90s hiphop beat, drum loop sample

old-school funky bassline sample, stem, 88bpm in F# minor

old-school funky brass section from an old vinyl sample, stem, 88bpm in F# minor

old-school funky brass stabs from an old vinyl sample, stem, 88bpm in F# minor

I do not remember the exact prompts for these, but they sounded good, so I wanted to include them. They showcase some other genres.

Then, you can, of course, take different types of samples and combine them to create full music. Anyone who’s ever worked as a producer, especially those familiar with old-school hip-hop where sampling old tracks and editing is the essence of the genre, will find this a treasure trove of new samples to use.

A professional producer could make something amazing with these types of samples, and we are very excited to see what might be created. Here is a quick demo to show what just a few generations and a couple of minutes of work can get you.

This is the final product.