Prompting Eleven Music

Master prompting for Eleven Music to achieve maximum musicality and control.

This guide summarizes the most effective techniques for prompting the Eleven Music model. It covers genre & creativity, instrument & vocal isolation, musical control, and structural timing & lyrics.

The model is designed to understand intent and generate complete, context-aware audio based on your goals. High-level prompts like “ad for a sneaker brand” or “peaceful meditation with voiceover” are often enough to guide the model toward tone, structure, and content that match your use case.

Genre & Creativity

The model demonstrates strong adherence to genre conventions and emotional tone, and it responds effectively to both:

  • Abstract mood descriptors (e.g., “eerie,” “foreboding”)
  • Detailed musical language (e.g., “dissonant violin screeches over a pulsing sub-bass”)

Prompt length and detail do not always correlate with higher-quality output. For more creative and unexpected results, try using simple, evocative keywords and let the model interpret and compose freely.

Instrument & Vocal Isolation

The v1 model does not generate stems directly from a full track. To create stems with greater control, use targeted prompts and structure:

  • Use the word “solo” before instruments (e.g., “solo electric guitar,” “solo piano in C minor”).
  • For vocals, use “a cappella” before the vocal description (e.g., “a cappella female vocals,” “a cappella male chorus”).

To improve stem quality and control:

  • Include key, tempo (BPM), and musical tone (e.g., “a cappella vocals in A major, 90 BPM, soulful and raw”).
  • Be as musically descriptive as possible to guide the model’s output.

Musical Control

The model accurately follows BPM and often captures the intended musical key. To gain more control over timing and harmony, include tempo cues like “130 BPM” and key signatures like “in A minor” in your prompt.

To influence vocal delivery and tone, use expressive descriptors such as “raw,” “live,” “glitching,” “breathy,” or “aggressive.”

The model can effectively render multiple vocalists; use prompts like “two singers harmonizing in C” to direct vocal arrangement.

In general, more detailed prompts lead to greater control and expressiveness in the output.

Structural Timing & Lyrics

You can specify the length of the song (e.g., “60 seconds”) or use auto mode to let the model determine the duration. If lyrics are not provided, the model will generate structured lyrics that match the chosen or auto-detected length.

By default, most music prompts will include lyrics. To generate music without vocals, add “instrumental only” to your prompt. You can also write your own lyrics for more creative control. The model uses your lyrics in combination with the prompt length to determine vocal structure and placement.

To manage when vocals begin or end, include clear timing cues like:

  • “lyrics begin at 15 seconds”
  • “instrumental only after 1:45”

The model supports multilingual lyric generation. To change the language of a generated song in our UI, use follow-ups like “make it Japanese” or “translate to Spanish.”

Sample Prompts

The model lets you move beyond song descriptors and prompt with intent for maximum creativity.

Create an intense, fast-paced electronic track for a high-adrenaline video game scene.
Use driving synth arpeggios, punchy drums, distorted bass, glitch effects, and
aggressive rhythmic textures. The tempo should be fast, 130–150 bpm, with rising tension,
quick transitions, and dynamic energy bursts.
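The sample prompt above follows an intent → instrumentation → dynamics shape. As a sketch, it can be assembled from those three parts (purely illustrative string composition, not an official prompt format):

```python
# Assemble the sample prompt from its three conceptual parts.
intent = ("Create an intense, fast-paced electronic track "
          "for a high-adrenaline video game scene.")
instrumentation = ("Use driving synth arpeggios, punchy drums, distorted bass, "
                   "glitch effects, and aggressive rhythmic textures.")
dynamics = ("The tempo should be fast, 130-150 bpm, with rising tension, "
            "quick transitions, and dynamic energy bursts.")

prompt = " ".join([intent, instrumentation, dynamics])
print(prompt)
```

Swapping out any one part (for example, the intent sentence) while keeping the others is an easy way to iterate on a prompt.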