Design a voice | ElevenLabs Documentation

Design a voice via a prompt. This method returns a list of voice previews. Each preview has a generated_voice_id and a sample of the voice as base64-encoded MP3 audio. To create a voice, use the generated_voice_id of the preferred preview with the /v1/text-to-voice endpoint.

Headers

xi-api-key string Required

Query parameters

output_format enum Optional Defaults to mp3_44100_192

Output format of the generated audio, formatted as codec_sample_rate_bitrate. For example, an MP3 with a 22.05 kHz sample rate at 32 kbps is represented as mp3_22050_32. MP3 at 192 kbps requires a subscription to the Creator tier or above. PCM with a 44.1 kHz sample rate requires a subscription to the Pro tier or above. Note that the μ-law format (sometimes written mu-law, often approximated as u-law) is commonly used for Twilio audio inputs.
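As an illustration of the codec_sample_rate_bitrate convention, the following minimal sketch splits a format string into its parts. The helper name parse_output_format and the OutputFormat type are hypothetical and not part of the API or SDK. The sketch also assumes that some formats may omit the bitrate part, which this page does not confirm.

```python
from typing import NamedTuple, Optional


class OutputFormat(NamedTuple):
    codec: str
    sample_rate_hz: int
    bitrate_kbps: Optional[int]  # None when the string has no bitrate part (assumption)


def parse_output_format(value: str) -> OutputFormat:
    """Split a codec_sample_rate_bitrate string into its components."""
    parts = value.split("_")
    if len(parts) not in (2, 3):
        raise ValueError(f"unexpected output_format: {value!r}")
    codec = parts[0]
    sample_rate_hz = int(parts[1])
    bitrate_kbps = int(parts[2]) if len(parts) == 3 else None
    return OutputFormat(codec, sample_rate_hz, bitrate_kbps)
```

For instance, parse_output_format("mp3_22050_32") yields the codec "mp3", a sample rate of 22050 Hz, and a bitrate of 32 kbps.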

Request

This endpoint expects an object.

voice_description string Required 20-1000 characters

Description to use for the created voice.

model_id enum Optional Defaults to eleven_multilingual_ttv_v2

Model to use for the voice generation. Possible values: eleven_multilingual_ttv_v2, eleven_ttv_v3.

text string or null Optional 100-1000 characters

Text to generate. Its length must be between 100 and 1000 characters.

auto_generate_text boolean Optional Defaults to false

Whether to automatically generate text suitable for the voice description.
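The length limits on voice_description and text can be checked client-side before sending a request. The following is a minimal sketch under that assumption; build_text_fields is a hypothetical helper, not an SDK function, and the server remains the authority on what it accepts.

```python
from typing import Optional


def build_text_fields(
    voice_description: str,
    text: Optional[str] = None,
    auto_generate_text: bool = False,
) -> dict:
    """Return the text-related request fields after checking the documented lengths."""
    if not 20 <= len(voice_description) <= 1000:
        raise ValueError("voice_description must be 20-1000 characters")
    if text is not None and not 100 <= len(text) <= 1000:
        raise ValueError("text must be 100-1000 characters")
    body = {
        "voice_description": voice_description,
        "auto_generate_text": auto_generate_text,
    }
    if text is not None:
        body["text"] = text  # omit the field entirely when no text is supplied
    return body
```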

loudness double Optional -1 to 1 Defaults to 0.5

Controls the volume level of the generated voice. -1 is quietest, 1 is loudest, 0 corresponds to roughly -24 LUFS.

seed integer or null Optional 0-2147483647

Random number that controls the voice generation. The same seed with the same inputs produces the same voice.

guidance_scale double Optional 0 to 100 Defaults to 5

Controls how closely the AI follows the prompt. Lower numbers give the AI more freedom to be creative, while higher numbers force it to stick more closely to the prompt. High numbers can cause the voice to sound artificial or robotic. We recommend using longer, more detailed prompts at a lower guidance scale.

stream_previews boolean Optional Defaults to false

Determines whether the Text to Voice preview audio is left out of the response so that it can be streamed instead. If true, only the generated IDs are returned, and each preview can then be streamed via the /v1/text-to-voice/:generated_voice_id/stream endpoint.
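To make the stream path concrete, here is a minimal sketch that builds the stream URL for one generated ID. The helper preview_stream_url is hypothetical, the base URL is the one used in the SDK example on this page, and the sketch deliberately does not issue a request because the HTTP method and response format are not described here.

```python
from urllib.parse import quote

# Base URL taken from the SDK example on this page.
BASE_URL = "https://api.elevenlabs.io"


def preview_stream_url(generated_voice_id: str, base_url: str = BASE_URL) -> str:
    """Fill the :generated_voice_id placeholder in the stream endpoint path."""
    voice_id = quote(generated_voice_id, safe="")  # percent-encode any unsafe characters
    return f"{base_url.rstrip('/')}/v1/text-to-voice/{voice_id}/stream"
```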

should_enhance boolean Optional Defaults to false

Whether to enhance the voice description using AI to add more detail and improve voice generation quality. When enabled, the system automatically expands simple prompts into more detailed voice descriptions.

remixing_session_id string or null Optional

The ID of the remixing session.

remixing_session_iteration_id string or null Optional

The ID of the remixing session iteration to which these generations should be attached. If not provided, a new iteration is created.

quality double or null Optional -1 to 1

Higher quality results in better voice output but less variety.
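The numeric parameters described so far each have a documented range. A minimal client-side sketch of a range check follows; RANGES and check_ranges are hypothetical names, and the bounds are copied from the parameter listings above, treated as inclusive (an assumption).

```python
# Documented bounds for the numeric request parameters (assumed inclusive).
RANGES = {
    "loudness": (-1.0, 1.0),
    "seed": (0, 2147483647),
    "guidance_scale": (0.0, 100.0),
    "quality": (-1.0, 1.0),
}


def check_ranges(params: dict) -> dict:
    """Raise ValueError if any known numeric parameter is outside its range."""
    for name, value in params.items():
        if value is None or name not in RANGES:
            continue  # null values and unrelated fields are passed through
        low, high = RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"{name} must be between {low} and {high}, got {value}")
    return params
```

Returning the same dictionary lets the check be chained into whatever code assembles the request body.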

reference_audio_base64 string or null Optional

Reference audio to use for the voice generation. The audio should be base64 encoded. Only supported when using the eleven_ttv_v3 model.

prompt_strength double or null Optional 0 to 1

Controls the balance of prompt versus reference audio when generating voice samples. 0 means almost no prompt influence, 1 means almost no reference audio influence. Only supported when using the eleven_ttv_v3 model and providing reference audio.
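Since reference_audio_base64 and prompt_strength are only supported with the eleven_ttv_v3 model, a request builder might enforce that before encoding the audio. The following is a minimal sketch; reference_audio_fields is a hypothetical helper, and the accepted audio file types are not specified on this page.

```python
import base64
from typing import Optional


def reference_audio_fields(
    audio_bytes: bytes,
    model_id: str,
    prompt_strength: Optional[float] = None,
) -> dict:
    """Return the request fields for reference audio, base64-encoding the raw bytes."""
    if model_id != "eleven_ttv_v3":
        raise ValueError("reference audio is only supported with eleven_ttv_v3")
    fields = {
        "model_id": model_id,
        "reference_audio_base64": base64.b64encode(audio_bytes).decode("ascii"),
    }
    if prompt_strength is not None:
        if not 0 <= prompt_strength <= 1:
            raise ValueError("prompt_strength must be between 0 and 1")
        fields["prompt_strength"] = prompt_strength
    return fields
```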

Response

Successful Response

previews list of objects

The previews of the generated voices.

text string

The text used to preview the voices.

from elevenlabs import ElevenLabs

# Create a client that targets the public API.
client = ElevenLabs(
    base_url="https://api.elevenlabs.io"
)

# Request voice previews for a text description (20-1000 characters).
client.text_to_voice.design(
    voice_description="A sassy squeaky mouse with a playful and energetic tone, perfect for animated characters and lighthearted storytelling."
)

{
  "previews": [
    {
      "audio_base_64": "SUQzBAAAAAAAI1RTU0UAAAAPAAADTGF2ZjU2LjI0LjEwNQAAAAAAAAAAAAAA//tQxAADBQAAAQABAAgAZGF0YQAAAAA=",
      "generated_voice_id": "vce_9f8a7c3d2b4e4a1f9d6e7b8c",
      "media_type": "audio/mpeg",
      "duration_secs": 3.5,
      "language": "en-US"
    }
  ],
  "text": "Every act of kindness, no matter how small, carries value and can make a difference, as no gesture of goodwill is ever wasted."
}
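A response shaped like the example above can be turned into playable audio by decoding each preview's audio_base_64 field. This is a minimal sketch that works on a plain dictionary; decode_previews is a hypothetical helper, and it assumes every preview carries audio, which may not hold when stream_previews is true.

```python
import base64


def decode_previews(response: dict) -> dict:
    """Map each generated_voice_id to the decoded audio bytes of its preview."""
    return {
        preview["generated_voice_id"]: base64.b64decode(preview["audio_base_64"])
        for preview in response["previews"]
    }
```

The decoded bytes can then be written to a file in binary mode, and the generated_voice_id of the preferred preview used with the /v1/text-to-voice endpoint.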
