POST /v1/speech-to-speech/{voice_id}/stream
curl --request POST \
  --url https://api.elevenlabs.io/v1/speech-to-speech/{voice_id}/stream \
  --header 'Content-Type: multipart/form-data' \
  --form 'audio=<string>' \
  --form 'model_id=<string>' \
  --form 'voice_settings=<string>'
This response has no body data.
The Speech to Speech API is available upon request. To get access, contact sales.

Headers

xi-api-key
string

Your API key. This is required by most endpoints to access our API programmatically. You can view your xi-api-key using the 'Profile' tab on the website.

Path Parameters

voice_id
string
required

Voice ID to be used. You can use https://api.elevenlabs.io/v1/voices to list all the available voices.
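
As a rough illustration of looking up a voice ID, the sketch below prepares a GET request to the voices URL mentioned above and pulls IDs out of a decoded response. The response shape (a top-level "voices" list whose items carry a "voice_id" field) and the helper names are assumptions, not taken from this page.

```python
import urllib.request

VOICES_URL = "https://api.elevenlabs.io/v1/voices"


def build_voices_request(api_key: str) -> urllib.request.Request:
    """Prepare (but do not send) a GET request for the voice list."""
    return urllib.request.Request(
        VOICES_URL,
        headers={"xi-api-key": api_key},
        method="GET",
    )


def extract_voice_ids(payload: dict) -> list[str]:
    """Collect voice IDs from a decoded JSON response.

    Assumed shape: {"voices": [{"voice_id": "..."}, ...]}.
    """
    return [voice["voice_id"] for voice in payload.get("voices", [])]
```

To send the request, pass the result of build_voices_request to urllib.request.urlopen and decode the body with json.load before calling extract_voice_ids.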

Query Parameters

optimize_streaming_latency
integer
default: 0

You can turn on latency optimizations at some cost of quality. The best possible final latency varies by model. Possible values:

0 - default mode (no latency optimizations)
1 - normal latency optimizations (about 50% of the possible latency improvement of option 3)
2 - strong latency optimizations (about 75% of the possible latency improvement of option 3)
3 - max latency optimizations
4 - max latency optimizations, with the text normalizer also turned off for further latency savings (best latency, but can mispronounce e.g. numbers and dates)

Defaults to 0.
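
A minimal sketch of how the query parameter could be attached to the endpoint URL, rejecting values outside the documented 0 to 4 range before any request is made. The function name build_stream_url is hypothetical.

```python
from urllib.parse import quote, urlencode

BASE_URL = "https://api.elevenlabs.io/v1/speech-to-speech"


def build_stream_url(voice_id: str, optimize_streaming_latency: int = 0) -> str:
    """Return the streaming endpoint URL for a voice and latency mode."""
    if not isinstance(optimize_streaming_latency, int) or not (
        0 <= optimize_streaming_latency <= 4
    ):
        raise ValueError("optimize_streaming_latency must be an integer from 0 to 4")
    query = urlencode({"optimize_streaming_latency": optimize_streaming_latency})
    # Percent-encode the voice ID so it is safe to place in the path.
    return f"{BASE_URL}/{quote(voice_id, safe='')}/stream?{query}"
```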

Body

multipart/form-data
audio
string
required

The audio file which holds the content and emotion that will control the generated speech.
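
The sketch below shows one way to build the multipart/form-data body by hand with only the standard library, assuming the audio field is sent as a file part alongside the plain text fields. The default file name, the audio/mpeg MIME type, and the helper name encode_multipart are illustrative assumptions.

```python
import uuid


def encode_multipart(
    fields: dict,
    audio_bytes: bytes,
    filename: str = "input.mp3",
    audio_mime: str = "audio/mpeg",
) -> tuple[bytes, str]:
    """Return (body, content_type) for a multipart/form-data request."""
    boundary = uuid.uuid4().hex
    parts: list[bytes] = []
    # Plain text fields such as model_id or voice_settings.
    for name, value in fields.items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            str(value).encode(),
        ]
    # The audio file part, followed by the closing boundary.
    parts += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="audio"; filename="{filename}"'.encode(),
        f"Content-Type: {audio_mime}".encode(),
        b"",
        audio_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"
```

The returned content type carries the generated boundary, so it should be sent as the Content-Type header together with the body.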

model_id
string
default: eleven_english_sts_v2

Identifier of the model that will be used. You can query the available models using GET /v1/models. The model needs to support speech to speech; you can check this using the can_do_voice_conversion property.
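
As an illustration of the can_do_voice_conversion check, the sketch below filters a decoded model list down to the IDs that support speech to speech. It assumes GET /v1/models returns a list of objects that each carry model_id and can_do_voice_conversion; the sample data is made up, and example_tts_only_model is a hypothetical ID.

```python
def voice_conversion_models(models: list[dict]) -> list[str]:
    """Return the IDs of models flagged as supporting speech to speech."""
    return [m["model_id"] for m in models if m.get("can_do_voice_conversion")]


# Hypothetical sample standing in for a decoded GET /v1/models response.
SAMPLE_MODELS = [
    {"model_id": "eleven_english_sts_v2", "can_do_voice_conversion": True},
    {"model_id": "example_tts_only_model", "can_do_voice_conversion": False},
]

usable = voice_conversion_models(SAMPLE_MODELS)
```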

voice_settings
string

Voice settings overriding the stored settings for the given voice. They are applied only to the given request. Must be sent as a JSON-encoded string.
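
A minimal sketch of preparing the text form fields, with voice_settings serialized to a JSON string as required. The setting names used in the test data (stability, similarity_boost) are assumptions not listed on this page, and build_form_fields is a hypothetical helper.

```python
import json
from typing import Optional


def build_form_fields(
    model_id: str = "eleven_english_sts_v2",
    voice_settings: Optional[dict] = None,
) -> dict:
    """Return the text form fields for the request (the audio part is separate)."""
    fields = {"model_id": model_id}
    if voice_settings is not None:
        # The API expects a JSON-encoded string here, not a nested form object.
        fields["voice_settings"] = json.dumps(voice_settings, separators=(",", ":"))
    return fields
```

Omitting voice_settings leaves the field out entirely, so the settings stored for the voice apply.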

