Keyterm prompting
Overview
Keyterm prompting is only available with the Scribe v2 model and comes at an additional cost. See the API pricing page for detailed pricing information.
Keyterm prompting lets you highlight up to 100 words or phrases to bias the model towards transcribing them. This is useful for specific words or phrases that are uncommon, such as product names, personal names, or other specialized terms. Keyterms are more powerful than the biased keywords or custom vocabularies offered by other models, because the model relies on context to decide whether to transcribe a term or not.
For example, if your company name is not a common phrase, or has a unique spelling or pronunciation, you can use keyterms to ensure the model transcribes it correctly. Take the following audio:
Without keyterm prompting, the model might transcribe the above as:
That transcription uses the wrong style for the company name. With keyterm prompting, you can ensure the model transcribes the above with the correct spelling and style:
Context
The model is able to use context to determine whether a term should be transcribed or not. When providing the keyterm “ElevenLabs”, the above audio transcribes as expected, yet the model will still be able to transcribe the following correctly based on the context:
This produces the following transcription:
Integrating keyterm prompting
To use keyterm prompting with the Speech to Text API, pass the keyterms parameter to the convert method.
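As a rough illustration, the sketch below shows one way this call might look in Python. It makes several assumptions that are not stated on this page: that `client` is an instance of the ElevenLabs Python SDK client exposing `speech_to_text.convert`, that the method accepts `file` and `model_id` arguments, and that `"scribe_v2"` is the identifier for the Scribe v2 model. The `prepare_keyterms` and `transcribe_with_keyterms` helpers are hypothetical names introduced here; the first only enforces the 100-term limit described above before the request is sent.

```python
MAX_KEYTERMS = 100  # limit stated in the overview above


def prepare_keyterms(terms):
    """Hypothetical helper: tidy a list of keyterms before sending it.

    Strips surrounding whitespace, drops empty strings and exact duplicates
    (keeping the first occurrence), and rejects lists over the limit.
    """
    seen = set()
    cleaned = []
    for term in terms:
        term = term.strip()
        if term and term not in seen:
            seen.add(term)
            cleaned.append(term)
    if len(cleaned) > MAX_KEYTERMS:
        raise ValueError(
            f"Got {len(cleaned)} keyterms; at most {MAX_KEYTERMS} are allowed."
        )
    return cleaned


def transcribe_with_keyterms(client, audio_file, terms):
    """Hypothetical wrapper around the convert method.

    Assumptions: `client` exposes `speech_to_text.convert`, which accepts
    `file`, `model_id`, and `keyterms`; "scribe_v2" is an assumed model ID.
    """
    return client.speech_to_text.convert(
        file=audio_file,
        model_id="scribe_v2",  # assumed identifier for Scribe v2
        keyterms=prepare_keyterms(terms),
    )
```

With a real client, the call might look like `transcribe_with_keyterms(client, open("meeting.mp3", "rb"), ["ElevenLabs"])`, where the file name is a placeholder. Validating the list locally is a design choice rather than a requirement: it surfaces an over-long list before any audio is uploaded.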