Entity detection
Overview
Entity detection comes at an additional cost. See the API pricing page for detailed pricing information.
Entity detection identifies specific words and phrases in the transcript, such as credit card numbers, names, medical conditions, or Social Security numbers (SSNs), and returns their exact timestamps so that they can be redacted.
When enabled, the model detects only the entity types you request.
For example, when "pii" (personally identifiable information) is specified as the entity type, the result for an audio clip contains the transcript together with the PII entities detected in it and their exact timestamps.
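To illustrate how the timestamps can drive redaction, here is a minimal sketch. It assumes the transcript is available as a list of words with start and end times in seconds and that each detected entity carries a type label plus its own start and end times. The Word and Entity classes, their field names, and the "credit_card" label are hypothetical and are not taken from the API response format.

```python
from dataclasses import dataclass


@dataclass
class Word:
    text: str
    start: float  # seconds from the beginning of the audio (assumed unit)
    end: float


@dataclass
class Entity:
    entity_type: str  # hypothetical label, e.g. "credit_card"
    start: float
    end: float


def overlaps(word: Word, entity: Entity) -> bool:
    """Return True when the two time intervals share any duration."""
    return word.start < entity.end and entity.start < word.end


def redact(words: list[Word], entities: list[Entity]) -> str:
    """Replace every word that overlaps a detected entity with a placeholder."""
    out: list[str] = []
    for word in words:
        hit = next((e for e in entities if overlaps(word, e)), None)
        token = f"[{hit.entity_type.upper()}]" if hit else word.text
        # Collapse consecutive placeholders of the same type into one.
        if hit and out and out[-1] == token:
            continue
        out.append(token)
    return " ".join(out)


words = [
    Word("my", 0.0, 0.2),
    Word("card", 0.2, 0.5),
    Word("is", 0.5, 0.6),
    Word("4242", 0.6, 1.0),
    Word("4242", 1.0, 1.4),
]
entities = [Entity("credit_card", 0.6, 1.4)]
redacted = redact(words, entities)
```

The same overlap test could be used to mute the matching audio ranges instead of replacing text, since it depends only on the time intervals.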
Integrating entity detection
To enable entity detection in the Speech to Text API, pass the entity_detection parameter to the convert method.
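The call might be wrapped as in the sketch below. Only the convert method and the entity_detection parameter come from this page. The file argument name, the wrapper function, and the RecordingClient stand-in (which records arguments instead of contacting the API) are assumptions made for illustration, so check the SDK reference for the real signature.

```python
from typing import Any, Protocol


class SpeechToTextClient(Protocol):
    """Anything exposing a convert method that accepts keyword arguments."""

    def convert(self, **kwargs: Any) -> Any: ...


def transcribe_with_entities(
    client: SpeechToTextClient,
    audio: Any,
    entity_detection: str = "pii",
) -> Any:
    # `file` is an assumed name for the audio argument; only
    # `entity_detection` and `convert` are named in the documentation.
    return client.convert(file=audio, entity_detection=entity_detection)


class RecordingClient:
    """Hypothetical stand-in that records arguments instead of calling the API."""

    def __init__(self) -> None:
        self.calls: list[dict[str, Any]] = []

    def convert(self, **kwargs: Any) -> dict[str, Any]:
        self.calls.append(kwargs)
        return {"text": "", "entities": []}  # placeholder, not the real response shape
```

Keeping the client behind a small protocol makes it possible to exercise the calling code in tests without incurring the additional cost of entity detection.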
Entity types
The following entity types are supported for detection. You can detect entire groups using pii, phi, or pci, or specify individual entity types by their label. To detect all entity types, use the all category.
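As a rough sketch of how a selection could be prepared before it is sent, the helper below accepts a single label or a list of labels, removes duplicates, and collapses the selection to the all category when it is present. The helper name and the idea that several labels are passed as a list are assumptions. Only the pii, phi, pci, and all names come from the text above.

```python
from __future__ import annotations

ALL_CATEGORY = "all"  # category name taken from the text above


def normalize_selection(selection: str | list[str]) -> list[str]:
    """Deduplicate a selection of entity labels, preserving order.

    Hypothetical helper: whether the API accepts a list of labels is an
    assumption, so adapt the return value to the real parameter format.
    """
    items = [selection] if isinstance(selection, str) else list(selection)
    cleaned: list[str] = []
    for item in items:
        label = item.strip()
        if not label:
            raise ValueError("entity labels must be non-empty strings")
        if label not in cleaned:
            cleaned.append(label)
    # "all" already covers every group and individual label.
    if ALL_CATEGORY in cleaned:
        return [ALL_CATEGORY]
    return cleaned
```

For instance, normalize_selection(["pii", "pci", "pii"]) keeps one copy of each group, and any selection containing "all" reduces to that single category.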