Speech-to-text transcribes the words spoken in audio into text.
| Configuration Parameter | Description |
|---|---|
| CustomLanguageModel | The identifier and interpolation weight of each custom language model to use. |
| FilterMusic | Specifies whether to include speech-to-text results for audio segments identified as music or noise. |
| Input | The audio track to process. |
| LanguagePack | The language pack to use. |
| NumParallel | The maximum number of audio segments to process concurrently. |
| SampleFrequency | The sample frequency of the audio to send to the audio service. |
| SpeedBias | Specifies whether to prioritize processing accuracy or speed. |
| Type | The analysis engine to use. Set this parameter to SpeechToText. |
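
The parameters above could be combined into a single task section along the following lines. This is a minimal sketch, assuming an INI-style configuration file with one section per task. The section name, the track name, the language pack identifier, and all numeric values are illustrative assumptions rather than documented defaults. The `//` comment lines are annotations for the reader; remove them if your configuration format does not accept them.

```ini
// Hypothetical section name; any label for the task would do.
[SpeechToText]
// Required: selects the speech-to-text analysis engine.
Type=SpeechToText
// Placeholder track name; replace it with the audio track you want to process.
Input=MyAudioTrack
// Assumed language pack identifier; use one that is installed on your system.
LanguagePack=ENUK
// Assumed values; check the parameter reference for accepted ranges and defaults.
NumParallel=2
SampleFrequency=16000
SpeedBias=3
```

Only `Type=SpeechToText` is taken directly from the table. `CustomLanguageModel` and `FilterMusic` are omitted here because their value formats are not described in this section.
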

| Output track | Type | Description |
|---|---|---|
| Result | SpeechToTextResult | Contains a record for each word. Each record has the fields listed below. |

| Field name | Type | Description |
|---|---|---|
| id | UUID | A universally unique identifier to identify the section of audio described by the record. |
| text | TextData | The spoken word converted to text. |
| confidence | Int | The confidence score for the speech-to-text process. |