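Before running speech-to-text, you can assess whether a language pack, optionally combined with a custom language model, is suitable for processing your audio. You can check whether words are present in the vocabulary, and you can measure the perplexity of the language model.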
To check whether words are present in the vocabulary
Use the AssessSpeechLanguageModel action. Media Server returns statistics and information about unknown words.
You can also use the QuerySpeechLanguageModel action to check whether the words are present in the vocabulary.
To measure perplexity for a language model
Use the AssessSpeechLanguageModel action. Media Server returns a perplexity value. A lower perplexity value is generally better. Perplexity values around or below 100 are acceptable for processing call center conversations, and values around or below 250 are acceptable for television news or broadcast audio. If the AssessSpeechLanguageModel action returns a perplexity value that is much higher than these guidelines, consider training a custom language model.
For more information about the AssessSpeechLanguageModel and QuerySpeechLanguageModel actions, refer to the Media Server Reference.
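To give a feel for what a perplexity value means, the following is a minimal sketch that uses the standard definition of perplexity (the exponential of the average negative log probability per word). It is an illustration only: this topic does not state how Media Server computes the value, and the function names, the dictionary keys, and the use of a hard cutoff are assumptions made for the example. Only the guideline values 100 and 250 come from the text above.

```python
import math

# Guideline values quoted above. The keys are hypothetical labels for this sketch.
ACCEPTABLE_PERPLEXITY = {"call_center": 100.0, "broadcast_news": 250.0}


def perplexity(word_probabilities):
    """Standard perplexity: exp of the mean negative log probability per word.

    word_probabilities holds the probability that a language model assigned
    to each word of a transcript (assumed to be available for illustration).
    """
    if not word_probabilities:
        raise ValueError("need at least one word probability")
    total_log_prob = sum(math.log(p) for p in word_probabilities)
    return math.exp(-total_log_prob / len(word_probabilities))


def is_acceptable(value, audio_type):
    """Compare a perplexity value with the guideline for the audio type.

    The guidance says "around or below", so treating the guideline as a
    strict cutoff is a simplification.
    """
    return value <= ACCEPTABLE_PERPLEXITY[audio_type]


# A model that assigns every word a probability of 1 in 100 has a
# perplexity of about 100: it is as uncertain as a uniform choice
# among 100 words.
print(round(perplexity([0.01] * 5)))
print(is_acceptable(180.0, "call_center"))     # above the call center guideline
print(is_acceptable(180.0, "broadcast_news"))  # within the broadcast guideline
```

Read this way, a perplexity of 250 means the model is, on average, about as uncertain as if it were choosing uniformly among 250 words at each position, which is why lower values generally indicate a better match between the language model and the audio.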