ModelVersion

The model to use to convert speech into text.

  • The micro model is the fastest. Use this model if you want to prioritize speed over accuracy, or if you need to process live streams without GPU acceleration.
  • The small model provides a good balance between accuracy and performance for English speech.
  • The medium model provides a significant increase in accuracy for non-English speech and a modest increase in accuracy for English, at the cost of greater memory requirements and longer processing times.
  • The large model provides maximum accuracy for non-English speech. This model is not available for English. If you specify English (for example, by setting LanguagePack to ENUK) together with the large model, Media Server uses the medium model instead.
Type: String
Default:  
Required: Yes
Configuration Section: TaskName
Example: ModelVersion=small
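
To show where the parameter sits, here is a minimal sketch of a task section. The section name MySpeechToText is hypothetical, and the Type value is an assumption about how the speech-to-text task is declared. Only ModelVersion and LanguagePack come from this page.

```
// Hypothetical task section; the name must match a task defined in your configuration.
[MySpeechToText]
// Assumed task type for speech-to-text analysis.
Type=SpeechToText
// English language pack, as named in the description above.
LanguagePack=ENUK
// The small model balances accuracy and performance for English speech.
ModelVersion=small
```

With an English language pack such as ENUK, setting ModelVersion=large would not select the large model, because Media Server uses the medium model in that case.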