New Russian speech recognition models for the Vosk library

17 Nov 2021 6:09 am GMT+0000

The developers of the Vosk library have published new models for Russian speech recognition: the server model vosk-model-ru-0.22 and the mobile model vosk-model-small-ru-0.22. The models use new speech data and a new neural network architecture, which increased recognition accuracy by 10-20%. The code and data are distributed under the Apache 2.0 license.

Important changes:

New data collected from smart speakers significantly improves the recognition of voice commands spoken from a distance.
A new audio feature extraction scheme significantly improves recognition accuracy for wideband recordings. The accuracy of telephony recognition has improved as well.
A package for extending the dictionary lets you adapt recognition to complex technical recordings.

For best accuracy, it is recommended to update Vosk to version 0.3.32. Other new Vosk features may also be of interest: integration with Unity, NativeScript, and Jigasi, and models for recognizing Kazakh and Ukrainian. The server model requires a modern processor and 8 GB of memory. The mobile model can be used on phones and on the Raspberry Pi 3+.
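
As an illustration of how one of these models might be used, here is a minimal Python sketch that transcribes a WAV file with the Vosk Python bindings. It assumes the `vosk` package is installed (for example with `pip install vosk`) and that the model archive has already been downloaded and unpacked into a local directory. The function names `check_wav` and `transcribe` are chosen for this example and are not part of Vosk. The file is also assumed to be 16-bit mono PCM, which the sketch verifies before handing any audio to the recognizer.

```python
import json
import wave


def check_wav(path):
    """Return the sample rate of a 16-bit mono PCM WAV file.

    Raises ValueError for any other layout, since this sketch
    assumes that format for the recognizer input.
    """
    with wave.open(path, "rb") as wf:
        if (wf.getnchannels() != 1
                or wf.getsampwidth() != 2
                or wf.getcomptype() != "NONE"):
            raise ValueError("expected a 16-bit mono PCM WAV file")
        return wf.getframerate()


def transcribe(wav_path, model_dir):
    """Transcribe wav_path using the unpacked Vosk model in model_dir."""
    # Imported here so check_wav stays usable without vosk installed.
    from vosk import Model, KaldiRecognizer

    sample_rate = check_wav(wav_path)
    model = Model(model_dir)
    recognizer = KaldiRecognizer(model, sample_rate)

    pieces = []
    with wave.open(wav_path, "rb") as wf:
        while True:
            data = wf.readframes(4000)
            if not data:
                break
            # AcceptWaveform returns True when an utterance is complete.
            if recognizer.AcceptWaveform(data):
                pieces.append(json.loads(recognizer.Result())["text"])
    # Flush whatever is still buffered in the recognizer.
    pieces.append(json.loads(recognizer.FinalResult())["text"])
    return " ".join(p for p in pieces if p)
```

A call such as `transcribe("command.wav", "vosk-model-small-ru-0.22")` would then return the recognized text as a single string. Both the file name and the directory name here are placeholders for a local recording and wherever the model was unpacked.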
