MnASR: A Free Speech Corpus for Mongolian Speech Recognition and Accompanied Baselines

Yihao Wu, Yonghe Wang, Hui Zhang, Feilong Bao, Guanglai Gao

2022 25th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA) (2022)

Abstract

Thanks to the development of deep learning and the emergence of open-source datasets, automatic speech recognition (ASR) has made great strides in mainstream languages such as Chinese and English. However, research on ASR for Mongolian and other minority languages lags far behind, owing to limited attention and a shortage of open-source datasets. To promote the development of new models and methods for Mongolian ASR, this paper releases the MnASR database, which contains 345 hours of Mongolian speech with corresponding transcriptions. MnASR is the largest free, publicly available Mongolian speech database to date. Speech recognition baselines are released alongside it. Both the database and the accompanying baselines are free for research purposes.

Keywords

Speech Recognition, Mongolian Dataset, Open Data
