Deploying a Speech Recognition Model for Under-Resourced Languages: A Case Study on Dioula Wake Words 1, 2, 3, and 4

Ismaila Ouedraogo, Borlli Michel Jonas Some, Zakaria Cheick Oumar Keita, Emile Nabaloum, Fabrice Bationo, Roland Benedikter, Gayo Diallo

PROCEEDINGS OF 2023 7TH INTERNATIONAL CONFERENCE ON NATURAL LANGUAGE PROCESSING AND INFORMATION RETRIEVAL, NLPIR 2023 (2023)

Abstract
Speech recognition technology could provide valuable information and services to the 12.5 million Dioula speakers, especially those who cannot read or write. However, the people who could benefit most often lack access to this technology because few datasets exist for under-resourced languages. This paper investigates the effectiveness of data augmentation in training a model to recognize the Dioula wake words 1, 2, 3, and 4. The study makes two major contributions: the release of a labeled Dioula corpus for the wake words 1, 2, 3, and 4, comprising 1.4 hours of audio, and a speech recognition model for these wake words trained with data augmentation, which improved accuracy from 51% to 96%. The confusion matrices illustrate the model's improved predictive capacity: after data augmentation, an average of 1762 out of 1817 instances of the number "1" were correctly recognized. Loss also fell from 205% to 14% after data augmentation. These results underscore the role of data augmentation in improving model performance and mitigating overfitting, and show the promise of this technique for addressing data scarcity in underrepresented speech contexts. A model trained to detect specific wake words, such as "1," "2," "3," and "4" in Dioula, can be highly valuable for building interactive voice response systems, fostering greater inclusivity and accessibility for underserved communities.
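The abstract does not say which augmentation techniques were applied, so the following is only a minimal sketch of what waveform-level data augmentation for short wake-word clips can look like. It assumes three common transforms (random time shift, random gain, additive Gaussian noise at a fixed signal-to-noise ratio). The function names, the parameter ranges, and the 20 dB noise level are illustrative assumptions, not values taken from the paper.

```python
import math
import random


def time_shift(wave, shift):
    """Shift a clip by `shift` samples, zero-padding the gap (length is preserved)."""
    n = len(wave)
    if shift > 0:
        return [0.0] * shift + wave[:n - shift]
    if shift < 0:
        return wave[-shift:] + [0.0] * (-shift)
    return list(wave)


def scale_gain(wave, gain):
    """Multiply every sample by a constant gain factor."""
    return [gain * x for x in wave]


def add_noise(wave, snr_db, rng):
    """Add white Gaussian noise so the result has roughly the given SNR in dB."""
    signal_power = sum(x * x for x in wave) / len(wave)
    noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, noise_std) for x in wave]


def augment(wave, n_copies, seed=0):
    """Return `n_copies` randomly perturbed variants of one recording.

    All ranges below are assumed values chosen for illustration only.
    """
    rng = random.Random(seed)
    max_shift = len(wave) // 10  # assumption: shift by at most 10% of the clip
    copies = []
    for _ in range(n_copies):
        w = time_shift(wave, rng.randint(-max_shift, max_shift))
        w = scale_gain(w, rng.uniform(0.7, 1.3))  # assumption: +/-30% gain
        w = add_noise(w, 20.0, rng)               # assumption: 20 dB SNR
        copies.append(w)
    return copies


# Usage: a synthetic 0.1 s, 440 Hz tone at 16 kHz stands in for a real recording.
sample_rate = 16000
clip = [0.5 * math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(1600)]
variants = augment(clip, n_copies=4)
```

Each variant keeps the original label and clip length, so a small labeled corpus can be expanded before training. A real pipeline would more likely operate on NumPy arrays or spectrograms and might add transforms such as pitch or speed perturbation.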
Keywords
Dioula language, voice recognition, user interface, under-resourced languages