A Comparison of LSTM and GRU for Bengali Speech-to-Text Transformation

Nusrat Jahan, Zakia Sultana, Fahim Chowdhury, Sajjad Ahmed, Mohammad Zavid Parvez, Prabal Datta Barua, Subrata Chakraborty

Proceedings of the 2023 International Conference on Advances in Computing Research (ACR'23), 2023

Abstract
This paper presents an approach to speech-to-text conversion for the Bengali language. Most existing work in this area targets languages other than Bengali. We first prepared a novel dataset of 56 unique words recorded by 160 individual subjects. We then describe our approach to improving speech-to-text accuracy for Bengali, initially evaluating both Gated Recurrent Unit (GRU) and Long Short-Term Memory (LSTM) algorithms. During further observation, we found that the GRU failed to produce stable output, so we moved entirely to the LSTM algorithm, achieving 90% accuracy on an unseen dataset. Voices from several demographic populations, with added noise, were used to validate the model. In the testing phase, we evaluated a variety of classes varying in word length, complexity, noise, and speaker gender. We expect this research to help in developing a real-time Bengali speech-to-text recognition model.
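The abstract's core mechanism, recognizing an isolated spoken word with an LSTM, can be sketched as follows. This is an illustrative toy, not the authors' model: the feature dimension, hidden size, random weights, and fixed 56-word vocabulary are all assumptions, and a real system would train the weights on MFCC features extracted from the recorded audio.

```python
import numpy as np

# Illustrative sketch (not the authors' implementation): one LSTM cell
# unrolled over a sequence of MFCC-like acoustic frames, with a softmax
# over a hypothetical 56-word Bengali vocabulary. All weights are random.

rng = np.random.default_rng(0)

n_mfcc, hidden, n_words = 13, 32, 56  # feature dim, LSTM units, vocabulary size (assumed)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Gate weights: input (i), forget (f), candidate cell (c), output (o)
W = {g: rng.standard_normal((hidden, n_mfcc + hidden)) * 0.1 for g in "ifco"}
b = {g: np.zeros(hidden) for g in "ifco"}
W_out = rng.standard_normal((n_words, hidden)) * 0.1

def lstm_classify(frames):
    """frames: (T, n_mfcc) array of acoustic feature vectors for one utterance."""
    h = np.zeros(hidden)
    c = np.zeros(hidden)
    for x in frames:
        z = np.concatenate([x, h])
        i = sigmoid(W["i"] @ z + b["i"])  # input gate: how much new info to admit
        f = sigmoid(W["f"] @ z + b["f"])  # forget gate: how much old state to keep
        g = np.tanh(W["c"] @ z + b["c"])  # candidate cell state
        o = sigmoid(W["o"] @ z + b["o"])  # output gate
        c = f * c + i * g                 # cell state carries long-term context
        h = o * np.tanh(c)
    logits = W_out @ h                    # classify from the final hidden state
    p = np.exp(logits - logits.max())
    return p / p.sum()                    # probability distribution over the 56 words

# Fake utterance: 40 frames of random "features" standing in for real MFCCs
probs = lstm_classify(rng.standard_normal((40, n_mfcc)))
print(probs.shape)
```

The extra cell state `c` is what distinguishes the LSTM from the GRU's single hidden state, and is one plausible reason the authors found LSTM outputs more stable on their data.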
Keywords
Long Short-Term Memory, Natural Language Processing, Gated Recurrent Unit, Voice Recognition, Speech to Text