Environmental Noise Robustness for Korean Fricatives Using Speech Enhancement Generative Adversarial Networks.

BigComp (2019)

Abstract
Speech recognition currently achieves high recognition rates in quiet conditions. Environmental noise lowers the recognition rate, however, so noise processing is needed. Korean fricatives such as [s], [h], [ç], and [x] are particularly difficult to process because their acoustic characteristics resemble those of environmental noise. This paper proposes enhancing Korean fricative speech with the Speech Enhancement Generative Adversarial Network (SEGAN) to improve robustness to environmental noise. SEGAN is a Deep Neural Network (DNN)-based generative model used to improve the clarity and quality of speech corrupted by environmental noise. Enhanced Korean fricative speech data are generated using a SEGAN trained on Korean fricative speech data. Adding the enhanced speech to the training data of a DNN-Hidden Markov Model (HMM)-based acoustic model yielded a 0.26% difference in Character Error Rate (CER) under environmental noise conditions compared with training without the enhanced speech.
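The abstract reports its result as a Character Error Rate. As a rough illustration of how such a metric is commonly computed, the sketch below divides the character-level Levenshtein distance between a reference transcript and a recognizer hypothesis by the reference length. The function names `edit_distance` and `character_error_rate` are hypothetical, and the paper's own scoring unit (for example Hangul syllable blocks versus decomposed jamo) is not stated in the abstract, so this should be read as an assumption rather than the authors' procedure.

```python
def edit_distance(ref: str, hyp: str) -> int:
    """Levenshtein distance between two strings, counted in characters."""
    # prev[j] holds the distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference characters so far
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ref: str, hyp: str) -> float:
    """CER = (substitutions + deletions + insertions) / reference length."""
    if not ref:
        raise ValueError("reference transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


# Assumption: each Hangul syllable block counts as one character.
print(character_error_rate("소리", "소라"))
```

Here one of the two reference characters is substituted, so the printed value is 0.5. Because insertions count as errors, a CER computed this way can exceed 1.0 when the hypothesis is much longer than the reference.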
Keywords
Speech enhancement,Hidden Markov models,Speech recognition,Noise measurement,Working environment noise,Data models