Intelligibility Improvement of Dysarthric Speech using MMSE DiscoGAN

2020 International Conference on Signal Processing and Communications (SPCOM) (2020)

Abstract
Dysarthria is a manifestation of a disorder in the articulators used during speech production, which results in uneven, slow, slurred, or monotone speech, or speech with an abnormal rhythm. People with dysarthria therefore produce less intelligible speech. Improving the intelligibility of dysarthric speech is challenging because, unlike normal speech, little dysarthric speech data is available. Dysarthric speech and normal speech are known to differ from the speech production and speech perception perspectives. Recently, Generative Adversarial Network (GAN)-based architectures have become popular for learning such cross-domain relationships efficiently. In this paper, we propose to use Discover GAN (DiscoGAN) with Mean Square Error (MSE) regularization (i.e., MMSE DiscoGAN) for dysarthric-to-normal speech conversion. In particular, a direct feature-based mapping technique is used to train all the models. Finally, we use Automatic Speech Recognition (ASR) to measure the Phoneme Error Rate (PER) for a particular speaker. The proposed method is compared with a baseline Deep Neural Network (DNN)-based system. Both architectures were trained and evaluated on the UA corpus. The results show that MMSE DiscoGAN outperforms the DNN by 13.16% and 9.64% for male and female speakers, respectively. Moreover, the proposed GAN-based frameworks efficiently improve the intelligibility of dysarthric speech and generate more natural-sounding speech than the DNN-based models.
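The abstract reports results as Phoneme Error Rate but does not say how it is computed. As a rough illustration only, the sketch below uses the conventional definition: the Levenshtein (edit) distance between the reference phoneme sequence and the ASR output, divided by the reference length. The function name `phoneme_error_rate` and the example phoneme labels are assumptions made for this sketch and are not taken from the paper, whose scoring setup may differ.

```python
from typing import Sequence


def phoneme_error_rate(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Edit distance between two phoneme sequences, divided by the reference length.

    Hypothetical helper for illustration; not the paper's evaluation code.
    """
    if not reference:
        raise ValueError("reference must contain at least one phoneme")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and hypothesis[:j]; it starts as the cost of j insertions.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_ph in enumerate(reference, start=1):
        curr = [i]  # aligning reference[:i] with an empty hypothesis: i deletions
        for j, hyp_ph in enumerate(hypothesis, start=1):
            cost = 0 if ref_ph == hyp_ph else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(reference)


if __name__ == "__main__":
    ref = ["k", "ae", "t"]   # assumed reference transcription
    hyp = ["k", "ah", "t"]   # assumed ASR output with one substitution
    print(f"PER = {phoneme_error_rate(ref, hyp):.2%}")
```

Under this definition, a lower PER on converted speech than on the original dysarthric speech would indicate improved intelligibility as judged by the recognizer. PER can exceed 100% when the hypothesis contains many insertions.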
Keywords
MMSE DiscoGAN, monotone speech, intelligible speech, speech production-perception perspectives, automatic speech recognition, natural sounding speech, dysarthric-to-normal speech conversion, articulatory disordering, speech production, abnormal rhythm, dysarthria, intelligibility improvement, generative adversarial network, cross-domain relationships, discover GAN, mean square error regularization, direct feature-based mapping technique, phoneme error rate