A Weighted Speaker-Specific Confusion Transducer-Based Augmentative and Alternative Speech Communication Aid for Dysarthric Speakers.

IEEE Transactions on Neural Systems and Rehabilitation Engineering: a publication of the IEEE Engineering in Medicine and Biology Society (2019)

Abstract
An augmentative and alternative speech communication (AASC) aid comprises a speech recognition system and a speech synthesis system. The main challenge in developing such an aid for dysarthric speakers lies in handling errors in the text produced by the recognition system. These errors (substitutions, deletions, and insertions) may arise because a dysarthric speaker is unable to utter certain phones (articulatory error) or because the trained models are inaccurate (modeling error). Most existing AASC approaches focus only on articulatory errors, and those that address both kinds of error do not differentiate between them. The current work, in contrast, performs a three-level cascaded analysis to identify and distinguish between the two, since telling them apart helps in handling each appropriately. Further, the analyses are independent of the syntax of the utterances. Based on the analyses, weighted phone confusion transducers are formulated and used to correct erroneous text from the recognition system. The corrected text is finally synthesized by a text-to-speech synthesis system. The proposed AASC aid is observed to significantly reduce the word error rate (WER) for speakers with severe dysarthria from 100% to 41.52%, for moderate dysarthria from 61.85% to 18.08%, and for mild dysarthria from 12.23% to 8.55%.
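The idea of correcting recognized text with speaker-specific phone confusion weights can be illustrated with a small sketch. This is not the paper's transducer formulation: it substitutes a plain weighted edit distance over phone sequences for a weighted finite-state transducer, and the function names, the `EPS` marker, the toy lexicon, and the confusion cost of 0.1 are all hypothetical values chosen for illustration. A low cost for a pair (recognized phone, intended phone) stands in for a confusion that is frequent for a given speaker.

```python
from typing import Dict, List, Sequence, Tuple

EPS = "<eps>"  # hypothetical marker for an inserted or deleted phone

Costs = Dict[Tuple[str, str], float]


def weighted_distance(hyp: Sequence[str], ref: Sequence[str],
                      cost: Costs, default: float = 1.0) -> float:
    """Weighted edit distance from recognized phones (hyp) to a
    lexicon pronunciation (ref). cost[(recognized, intended)] is low
    for confusions assumed to be frequent for this speaker."""
    def c(a: str, b: str) -> float:
        if a == b:
            return 0.0
        return cost.get((a, b), default)

    n, m = len(hyp), len(ref)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + c(hyp[i - 1], EPS)  # extra recognized phone
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + c(EPS, ref[j - 1])  # intended phone missing
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d[i][j] = min(
                d[i - 1][j - 1] + c(hyp[i - 1], ref[j - 1]),  # match/substitution
                d[i - 1][j] + c(hyp[i - 1], EPS),             # insertion
                d[i][j - 1] + c(EPS, ref[j - 1]),             # deletion
            )
    return d[n][m]


def correct(hyp: Sequence[str], lexicon: Dict[str, List[str]],
            cost: Costs) -> str:
    """Return the lexicon word whose pronunciation is cheapest to reach."""
    return min(lexicon, key=lambda w: weighted_distance(hyp, lexicon[w], cost))


# Toy data: assume this speaker often produces "t" where "k" is intended.
toy_cost: Costs = {("t", "k"): 0.1}
toy_lexicon = {"bat": ["b", "ae", "t"], "cat": ["k", "ae", "t"]}
recognized = ["t", "ae", "t"]
best = correct(recognized, toy_lexicon, toy_cost)
```

With uniform costs, "bat" and "cat" are equally distant from the recognized sequence; the speaker-specific weight is what tips the choice toward "cat". A transducer-based system would typically encode the same costs as arc weights and find the best path through a composition with the lexicon.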
Keywords
Speech recognition, Hidden Markov models, Transducers, Text recognition, Syntactics, Acoustics, Analytical models