A Multi-modal Soft Targets Approach for Pronunciation Erroneous Tendency Detection

2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2018

Abstract
Detecting pronunciation erroneous tendency (PET) can provide detailed, instructive feedback for second language learners in computer-aided pronunciation training (CAPT). In this paper, we use soft targets that carry knowledge from various models to improve PET detection performance. First, we examine the effectiveness of soft targets in three single systems by directly replacing hard targets with soft targets for mispronunciation detection. Then, we propose two methods that use multi-modal soft targets: 1) explicit combination, which uses a weighted linear combination of multi-modal soft targets as the final targets; 2) implicit combination, which employs a multi-task framework to combine soft targets. Experimental results show that PET detection performance can be improved by using both single soft targets and multi-modal soft targets. Moreover, using multi-modal soft targets within a multi-task framework achieves the best results in the pronunciation error detection task, and it is more efficient than conventional ensemble methods, which require multiple decoding runs or forward passes.
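The two combination strategies named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes that each soft target is a per-frame posterior distribution produced by a separate teacher model, that the explicit combination normalizes its weights to sum to one, and that the implicit (multi-task) combination sums one cross-entropy term per soft target over separate output heads of a shared student. The function names (`explicit_combination`, `implicit_combination_loss`) and all weights are hypothetical.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def cross_entropy(targets, probs, eps=1e-12):
    """Mean per-frame cross-entropy between target and predicted distributions."""
    return float(-(targets * np.log(probs + eps)).sum(axis=-1).mean())


def explicit_combination(soft_targets, weights):
    """Weighted linear combination of several soft targets into one target.

    soft_targets: list of arrays, each of shape (frames, classes).
    weights: one non-negative weight per soft target (normalized here,
             which is an assumption of this sketch).
    """
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    stacked = np.stack(soft_targets, axis=0)   # (models, frames, classes)
    return np.tensordot(w, stacked, axes=1)    # (frames, classes)


def implicit_combination_loss(head_probs, soft_targets, task_weights):
    """Multi-task loss: one cross-entropy term per soft target.

    head_probs: list of student output distributions, one per task head.
    soft_targets: list of teacher distributions, aligned with head_probs.
    task_weights: hypothetical per-task loss weights.
    """
    return sum(
        w * cross_entropy(t, p)
        for w, t, p in zip(task_weights, soft_targets, head_probs)
    )


# Toy data: 4 frames, 5 classes, two hypothetical teacher models.
rng = np.random.default_rng(0)
teacher_a = softmax(rng.normal(size=(4, 5)))
teacher_b = softmax(rng.normal(size=(4, 5)))

# Explicit combination: train a single-output student against one merged target.
merged = explicit_combination([teacher_a, teacher_b], weights=[0.6, 0.4])
student = softmax(rng.normal(size=(4, 5)))
explicit_loss = cross_entropy(merged, student)

# Implicit combination: one student head per teacher, losses summed.
head_a = softmax(rng.normal(size=(4, 5)))
head_b = softmax(rng.normal(size=(4, 5)))
implicit_loss = implicit_combination_loss(
    [head_a, head_b], [teacher_a, teacher_b], task_weights=[0.5, 0.5]
)
```

In either variant only one student network is decoded at test time, which is consistent with the efficiency argument the abstract makes against conventional ensembles.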
Keywords
Hidden Markov models, Training, Neural networks, Acoustics, Decoding, Computational modeling, Task analysis