L2 Mispronunciation Verification Based on Acoustic Phone Embedding and Siamese Networks

2018 11th International Symposium on Chinese Spoken Language Processing (ISCSLP), 2018

Abstract
Non-native mispronunciation verification with instructive feedback is desirable in computer-assisted pronunciation training (CAPT) systems, as it helps second language (L2) learners improve their pronunciation. We propose an approach to evaluating L2 learners’ goodness of pronunciation based on phone embeddings and Siamese networks, addressing two important research issues in CAPT: mispronunciation verification and pronunciation evaluation. The system takes as input a pair of acoustic feature vectors of phone segments together with a pair-wise label. Siamese networks encode the feature vectors into high-level representations in the form of phone embeddings, so that each type of phone is expected to be distinguishable by the similarity of its embeddings. Siamese networks with hinge cosine similarity achieved an accuracy of 89.93% and outperformed the other methods, and the best approach reached a diagnostic accuracy of 89.19% in the pronunciation error verification task.
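The pair-wise setup described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the one-layer `encode` function, the `margin` and `threshold` values, and the exact form of the hinge cosine loss (1 - cos for same-phone pairs, max(0, cos - margin) for different-phone pairs) are all hypothetical, since the abstract does not specify them.

```python
import math


def encode(features, weights):
    """Shared encoder applied to both inputs of a pair.

    Assumption: a single linear layer plus tanh stands in for the
    (unspecified) Siamese network that produces phone embeddings.
    """
    return [math.tanh(sum(w * x for w, x in zip(row, features)))
            for row in weights]


def cosine_similarity(a, b):
    """Cosine similarity between two embeddings."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Guard against zero-length vectors.
    return dot / max(norm_a * norm_b, 1e-12)


def hinge_cosine_loss(sim, same_phone, margin=0.5):
    """Assumed hinge-style loss on cosine similarity.

    Same-phone pairs are pulled towards similarity 1; different-phone
    pairs are penalised only when their similarity exceeds the margin.
    """
    if same_phone:
        return 1.0 - sim
    return max(0.0, sim - margin)


def same_phone_predicted(seg_a, seg_b, weights, threshold=0.5):
    """Verification step: compare the two embeddings against a threshold."""
    sim = cosine_similarity(encode(seg_a, weights), encode(seg_b, weights))
    return sim >= threshold


# Toy usage with 2-dimensional "acoustic features" and identity weights.
weights = [[1.0, 0.0], [0.0, 1.0]]
canonical = [1.0, 0.0]   # hypothetical reference segment
learner_ok = [2.0, 0.0]  # same direction after encoding
learner_bad = [0.0, 1.0] # orthogonal after encoding

print(same_phone_predicted(canonical, learner_ok, weights))
print(same_phone_predicted(canonical, learner_bad, weights))
```

Because both inputs pass through the same `encode` function with the same weights, the comparison depends only on the learned embedding space, which is the defining property of a Siamese arrangement. In training, the loss would be averaged over many labelled pairs and minimised with respect to `weights`.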
Keywords
Acoustics, Task analysis, Hidden Markov models, Training, Speech recognition, Neural networks, Mathematical model