Word Characters And Phone Pronunciation Embedding For ASR Confidence Classifier

2019 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)(2019)

Abstract
A confidence classifier is an integral component of an automatic speech recognition (ASR) system. Such classifiers predict the accuracy of an ASR hypothesis by assigning it a confidence score in the [0,1] range, where a larger score implies a higher probability that the hypothesis is correct. Confidence scores are widely used in ASR system design, training data selection, model adaptation, and other ASR applications. In this work we focus on word embedding features to improve the confidence classifier, and introduce character and phone embeddings as confidence features. We motivate these features in the context of representing and factorizing acoustic scores along the proposed features. We evaluate our work on large-scale ASR tasks and demonstrate significant improvement in confidence performance with the proposed features. At our typical operating point, we report an 8% relative reduction in false alarms (FA) for a limited-vocabulary enUS Xbox task, and a 9.9% relative reduction in FA for a large-vocabulary enUS server task. We also conducted server experiments combining our proposed features with natural language GloVe embeddings, which improved the overall relative reduction in FA to 16%.
Keywords
Confidence Classifier, Speech Recognition, Deep Learning, Word Embedding
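
The abstract reports its results as a relative reduction in false alarms at a fixed operating point. The following is a minimal sketch of how such a figure could be computed, not the authors' evaluation code. It assumes that a false alarm is an incorrect hypothesis whose confidence score is at or above the acceptance threshold, and that the FA rate is normalized by the number of incorrect hypotheses. The function names, the threshold, and the toy scores are all hypothetical.

```python
from typing import Sequence


def false_alarm_rate(
    scores: Sequence[float],
    is_correct: Sequence[bool],
    threshold: float,
) -> float:
    """Fraction of incorrect hypotheses accepted at the given threshold.

    Assumption: a hypothesis is accepted when its confidence score in
    [0, 1] is greater than or equal to the threshold.
    """
    incorrect = [s for s, ok in zip(scores, is_correct) if not ok]
    if not incorrect:
        return 0.0
    accepted = sum(1 for s in incorrect if s >= threshold)
    return accepted / len(incorrect)


def relative_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of a rate with respect to a non-zero baseline."""
    if baseline == 0:
        raise ValueError("baseline rate must be non-zero")
    return (baseline - improved) / baseline


# Toy data (hypothetical): six hypotheses with ground-truth correctness.
is_correct = [True, True, False, False, True, False]
baseline_scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3]
improved_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.2]

threshold = 0.5  # hypothetical operating point
fa_base = false_alarm_rate(baseline_scores, is_correct, threshold)
fa_new = false_alarm_rate(improved_scores, is_correct, threshold)
reduction = relative_reduction(fa_base, fa_new)
```

In practice the two systems would be compared at a matched operating point (for example, equal correct-accept rate) rather than at one shared threshold. The single threshold here only keeps the sketch short.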