Semi-Supervised GMM and DNN Acoustic Model Training with Multi-System Combination and Confidence Re-Calibration

14TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2013), VOLS 1-5(2013)

Abstract
We present our study on semi-supervised acoustic model training for Gaussian mixture model (GMM) hidden Markov model (HMM) and deep neural network (DNN) HMM systems. We analyze the impact of transcription quality and data sampling approach on the performance of the resulting model, and propose a multi-system combination and confidence re-calibration approach to improve transcription inference and data selection. Compared to using a single system's recognition result and confidence score, our proposed approach reduces the phone error rate of the inferred transcription by 23.8% relative when the top 60% of the data are selected. Experiments were conducted on the mobile short message dictation (SMD) task. For the GMM-HMM model, we achieved a 7.2% relative word error rate reduction (WERR) against a well-trained narrow-band fMPE+bMMI system by adding 2100 hours of untranscribed data, and a 28.2% relative WERR over a wide-band MLE model trained on transcribed out-of-domain voice search data after adding 10K hours of untranscribed SMD data. For the CD-DNN-HMM model, 11.7% and 15.0% relative WERRs are achieved after adding 1K hours of untranscribed data using random and importance sampling, respectively. We also found that using a large amount of untranscribed data for pre-training does not help.
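The abstract reports results as relative error reductions and describes keeping only the most confident share of automatically transcribed data. A minimal Python sketch of those two ideas follows. The function names, the utterance IDs, the confidence scores, and the error rates are all hypothetical illustrations, not values or code from the paper, and the sketch does not implement the paper's system combination or confidence re-calibration.

```python
from typing import List, Tuple


def relative_reduction(baseline_error: float, new_error: float) -> float:
    """Relative error reduction, expressed as a fraction of the baseline error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - new_error) / baseline_error


def select_top_fraction(
    utterances: List[Tuple[str, float]], fraction: float
) -> List[str]:
    """Return IDs of the highest-confidence utterances.

    `utterances` holds (utterance_id, confidence) pairs; `fraction` is the
    share of the data to keep, e.g. 0.6 for the top 60%. The count kept is
    rounded to the nearest integer.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    # Sort by confidence, most confident first.
    ranked = sorted(utterances, key=lambda item: item[1], reverse=True)
    keep = round(len(ranked) * fraction)
    return [utt_id for utt_id, _ in ranked[:keep]]


# Hypothetical confidence scores for five automatically transcribed utterances.
scored = [("u1", 0.91), ("u2", 0.42), ("u3", 0.77), ("u4", 0.65), ("u5", 0.30)]
selected = select_top_fraction(scored, 0.6)

# Hypothetical error rates: a drop from 10.0% to 9.0% is a 10% relative reduction.
werr = relative_reduction(10.0, 9.0)
```

In this sketch the selection threshold is a fixed share of the data rather than a fixed confidence value, which matches the "top 60% of the data" phrasing in the abstract; how the confidence scores themselves are produced is left outside the example.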
Keywords
semi-supervised acoustic model training, system combination, confidence re-calibration, importance sampling