Robust I-Vector Based Adaptation Of Dnn Acoustic Model For Speech Recognition

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5(2015)

引用 59|浏览83
暂无评分
摘要
In the past, conventional i-vectors based on a Universal Background Model (UBM) have been successfully used as input features to adapt a Deep Neural Network (DNN) Acoustic Model (AM) for Automatic Speech Recognition (ASR). In contrast, this paper introduces Hidden Markov Model (HMM) based i-vectors that use HMM state alignment information from an ASR system for estimating i-vectors. Further, we propose passing these HMM based i-vectors though an explicit non-linear hidden layer of a DNN before combining them with standard acoustic features, such as log filter bank energies (LFBEs). To improve robustness to mismatched adaptation data, we also propose estimating i-vectors in a causal fashion for training the DNN, restricting the connectivity among hidden nodes in the DNN and applying a max-pool non-linearity at selected hidden nodes. In our experiments, these techniques yield about 5-7% relative word error rate (WER) improvement over the baseline speaker independent system in matched condition, and a substantial WER reduction for mismatched adaptation data.
更多
查看译文
关键词
speech recognition, adaptation of DNN acoustic model, i-vector, robustness
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要