Combining I-Vector Representation And Structured Neural Networks For Rapid Adaptation

2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016

Abstract
Rapid adaptation of deep neural networks (DNNs) with limited unsupervised data remains a significant challenge. This paper investigates the combination of two schemes that have been proposed to address this problem: i-vector representations and multi-basis adaptive neural networks (MBANNs). Two approaches for combining these schemes are described. The first uses i-vectors as one of the input features to the MBANN. The purpose is to combine the speaker representation of the i-vector with the network interpolation of the MBANN scheme. The second approach aims to reduce the computational cost of the MBANN scheme and to improve its robustness to hypothesis errors. Here i-vectors are used to predict the interpolation weights of the MBANN scheme. This removes the need for the initial decoding pass and alignment that were previously required. These approaches are evaluated using acoustic and language models trained on a U.S. English Broadcast News (BN) transcription task. Two distinct sets of test data are examined. The first, from the BN task, yields test data acoustically matched to the training data. The second, acoustically mismatched, set is from YouTube videos. The performance gains from these schemes are found to be sensitive to the level of mismatch between training and test data.
Keywords
Rapid adaptation, structured deep neural networks, i-vectors, acoustic modeling
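
The second approach in the abstract, predicting the MBANN interpolation weights directly from an i-vector, can be illustrated with a minimal NumPy sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the linear form of the predictor, the ReLU activation, the layer sizes, the random parameters, and the names `predict_interpolation_weights` and `mbann_hidden_layer`. The abstract does not say how the predictor is parameterised or where in the network the interpolation is applied.

```python
import numpy as np


def relu(x):
    return np.maximum(x, 0.0)


def predict_interpolation_weights(ivector, P, c):
    """Hypothetical linear predictor mapping an i-vector to one weight per basis."""
    return P @ ivector + c


def mbann_hidden_layer(x, bases, biases, lam):
    """One hidden layer whose parameters are a weighted sum of K bases.

    bases:  (K, H, D) per-basis weight matrices
    biases: (K, H)    per-basis bias vectors
    lam:    (K,)      speaker-dependent interpolation weights
    """
    W = np.tensordot(lam, bases, axes=1)  # interpolated weights, shape (H, D)
    b = lam @ biases                      # interpolated bias, shape (H,)
    return relu(W @ x + b)


# Toy dimensions and random parameters (illustrative only).
rng = np.random.default_rng(0)
K, D, H, IV = 3, 8, 5, 4  # bases, input dim, hidden dim, i-vector dim
bases = rng.standard_normal((K, H, D))
biases = rng.standard_normal((K, H))
P = rng.standard_normal((K, IV))
c = np.full(K, 1.0 / K)

ivector = rng.standard_normal(IV)  # stands in for a speaker's i-vector
x = rng.standard_normal(D)         # stands in for one acoustic feature frame

# The weights come from the i-vector alone, so no first-pass hypothesis is needed.
lam = predict_interpolation_weights(ivector, P, c)
h = mbann_hidden_layer(x, bases, biases, lam)
```

In this sketch the interpolation weights depend only on the i-vector, which mirrors the abstract's point that the initial decoding pass and alignment can be skipped. For the first approach, the i-vector would instead be appended to the input features of the network.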