Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification
ISCSLP(2014)
Abstract
MFCC is one of the most popular features in speaker verification, but it carries not only speaker information but also content and channel information. A session-aware Fbank weighting approach has been proposed, in which the Fbanks that are more sensitive to session variance are de-weighted so that speaker-discriminative banks are given prominence. Most current research on Fbank weighting is within the GMM-UBM framework. In this paper, we study the contribution of Fbank weighting in the state-of-the-art i-vector architecture. We found that, because the loading matrix in the i-vector model is learned without supervision, Fbank weighting shows no advantage in i-vector systems if simple cosine-distance scoring is used. However, when discriminative models such as LDA/PLDA are applied, the advantage of Fbank weighting is recovered, leading to significant performance improvement. We also verified that the weighting parameters generalize well: parameters trained on a small bilingual database can be applied successfully to another i-vector system trained on a large multi-channel database.
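As a rough illustration of the two ideas the abstract names, the sketch below applies per-bank weights before the DCT step of MFCC extraction and scores two i-vectors by cosine distance. It is not the paper's implementation: the function names (`dct_matrix`, `weighted_mfcc`, `cosine_score`) are hypothetical, and applying the weights multiplicatively to log Fbank energies is an assumption, since the abstract does not say where or how the weights enter.

```python
import numpy as np


def dct_matrix(n_ceps, n_banks):
    """Orthonormal DCT-II basis of shape (n_ceps, n_banks)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_banks)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_banks))
    basis *= np.sqrt(2.0 / n_banks)
    basis[0] /= np.sqrt(2.0)
    return basis


def weighted_mfcc(log_fbank, weights, n_ceps=13):
    """Weight log Fbank energies, then take the DCT to get cepstra.

    log_fbank: (n_frames, n_banks) log filterbank energies.
    weights:   (n_banks,) fixed weights, or (n_frames, n_banks)
               weights that vary per frame (assumed form).
    """
    log_fbank = np.asarray(log_fbank, dtype=float)
    # Raises ValueError if the weights cannot be broadcast per bank.
    w = np.broadcast_to(np.asarray(weights, dtype=float), log_fbank.shape)
    basis = dct_matrix(n_ceps, log_fbank.shape[1])
    return (log_fbank * w) @ basis.T


def cosine_score(ivec_enroll, ivec_test):
    """Cosine similarity between an enrollment and a test i-vector."""
    a = np.asarray(ivec_enroll, dtype=float)
    b = np.asarray(ivec_test, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(5, 24))      # toy log Fbank energies
    weights = np.linspace(1.0, 0.5, 24)    # toy de-weighting of higher banks
    print(weighted_mfcc(frames, weights).shape)
    print(cosine_score(rng.normal(size=400), rng.normal(size=400)))
```

With all weights equal to 1 the function reduces to an ordinary DCT of the log Fbank energies, so the weights are the only point where session sensitivity can be suppressed in this sketch.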
Keywords
cosine-distance scoring, i-vector architecture, GMM-UBM framework, i-vector, time-varying Fbank weighting, PLDA, generalization property, matrix algebra, speaker recognition, frequency weighting, unsupervised learned loading matrix, speaker discriminative banks, MFCC, unsupervised learning, speaker verification, session-aware Fbank weighting approach