Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification

ISCSLP(2014)

Abstract
MFCC is one of the most popular features used in speaker verification; it carries not only speaker information but also information about content and channel. A session-aware Fbank weighting approach has been proposed, in which the Fbanks that are more sensitive to session variance are de-weighted so that speaker-discriminative banks are given prominence. Most current research on Fbank weighting is within the GMM-UBM framework. In this paper, we study the contribution of Fbank weighting in the state-of-the-art i-vector architecture. We found that, because the loading matrix in the i-vector model is learned without supervision, Fbank weighting shows no advantage in i-vector systems if simple cosine-distance scoring is used. However, when discriminative models such as LDA/PLDA are applied, the advantage of Fbank weighting is recovered, leading to a significant performance improvement. We also verified that the weighting parameters generalize well: parameters trained with a small bilingual database can be applied successfully in another i-vector system trained with a large multi-channel database.
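As a rough illustration of the two operations the abstract names, the sketch below applies per-bank weights to log filterbank (Fbank) energies before the DCT that produces MFCCs, and computes a cosine-distance score between two i-vectors. It is a minimal sketch under stated assumptions, not the paper's implementation: the abstract does not say whether the weights are applied before or after the log, how they are trained, or how the DCT is scaled, so the function names (`weighted_mfcc`, `cosine_score`), the choice to weight log energies, and the unnormalized DCT-II are all hypothetical.

```python
import numpy as np


def dct_matrix(n_ceps, n_banks):
    """Unnormalized DCT-II basis of shape (n_ceps, n_banks)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_banks)[None, :]
    return np.cos(np.pi * k * (n + 0.5) / n_banks)


def weighted_mfcc(log_fbank, weights, n_ceps=13):
    """Weight log-Fbank energies per bank, then take the DCT.

    log_fbank: array of shape (n_frames, n_banks).
    weights:   shape (n_banks,) for fixed weights, or (n_frames, n_banks)
               for weights that change from frame to frame.
    Assumption: weighting is applied to log energies; the source does
    not specify this.
    """
    log_fbank = np.asarray(log_fbank, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if weights.shape[-1] != log_fbank.shape[1]:
        raise ValueError("weights must have one entry per filterbank channel")
    basis = dct_matrix(n_ceps, log_fbank.shape[1])
    # Broadcasting handles both fixed and per-frame weights.
    return (log_fbank * weights) @ basis.T


def cosine_score(ivec_enroll, ivec_test):
    """Cosine similarity between an enrollment and a test i-vector."""
    a = np.asarray(ivec_enroll, dtype=float)
    b = np.asarray(ivec_test, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

With all weights equal to one, `weighted_mfcc` reduces to an ordinary (unnormalized) DCT of the log-Fbank energies; lowering the weight of a bank shrinks its contribution to every cepstral coefficient. A verification decision would compare `cosine_score` against a threshold chosen on development data.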
Keywords
cosine-distance scoring, i-vector architecture, GMM-UBM framework, i-vector, time-varying Fbank-weighting, PLDA, generalization property, matrix algebra, speaker recognition, frequency-weighting, unsupervised learned loading matrix, speaker discriminative banks, MFCC, unsupervised learning, speaker verification, session-aware Fbank weighting approach