Research on generalization property of time-varying Fbank-weighted MFCC for i-vector based speaker verification

Jun Wang, Lantian Li, Dong Wang, Thomas Fang Zheng

ISCSLP (2014)


Abstract

MFCC is one of the most popular features used in speaker verification, but it carries not only speaker information but also information about content and channel. A session-aware Fbank weighting approach has been proposed, in which the Fbanks that are more sensitive to session variance are de-weighted so that speaker-discriminative banks are given prominence. Most current research on Fbank weighting is conducted within the GMM-UBM framework. In this paper, we study the contribution of Fbank weighting in the state-of-the-art i-vector architecture. We found that, because the loading matrix in the i-vector model is learned without supervision, Fbank weighting shows no advantage in i-vector systems if simple cosine-distance scoring is used. However, when discriminative models such as LDA/PLDA are applied, the advantage of Fbank weighting is recovered, leading to significant performance improvement. We also verified that the weighting parameters generalize well: parameters trained with a small bilingual database can be applied successfully in another i-vector system trained with a large multi-channel database.
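The two operations the abstract refers to, Fbank weighting and cosine-distance scoring, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the weights multiply the log filterbank energies before the DCT, and that a weight array is either one value per bank or one row per frame (the time-varying case); the abstract specifies neither. The function names `dct_matrix`, `weighted_mfcc`, and `cosine_score` are hypothetical, and the session-aware training of the weights is not shown.

```python
import numpy as np


def dct_matrix(n_ceps, n_banks):
    """Orthonormal DCT-II basis of shape (n_ceps, n_banks)."""
    k = np.arange(n_ceps)[:, None]
    n = np.arange(n_banks)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_banks))
    basis *= np.sqrt(2.0 / n_banks)
    basis[0] /= np.sqrt(2.0)  # scale the DC row so the basis is orthonormal
    return basis


def weighted_mfcc(log_fbank, weights, n_ceps=13):
    """Cepstra from weighted log filterbank energies.

    log_fbank: array of shape (frames, banks).
    weights:   shape (banks,) for fixed weights, or (frames, banks)
               for per-frame weights.
    Assumption: weights scale the log energies before the DCT.
    """
    log_fbank = np.asarray(log_fbank, dtype=float)
    # Raises ValueError if the weights cannot be aligned with the banks.
    w = np.broadcast_to(np.asarray(weights, dtype=float), log_fbank.shape)
    return (log_fbank * w) @ dct_matrix(n_ceps, log_fbank.shape[1]).T


def cosine_score(enroll_ivec, test_ivec):
    """Cosine similarity between an enrollment and a test i-vector."""
    a = np.asarray(enroll_ivec, dtype=float)
    b = np.asarray(test_ivec, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
```

With all weights set to 1, `weighted_mfcc` reduces to an ordinary DCT of the log filterbank energies, which gives a baseline for comparing weighted and unweighted features.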


Keywords

cosine-distance scoring, i-vector architecture, gmm-ubm framework, i-vector, time-varying fbank-weighting, plda, generalization property, matrix algebra, speaker recognition, frequency-weighting, unsupervised learned loading matrix, speaker discriminative banks, mfcc, unsupervised learning, speaker verification, session-aware fbank weighting approach
