Content Normalization For Text-Dependent Speaker Verification

Subhadeep Dey,Srikanth R. Madikeri,Petr Motlícek,Marc Ferras

18TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2017), VOLS 1-6: SITUATED INTERACTION（2017）

引用 8|浏览52

暂无评分

摘要

Subspace based techniques, such as i-vector and Joint Factor Analysis (JFA) have shown to provide state-of-the-art performance for fixed phrase based text-dependent speaker verification. However, the error rates of such systems on the random digit task of RSR dataset are higher than that of Gaussian Mixture Model-Universal Background Model (GMM-UBM). In this paper, we aim at improving i-vector system by normalizing the content of the enrollment data to match the test data. We estimate i-vectors for each frames of a speech utterance (also called online i-vectors). The largest similarity scores across frames between enrollment and test are taken using these online i-vectors to obtain speaker verification scores. Experiments on Part3 of RSR corpora show that the proposed approach achieves 12% relative improvement in equal error rate over a GMM-UBM based baseline system.

查看译文

关键词

speaker verification, i-vectors, content matching

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要