Integrating Online I-Vector Extractor With Information Bottleneck Based Speaker Diarization System

Srikanth R. Madikeri,Ivan Himawan,Petr Motlícek,Marc Ferras

16TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION (INTERSPEECH 2015), VOLS 1-5（2015）

引用 34|浏览21

暂无评分

摘要

Conventional approaches to speaker diarization use short-term features such as Mel Frequency Cepstral Co-efficients (MFCC). Features such as i-vectors have been used on longer segments (minimum 2.5 seconds of speech). Using i-vectors for speaker diarization has been shown to be beneficial as it models speaker information explicitly. In this paper, the i-vector modelling technique is adapted to be used as short term features for diarization by estimating i-vectors over a short window of MFCCs. The Information Bottleneck (IB) approach provides a convenient platform to integrate multiple features together for fast and accurate diarization of speech. Speaker models are estimated over a window of 10 frames of speech and used as features in the IB system. Experiments on the NIST RT datasets show an absolute improvement of 3.9% in the best case when i-vectors are used as auxiliary features to MFCCs. Further, discriminative training algorithms such as LDA and PLDA are applied on the i-vectors. A best case performance improvement of 5% in absolute terms is obtained on the RT datasets.

查看译文

关键词

speaker diarization, online i-vectors, Information Bottleneck

AI 理解论文

溯源树

样例

生成溯源树，研究论文发展脉络

Chat Paper

正在生成论文摘要