Automatic Weighting for the Combination of TDOA and Acoustic Features in Speaker Diarization for Meetings
ICASSP (4)(2007)
Abstract
Previous work on speaker diarization for meetings has shown that the Time Delay of Arrival (TDOA) between the audio channels in the meeting room is a useful source of information in addition to the acoustic features. When combining feature streams, a weight controls the relative contribution of each stream. In the past, this weight was determined using development data, and the same value was applied to all meetings. In this paper we present a method for determining the weight automatically. A metric derived from the Bayesian Information Criterion (BIC), computed for each feature stream, estimates the weight for each meeting at the initial clustering iteration and adapts its value throughout the diarization process. With this technique we achieve a more robust system and up to 18.2% relative improvement over tuning the weight on development data.
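To make the role of the stream weight concrete, here is a minimal sketch of one common way to combine two feature streams: a weighted sum of per-stream log-likelihoods. The abstract does not give the combination formula or the BIC-derived metric, so the linear form, the constraint that the two weights sum to one, and the function name `combined_log_likelihood` are assumptions made for illustration only.

```python
def combined_log_likelihood(ll_acoustic, ll_tdoa, w_acoustic):
    """Weighted combination of two per-stream log-likelihoods.

    Assumption (not taken from the abstract): the streams are fused
    linearly in the log domain, and the TDOA weight is 1 - w_acoustic.
    """
    if not 0.0 <= w_acoustic <= 1.0:
        raise ValueError("w_acoustic must lie in [0, 1]")
    w_tdoa = 1.0 - w_acoustic
    return w_acoustic * ll_acoustic + w_tdoa * ll_tdoa


# Hypothetical scores for one segment under one cluster model.
score = combined_log_likelihood(ll_acoustic=-10.0, ll_tdoa=-20.0, w_acoustic=0.5)
```

Under this sketch, a fixed weight tuned on development data would correspond to passing the same `w_acoustic` for every meeting, whereas the approach described above would supply a per-meeting value that changes as clustering proceeds.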
Keywords
speaker diarization, segmentation, clustering, BIC, features fusion, multi-stream