ViVoLAB Speaker Diarization System for the DIHARD 2019 Challenge

INTERSPEECH (2019)

Abstract
This paper presents the latest improvements in speaker diarization obtained by the ViVoLAB research group for the 2019 DIHARD Diarization Challenge. This evaluation seeks to improve diarization in adverse conditions. For this purpose, the audio recordings cover multiple scenarios with no restrictions in terms of speakers, overlapped speech, or audio quality. Our submission follows the traditional segmentation-clustering-resegmentation pipeline: speaker embeddings are extracted from acoustic segments that contain a single speaker and are then clustered by means of a PLDA model. Our contribution in this work focuses on the clustering step. We present results with our Variational Bayes PLDA clustering and our tree-based clustering strategy, which sequentially assigns each embedding to its corresponding speaker according to a PLDA model. Both strategies compare multiple diarization hypotheses and choose one candidate according to a generative criterion. We also analyze the impact of the different state-of-the-art embeddings available with both clustering approaches.
Keywords
diarization, DIHARD Challenge, PLDA, Variational Bayes, Tree search, M-algorithm
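
The tree-based strategy mentioned in the abstract (sequential assignment of embeddings, several hypotheses kept alive, selection by a generative score) can be illustrated with a minimal sketch of an M-algorithm search. This is not the authors' implementation: the function names, the toy 2-D embeddings, and the variance values are hypothetical, and the score is a simplified isotropic two-covariance Gaussian model used here as a stand-in for a trained PLDA.

```python
import math


def log_predictive(x, members, within_var=1.0, between_var=25.0):
    """Log predictive density of embedding x given a speaker's current members.

    Assumed stand-in for a PLDA score: speaker mean ~ N(0, between_var * I),
    embedding ~ N(speaker mean, within_var * I). With no members this is the
    likelihood of x opening a new speaker.
    """
    n = len(members)
    post_var = 1.0 / (1.0 / between_var + n / within_var)  # posterior variance of the mean
    pred_var = post_var + within_var                        # predictive variance per dimension
    total = 0.0
    for d in range(len(x)):
        post_mean = post_var * sum(m[d] for m in members) / within_var
        total += -0.5 * (math.log(2.0 * math.pi * pred_var)
                         + (x[d] - post_mean) ** 2 / pred_var)
    return total


def m_algorithm_diarize(embeddings, m=3):
    """Assign embeddings to speakers one at a time, keeping the m best hypotheses.

    Each hypothesis is (log_likelihood, labels). Labels are numbered in order of
    first appearance, so the same partition is never generated twice.
    """
    hypotheses = [(0.0, [])]
    for x in embeddings:
        expanded = []
        for log_lik, labels in hypotheses:
            n_speakers = max(labels) + 1 if labels else 0
            # Branch: join any existing speaker, or open a new one.
            for spk in range(n_speakers + 1):
                members = [embeddings[j] for j, lab in enumerate(labels) if lab == spk]
                expanded.append((log_lik + log_predictive(x, members), labels + [spk]))
        # Prune: keep only the m most likely partial labelings.
        expanded.sort(key=lambda h: h[0], reverse=True)
        hypotheses = expanded[:m]
    return hypotheses[0]


if __name__ == "__main__":
    toy = [(0.1, 0.0), (5.0, 5.1), (0.0, 0.2), (5.2, 4.9)]
    best_log_lik, best_labels = m_algorithm_diarize(toy, m=3)
    print(best_labels, round(best_log_lik, 2))
```

Setting `m=1` reduces the search to a greedy sequential assignment, while larger values of `m` let the search recover from early decisions that later embeddings show to be poor, at a proportional cost in computation.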