Multi-Channel Speaker Diarization Using Spatial Features for Meetings

Naijun Zheng, Na Li, JianWei Yu, Chao Weng, Dan Su, XunYing Liu, Helen Meng

IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2022


Abstract

Identifying speakers in overlapped speech is a major challenge for speaker diarization in meeting scenarios. To address it, several overlap-aware resegmentation methods based on deep learning have been integrated into speaker diarization systems. In this paper we propose two multi-channel diarization systems that learn spatial features to improve overlapped speech detection and speaker identification. The first system applies a multi-look strategy to train networks without being given the speakers' direction of arrival (DOA), and the second estimates the DOA of target speakers from existing diarization results. Both systems aim to estimate the voice activity of speakers in different directions in order to handle overlapped speech. Experimental results on the AMI corpus show that the two systems achieve relative improvements of up to 9.4% and 18.1% in terms of diarization error rate (DER) over an overlap-aware single-channel system with a BeamformIt front-end.
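To make the reported metric concrete, the following is a minimal Python sketch of how DER and a relative DER improvement are commonly computed. It assumes the standard definition of DER (missed speech, false alarm, and speaker confusion time divided by total reference speech time). The function names and all input numbers are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def diarization_error_rate(missed, false_alarm, confusion, total_speech):
    """DER = (missed + false alarm + speaker confusion) / total reference speech time."""
    if total_speech <= 0:
        raise ValueError("total_speech must be positive")
    return (missed + false_alarm + confusion) / total_speech


def relative_improvement(baseline_der, system_der):
    """Relative DER reduction of a system against a baseline, as a fraction."""
    if baseline_der <= 0:
        raise ValueError("baseline_der must be positive")
    return (baseline_der - system_der) / baseline_der


# Hypothetical durations in seconds; these are NOT results from the paper.
baseline = diarization_error_rate(missed=120.0, false_alarm=30.0,
                                  confusion=50.0, total_speech=1000.0)
system = diarization_error_rate(missed=90.0, false_alarm=30.0,
                                confusion=40.0, total_speech=1000.0)

print(f"baseline DER = {baseline:.3f}, system DER = {system:.3f}")
print(f"relative improvement = {relative_improvement(baseline, system):.1%}")
```

Under this reading, a relative improvement of 9.4% means the system's DER is 9.4% lower than the baseline's DER as a proportion of the baseline, not 9.4 absolute percentage points.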


Keywords

speaker diarization, direction of arrival, overlapped speech, multi-look, multi-channel
