Improved Speech Activity Detection Using Cross-Channel Features For Recognition Of Multiparty Meetings

INTERSPEECH 2006 AND 9TH INTERNATIONAL CONFERENCE ON SPOKEN LANGUAGE PROCESSING, VOLS 1-5(2006)

引用 33|浏览29
暂无评分
摘要
We describe the development of a speech activity detection system using an HMM-based segmenter for automatic speech recognition on individual headset microphones in multispeaker meetings. We look at cross-channel features (energy and correlation based) to incorporate into the segmenter for the purpose of addressing errors related to cross-channel phenomena such as crosstalk. Results demonstrate that these features provide a marked improvement (18% relative) over a baseline system using single-channel features as well as an improvement (8% relative) over our previous solution of separate speech activity detection and cross-channel analysis. In addition, the simple cross-channel energy features are shown to be more robust-and consequently better performing-than the more common correlation-based features.
更多
查看译文
关键词
speech activity detection,multi-channel audio,crosstalk
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要