Adaptive High Accuracy Approaches To Speech Activity Detection In Noisy And Hostile Audio Environments

11th Annual Conference of the International Speech Communication Association 2010 (INTERSPEECH 2010), Vols 3 and 4 (2010)

Abstract
This study examines the difficult task of Speech Activity Detection (SAD) in two hostile environments: AM push-to-talk air traffic control and international telephone conversations with very low signal-to-noise ratios (SNRs). Because traditional energy-based SAD performs poorly in these conditions, two novel approaches to SAD were developed that target the spectral characteristics that typify speech, rather than trying to separate out the background, which can vary enormously. As a result, these approaches are inherently adaptive to their environments. This paper describes a Speech Energy Resonance Band Detection approach and a Harmonic Product Spectrum clustering approach to SAD, and evaluates their performance against MIT Xtalk and the Teager Energy Operator (TEO) in clean and hostile environments.
Keywords
speech activity detection, audio segmentation
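
For readers unfamiliar with the Teager Energy Operator named in the abstract as a comparison baseline, the following is a minimal sketch of its standard discrete form, psi[n] = x[n]^2 - x[n-1]*x[n+1]. It is not the paper's implementation: the abstract does not say how the TEO baseline was configured, so the frame length, the fixed threshold, and the helper name frame_teo_decisions are hypothetical choices made only for illustration.

```python
import math


def teager_energy(x):
    """Standard discrete Teager Energy Operator: psi[n] = x[n]^2 - x[n-1]*x[n+1]."""
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]


def frame_teo_decisions(x, frame_len, threshold):
    """Hypothetical frame-level detector (an assumption, not from the paper):
    a frame is marked active when its mean Teager energy exceeds a fixed threshold."""
    psi = teager_energy(x)
    decisions = []
    for start in range(0, len(psi) - frame_len + 1, frame_len):
        frame = psi[start:start + frame_len]
        decisions.append(sum(frame) / frame_len > threshold)
    return decisions


# Synthetic input: 200 samples of silence followed by a pure tone.
# A tone A*cos(omega*n) has constant Teager energy A^2 * sin(omega)^2.
A, omega = 0.5, 0.3
tone = [A * math.cos(omega * n) for n in range(200)]
silence = [0.0] * 200
signal = silence + tone

decisions = frame_teo_decisions(signal, frame_len=50, threshold=0.01)
print(decisions)
```

On this synthetic signal the silent frames are rejected and the tone frames are accepted. A fixed threshold like this one is exactly what tends to fail when the background varies, which is the motivation the abstract gives for targeting speech-specific spectral structure instead.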