Bayesian Feature Enhancement for ASR of Noisy Reverberant Real-World Data

13th Annual Conference of the International Speech Communication Association 2012 (INTERSPEECH 2012), Vols. 1-3 (2012)

Abstract
In this contribution, we investigate the effectiveness of Bayesian feature enhancement (BFE) on a medium-sized recognition task containing real-world recordings of noisy reverberant speech. BFE employs a very coarse model of the acoustic impulse response (AIR) from the source to the microphone. This model has been shown to be effective when the speech to be recognized is generated by artificially convolving non-reverberant speech with a constant AIR. Here we demonstrate that the model is also appropriate for feature enhancement of true recordings of noisy reverberant speech. On the Multi-Channel Wall Street Journal Audio Visual corpus (MC-WSJ-AV), the word error rate is cut in half to 41.9% compared to the ETSI Standard Front-End, using the signal of a single distant microphone as input and a single recognition pass.
Keywords
Bayesian feature enhancement, dereverberation, denoising
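
The abstract contrasts real recordings with speech that is reverberated artificially by convolving non-reverberant speech with a constant AIR. The following is a minimal sketch of that artificial setup only, not of BFE itself. The exponentially decaying white-noise AIR, the reverberation time `t60_s`, the additive white noise, and all function names are assumptions made for illustration; none of them is taken from the abstract.

```python
import numpy as np

def synthetic_air(t60_s, fs, length_s=None, seed=0):
    """Hypothetical coarse AIR: white noise with an exponential decay envelope."""
    rng = np.random.default_rng(seed)
    n = int((length_s or t60_s) * fs)
    t = np.arange(n) / fs
    # Energy drops by 60 dB after t60_s seconds, so the amplitude
    # envelope decays at a rate of 3 * ln(10) / t60_s.
    envelope = np.exp(-3.0 * np.log(10.0) * t / t60_s)
    h = rng.standard_normal(n) * envelope
    return h / np.linalg.norm(h)  # normalize to unit energy

def reverberate_and_add_noise(clean, h, snr_db, seed=1):
    """Convolve clean speech with a constant AIR, then add white noise at snr_db."""
    rng = np.random.default_rng(seed)
    # Full convolution: output length is len(clean) + len(h) - 1.
    reverberant = np.convolve(clean, h)
    noise = rng.standard_normal(reverberant.shape[0])
    # Scale the noise so that the reverberant-to-noise power ratio equals snr_db.
    sig_pow = np.mean(reverberant ** 2)
    noise_pow = np.mean(noise ** 2)
    noise = noise * np.sqrt(sig_pow / (noise_pow * 10.0 ** (snr_db / 10.0)))
    return reverberant + noise, reverberant, noise

if __name__ == "__main__":
    fs = 16000
    # Stand-in for one second of non-reverberant speech (assumption: random samples).
    clean = np.random.default_rng(42).standard_normal(fs)
    h = synthetic_air(t60_s=0.5, fs=fs)
    noisy, reverberant, noise = reverberate_and_add_noise(clean, h, snr_db=20.0)
    print(noisy.shape)
```

Real recordings such as those in MC-WSJ-AV differ from this setup in that the AIR is neither known nor guaranteed to stay constant, which is the case the paper addresses.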