Training Deep Neural Networks for Reverberation Robust Speech Recognition

Speech Communication; 12. ITG Symposium (2016)

Abstract
Recently, hybrid systems of deep neural networks (DNNs) and hidden Markov models (HMMs) have shown state-of-the-art results on various speech recognition tasks. The best results were achieved by training large neural networks (NNs) on huge data sets (≥ 2000 h [11, 16, 20]). The required training data is often generated using different methods of data augmentation. We show that a simple approach using room impulse responses (RIRs) can be used to train systems that are more robust to reverberation. The method does not require multiple microphones or complex signal processing techniques. On a test set simulating large rooms, we show an improvement in word error rate (WER) from 59.7% down to 41.9%. For known lecture rooms with varying microphone positions, the approach can be used to train the system for the specific environment. We compare systems trained with RIRs from one room, from multiple rooms, and from simulated rooms.
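The abstract does not spell out the augmentation pipeline, so the following is only a minimal sketch of the general idea of RIR-based data augmentation: a clean utterance is convolved with a room impulse response to obtain a reverberant training example. The function names `synthetic_rir` and `reverberate`, the 16 kHz sample rate, and the exponentially decaying noise model are assumptions made for illustration. They stand in for the measured or simulated RIRs the paper refers to and are not taken from it.

```python
import numpy as np


def synthetic_rir(rt60_s, sample_rate=16000, seed=0):
    """Hypothetical stand-in for a measured RIR: exponentially decaying noise.

    rt60_s is the time in seconds for the envelope to decay by 60 dB.
    """
    rng = np.random.default_rng(seed)
    n = int(rt60_s * sample_rate)
    t = np.arange(n) / sample_rate
    # Amplitude envelope that falls by 60 dB (factor 1e-3) at t = rt60_s.
    envelope = 10.0 ** (-3.0 * t / rt60_s)
    rir = rng.standard_normal(n) * envelope
    rir[0] = 1.0  # crude direct-path component (an assumption of this sketch)
    return rir / np.max(np.abs(rir))


def reverberate(clean, rir):
    """Convolve a clean signal with an RIR and keep the original length."""
    wet = np.convolve(clean, rir)[: len(clean)]
    # Rescale so the reverberant signal has the same RMS level as the input.
    clean_rms = np.sqrt(np.mean(clean ** 2))
    wet_rms = np.sqrt(np.mean(wet ** 2))
    return wet * (clean_rms / wet_rms) if wet_rms > 0 else wet


# Usage: random noise stands in for one second of clean speech at 16 kHz.
clean = np.random.default_rng(1).standard_normal(16000)
rir = synthetic_rir(rt60_s=0.3)
reverberant = reverberate(clean, rir)
```

In a setup like the one described, each clean training utterance would be paired with an RIR drawn from one room, several rooms, or a room simulator, and the reverberant copies would be used as training data. How the RIRs are chosen and mixed is a design decision of the paper that this sketch does not attempt to reproduce.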