Speaker Adaptation For Multichannel End-To-End Speech Recognition
2018 IEEE INTERNATIONAL CONFERENCE ON ACOUSTICS, SPEECH AND SIGNAL PROCESSING (ICASSP)(2018)
摘要
Recent work on multichannel end-to-end automatic speech recognition (ASR) has shown that multichannel speech enhancement and speech recognition functions can be integrated into a deep neural network (DNN)-based system, and promising experimental results have been shown using the CHiME-4 and AMI corpora. In other recent DNN-based hidden Markov model (DNN-HMM) hybrid architectures, the effectiveness of speaker adaptation has been well established. Motivated by these results, we propose a multi-path adaptation scheme for end-to-end multichannel ASR, which combines the unprocessed noisy speech features with a speech-enhanced pathway to improve upon previous end-to-end ASR approaches. Experimental results using CHiME-4 show that (1) our proposed multi-path adaptation scheme improves ASR performance and (2) adapting the encoder network is more effective than adapting the neural beam-former, attention mechanism, or decoder network.
更多查看译文
关键词
multichannel end-to-end ASR, neural beam-former, attention-based encoder-decoder, speaker adaptation
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络