Pipelined Back-Propagation For Context-Dependent Deep Neural Networks

13TH ANNUAL CONFERENCE OF THE INTERNATIONAL SPEECH COMMUNICATION ASSOCIATION 2012 (INTERSPEECH 2012), VOLS 1-3(2012)

引用 105|浏览161
暂无评分
摘要
The Context-Dependent Deep-Neural-Network HMM, or CD-DNN-HMM, is a recently proposed acoustic-modeling technique for HMM-based speech recognition that can greatly outperform conventional Gaussian-mixture based HMMs. For example, a CD-DNN-HMM trained on the 2000h Fisher corpus achieves 14.4% word error rate on the Hub5'00-FSH speaker-independent phone-call transcription task, compared to 19.6% obtained by a state-of-the-art, conventional discriminatively trained GMM-based HMM.That CD-DNN-HMM, however, took 59 days to train on a modern GPGPU-the immense computational cost of the mini-batch based back-propagation (BP) training is a major roadblock. Unlike the familiar Baum-Welch training for conventional HMMs, BP cannot be efficiently parallelized across data.In this paper we show that the pipelined approximation to BP, which parallelizes computation with respect to layers, is an efficient way of utilizing multiple GPGPU cards in a single server. Using 2 and 4 GPGPUs, we achieve a 1.9 and 3.3 times end-to-end speed-up, at parallelization efficiency of 0.95 and 0.82, respectively, at no loss of recognition accuracy.
更多
查看译文
关键词
speech recognition,deep neural networks,parallelization,GPGPU
AI 理解论文
溯源树
样例
生成溯源树,研究论文发展脉络
Chat Paper
正在生成论文摘要