Convolutional, Long Short-Term Memory, Fully Connected Deep Neural Networks

2015 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Abstract
Both Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) have shown improvements over Deep Neural Networks (DNNs) across a wide variety of speech recognition tasks. CNNs, LSTMs and DNNs are complementary in their modeling capabilities, as CNNs are good at reducing frequency variations, LSTMs are good at temporal modeling, and DNNs are appropriate for mapping features to a more separable space. In this paper, we take advantage of the complementarity of CNNs, LSTMs and DNNs by combining them into one unified architecture. We explore the proposed architecture, which we call CLDNN, on a variety of large vocabulary tasks, varying from 200 to 2,000 hours. We find that the CLDNN provides a 4-6% relative improvement in WER over an LSTM, the strongest of the three individual models.
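
The abstract describes the CLDNN as a single stack: a CNN front end that reduces frequency variation, LSTM layers that model temporal structure, and fully connected layers that map to a more separable space. Below is a minimal PyTorch sketch of that layout. The layer counts, filter sizes, and hidden dimensions are illustrative assumptions, not the paper's configuration, and the dimensionality-reducing linear layer between the CNN and the LSTM is a detail of this sketch rather than something stated in the abstract.

```python
# Minimal CLDNN-style stack: CNN -> linear reduction -> LSTM -> DNN.
# All hyperparameters here are illustrative assumptions.
import torch
import torch.nn as nn

class CLDNN(nn.Module):
    def __init__(self, n_freq=40, n_classes=42):
        super().__init__()
        # CNN front end: convolve over the frequency axis to reduce
        # frequency variation in the input features.
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=(1, 9)),  # (time, freq) kernel
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=(1, 3)),      # pool over frequency only
        )
        conv_out = 32 * ((n_freq - 9 + 1) // 3)
        # Linear layer to shrink the CNN output before the LSTM
        # (an assumption of this sketch).
        self.reduce = nn.Linear(conv_out, 256)
        # LSTM layers handle temporal modeling.
        self.lstm = nn.LSTM(256, 512, num_layers=2, batch_first=True)
        # Fully connected DNN layers map to a more separable space.
        self.dnn = nn.Sequential(
            nn.Linear(512, 512),
            nn.ReLU(),
            nn.Linear(512, n_classes),
        )

    def forward(self, x):
        # x: (batch, time, freq) log-mel features
        b, t, f = x.shape
        h = self.conv(x.unsqueeze(1))            # (batch, 32, time, freq')
        h = h.permute(0, 2, 1, 3).reshape(b, t, -1)
        h = self.reduce(h)                       # per-frame dim reduction
        h, _ = self.lstm(h)                      # temporal modeling
        return self.dnn(h)                       # per-frame class scores

frames = torch.randn(8, 100, 40)  # 8 utterances, 100 frames, 40 mel bins
logits = CLDNN()(frames)
print(logits.shape)  # torch.Size([8, 100, 42])
```

Because the convolution and pooling kernels span only the frequency axis, the time resolution is preserved, so the LSTM still receives one feature vector per frame.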
Keywords
long short-term memory, convolutional neural networks, fully connected deep neural networks, CNN, LSTM, DNN, speech recognition, frequency variations