Nonrecurrent Neural Structure for Long-Term Dependence.

IEEE/ACM Trans. Audio, Speech & Language Processing (2017)

Abstract
In this paper, we propose a novel neural network structure, namely the feedforward sequential memory network (FSMN), to model long-term dependence in time series without using recurrent feedback. The proposed FSMN is a standard fully connected feedforward neural network equipped with some learnable memory blocks in its hidden layers. The memory blocks use a tapped-delay-line structure to encode long context information into a fixed-size representation as a short-term memory mechanism, which is somewhat similar to the layers of time-delay neural networks. We have evaluated FSMNs on several standard benchmark tasks, including speech recognition and language modeling. Experimental results show that FSMNs outperform conventional recurrent neural networks (RNNs) in modeling sequential signals such as speech or language, while being learned much more reliably and faster. Moreover, we also propose a compact feedforward sequential memory network (cFSMN) by combining the FSMN with low-rank matrix factorization, and we make a slight modification to the encoding method used in FSMNs in order to further simplify the network architecture. On the Switchboard speech recognition task, the proposed cFSMN structures can reduce the model size by 60% and speed up learning by more than seven times, while the model still significantly outperforms the popular bidirectional LSTMs under both frame-level cross-entropy training and MMI-based sequence training.
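The memory-block idea summarized in the abstract (a tapped-delay line that condenses past hidden activations into a fixed-size vector, with no recurrent feedback) can be sketched as follows. This is a minimal illustrative sketch in NumPy, assuming a unidirectional, vectorized form of the memory block; the function name fsmn_memory_block, the tap-coefficient array a, and the memory order N are illustrative choices, not the authors' code.

```python
import numpy as np

def fsmn_memory_block(h, a):
    """Tapped-delay-line memory block (unidirectional, vectorized sketch).

    h : (T, D) hidden activations of one layer over T time steps.
    a : (N + 1, D) learnable tap coefficients covering the current frame
        and N past frames (the memory order N is a hyperparameter).

    Returns p : (T, D), a fixed-size summary of past context:
        p_t = sum_{i=0}^{N} a_i * h_{t-i}   (element-wise products)
    """
    T, D = h.shape
    N = a.shape[0] - 1
    p = np.zeros_like(h)
    for t in range(T):
        # Only taps that reach back to the start of the sequence contribute.
        for i in range(min(N, t) + 1):
            p[t] += a[i] * h[t - i]
    return p

# The next hidden layer would then consume both h_t and the memory output p_t,
# e.g. h_next_t = sigma(W @ h_t + W_tilde @ p_t + b), so long context flows
# forward without any recurrent connection.
```

Because each p_t depends only on already-computed frames rather than on the layer's own previous output, all time steps can be computed in parallel, which is consistent with the faster and more reliable training reported in the abstract.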
Keywords
Feedforward neural networks, Standards, Speech recognition, Encoding, Speech, Recurrent neural networks