Training RNNs as Fast as CNNs

Empirical Methods in Natural Language Processing (EMNLP), 2018. arXiv:1709.02755.


Abstract:

Common recurrent neural network architectures scale poorly due to the intrinsic difficulty in parallelizing their state computations. In this work, we propose the Simple Recurrent Unit (SRU) architecture, a recurrent unit that simplifies the computation and exposes more parallelism. In SRU, the majority of computation for each step is independent of the recurrence and can be easily parallelized.
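
To make the idea concrete, below is a minimal NumPy sketch of an SRU-style layer based on the recurrence described in the abstract. The function name sru_layer, the weight names, and the toy dimensions are illustrative assumptions, not the authors' released implementation; it only shows how the expensive matrix multiplications can be computed for all time steps at once, leaving a cheap element-wise loop as the sole sequential part.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    def sru_layer(x, W, W_f, b_f, W_r, b_r, c0=None):
        # x: (T, d) input sequence; weights are (d, d), biases (d,).
        T, d = x.shape

        # Step 1: these matrix multiplications do not depend on the recurrent
        # state, so they are computed for every time step at once -- the part
        # that parallelizes like a convolutional layer.
        x_tilde = x @ W                  # transformed input, (T, d)
        f = sigmoid(x @ W_f + b_f)       # forget gate, (T, d)
        r = sigmoid(x @ W_r + b_r)       # reset/highway gate, (T, d)

        # Step 2: only this cheap element-wise recurrence is sequential.
        c = np.zeros(d) if c0 is None else c0
        h = np.empty_like(x_tilde)
        for t in range(T):
            c = f[t] * c + (1.0 - f[t]) * x_tilde[t]
            h[t] = r[t] * np.tanh(c) + (1.0 - r[t]) * x[t]   # highway skip
        return h, c

    # Tiny usage example with random weights (shapes are arbitrary).
    T, d = 8, 4
    rng = np.random.default_rng(0)
    x = rng.standard_normal((T, d))
    W, W_f, W_r = (0.1 * rng.standard_normal((d, d)) for _ in range(3))
    b_f = b_r = np.zeros(d)
    h, c = sru_layer(x, W, W_f, b_f, W_r, b_r)
    print(h.shape, c.shape)   # (8, 4) (4,)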
