Learning to Selectively Update State Neurons in Recurrent Networks

CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 2020

Abstract
Recurrent Neural Networks (RNNs) are the state-of-the-art approach to sequential learning. However, standard RNNs use the same amount of computation to generate their hidden states at each timestep, regardless of the input data. Recent works have begun to tackle this rigid assumption by imposing a priori determined patterns for updating the states at each step. These approaches can lend insight into the dynamics of RNNs and possibly speed up inference, but the predetermined nature of their update strategies limits their applicability. To overcome this, we instead design the first fully learned approach, SA-RNN, which augments any RNN by predicting discrete update patterns at the fine granularity of individual hidden state neurons. This is achieved by parameterizing a distribution of update likelihoods driven by the input data. Unlike related methods, our approach imposes no assumptions on the structure of the update patterns. Moreover, our method adapts its update patterns online, allowing different dimensions to be updated conditionally based on the input. To learn which dimensions to update, the model solves a multi-objective optimization problem, maximizing task performance while minimizing the number of updates via a unified control. Using five publicly available datasets spanning three sequential learning settings, we demonstrate that our method consistently achieves higher accuracy with fewer updates than state-of-the-art alternatives. We also show the benefits of sparsely updating a large hidden state over densely updating a small hidden state. As an added benefit, our method can be directly applied to a wide variety of models containing RNN architectures.
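The abstract describes the mechanism only at a high level. As a minimal sketch of the idea, the following PyTorch code implements a recurrent cell with input-conditioned, per-neuron binary update gates and a penalty on the number of updates. The single-layer gating network, the straight-through thresholding, and the coefficient lam are illustrative assumptions, not the paper's exact parameterization of the update-likelihood distribution.

import torch
import torch.nn as nn

class SelectiveUpdateRNNCell(nn.Module):
    # Illustrative sketch, not the authors' released code: an RNN cell
    # whose hidden-state neurons are updated conditionally. A small
    # gating network predicts a per-neuron update likelihood from the
    # input and previous state; thresholding with a straight-through
    # gradient yields a discrete update pattern.
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.cell = nn.RNNCell(input_size, hidden_size)
        # Hypothetical gate parameterization (one linear layer).
        self.gate = nn.Linear(input_size + hidden_size, hidden_size)

    def forward(self, x, h_prev):
        candidate = self.cell(x, h_prev)  # fully updated state proposal
        p = torch.sigmoid(self.gate(torch.cat([x, h_prev], dim=-1)))
        hard = (p > 0.5).float()          # discrete per-neuron pattern
        mask = hard + p - p.detach()      # straight-through estimator
        # Update only the selected neurons; copy the rest forward.
        h_new = mask * candidate + (1.0 - mask) * h_prev
        return h_new, mask

def total_loss(task_loss, masks, lam=0.01):
    # Multi-objective trade-off: task loss plus a penalty on the
    # fraction of neurons updated, weighted by one coefficient standing
    # in for the paper's "unified control" (lam's value is assumed).
    update_rate = torch.stack(masks).mean()
    return task_loss + lam * update_rate

In this sketch the mask is binary in the forward pass but differentiable through the soft probabilities, so the gate can be trained end-to-end; at inference, neurons the gate skips keep their previous values and need no recomputation, which is where the potential speedup comes from.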
Keywords
Recurrent Neural Networks, Conditional Computation, Sequential Data