State Gradients for Analyzing Memory in LSTM Language Models

Computer Speech & Language (2020)

Abstract
• Decreasing singular values (SVs) of the state gradients correspond to decreasing importance of the input.
• The average memory of LSTM LMs decays quickly over time, but not exponentially.
• Average memory depends on the type of data; selectiveness depends on the size of the dataset.
• Infrequent words are remembered better and longer by LSTM LMs.
• On average, LSTM LMs remember (syntactic differences between) nouns best.
Keywords
Language modeling, Neural networks, Memory, Gradients, Interpreting