Reinforcement learning for mapping instructions to actions

ACL/IJCNLP (2009)

Cited by 332 | Viewed 552
Abstract
In this paper, we present a reinforcement learning approach for mapping natural language instructions to sequences of executable actions. We assume access to a reward function that defines the quality of the executed actions. During training, the learner repeatedly constructs action sequences for a set of documents, executes those actions, and observes the resulting reward. We use a policy gradient algorithm to estimate the parameters of a log-linear model for action selection. We apply our method to interpret instructions in two domains --- Windows troubleshooting guides and game tutorials. Our results demonstrate that this method can rival supervised learning techniques while requiring few or no annotated training examples.
Keywords
action sequence construction, resulting reward, log-linear model, executable actions, instruction mapping, Windows troubleshooting guides, executed actions, game tutorials, reward function, annotated training examples, action selection, reinforcement learning
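The abstract describes a policy gradient algorithm fitting a log-linear model for action selection from execution reward alone. The following is a minimal sketch of that idea on a toy problem; the feature setup, reward function, and learning rate are illustrative assumptions, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for the paper's setting: each instruction step yields a
# feature vector x, and the learner picks one of K executable actions
# from a log-linear (softmax) model. The hidden "correct" action plays
# the role of the environment's reward signal after execution.
K, D = 3, 4                  # number of actions, feature dimension
theta = np.zeros((K, D))     # log-linear model parameters

def policy(x):
    """Softmax action distribution for feature vector x."""
    logits = theta @ x
    p = np.exp(logits - logits.max())
    return p / p.sum()

def episode(steps=5):
    """Sample actions for a sequence of steps; return gradient and reward."""
    grad = np.zeros_like(theta)
    total_reward = 0.0
    for _ in range(steps):
        x = rng.normal(size=D)
        target = int(np.argmax(x[:K]))   # hidden rewarded action (toy reward)
        p = policy(x)
        a = rng.choice(K, p=p)
        r = 1.0 if a == target else 0.0
        total_reward += r
        # REINFORCE gradient of log pi(a|x), scaled by the observed reward:
        # d/dtheta log pi = outer(indicator(a) - p, x)
        ind = np.zeros(K)
        ind[a] = 1.0
        grad += r * np.outer(ind - p, x)
    return grad, total_reward

lr = 0.5
for _ in range(300):
    g, _ = episode()
    theta += lr * g

# Average per-step reward should rise above the 1/K chance level.
mean_reward = np.mean([episode()[1] / 5 for _ in range(200)])
print(mean_reward)
```

The key property mirrored here is that no annotated action sequences are used: parameters are updated only from the reward observed after executing sampled actions, which is what lets the paper's method rival supervised training with few or no labeled examples.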