Learning Rewards from Linguistic Feedback

Theodore R. Sumers
Mark K. Ho
Robert D. Hawkins
Contextualization grounds language in the feature space of the Markov decision process; we provide an implementation inspired by research on human teaching.

Abstract:

We explore unconstrained natural language feedback as a learning signal for artificial agents. Humans use rich and varied language to teach, yet most prior work on interactive learning from language assumes a particular form of input (e.g. commands). We propose a general framework which does not make this assumption. We decompose lingui...
