Learning Behaviors with Uncertain Human Feedback
UAI, pp. 131-140, 2020.
Human feedback is widely used to train agents in many domains. However, previous works rarely consider the uncertainty when humans provide feedback, especially in cases that the optimal actions are not obvious to the trainers. For example, the reward of a sub-optimal action can be stochastic and sometimes exceeds that of the optimal act...More
Full Text (Upload PDF)
PPT (Upload PPT)