Bayesian Reinforcement Learning with Behavioral Feedback
IJCAI, pp. 1571-1577, 2016.
In the standard reinforcement learning setting, the agent learns optimal policy solely from state transitions and rewards from the environment. We consider an extended setting where a trainer additionally provides feedback on the actions executed by the agent. This requires appropriately incorporating the feedback, even when the feedback ...More
PPT (Upload PPT)