Robust online optimization of reward-uncertain MDPs
IJCAI, pp. 2165-2171, 2011.
Imprecise-reward Markov decision processes (IRMDPs) are MDPs in which the reward function is only partially specified (e.g., by some elicitation process). Recent work using minimax regret to solve IRMDPs has shown, despite their theoretical intractability, how the set of policies that are nondominated w.r.t. reward uncertainty can be expl...