Optimal Sensing via Multi-armed Bandit Relaxations in Mixed Observability Domains
IEEE International Conference on Robotics and Automation, pp. 4807-4812, 2016.
Sequential decision making under uncertainty is studied in a mixed observability domain. The goal is to maximize the amount of information obtained on a partially observable stochastic process under constraints imposed by a fully observable internal state. An upper bound for the optimal value function is derived by relaxing constraints....