Learning Where to Fixate on Foveated Images.

arXiv: Computer Vision and Pattern Recognition (2018)

Abstract
Foveation, the ability to sequentially acquire high-acuity regions of a scene viewed initially at low acuity, is a key property of biological vision systems. In a computer vision system, foveation is also desired to increase data efficiency and derive task-relevant features. Yet most existing deep learning models lack the ability to foveate. In this paper, we propose a deep reinforcement learning-based foveation model, DRIFT, and apply it to challenging fine-grained classification tasks. Training DRIFT requires only image-level category labels and encourages fixations to contain discriminative information while maintaining data efficiency. Specifically, we formulate foveation as a sequential decision-making process and train a foveation actor network with a novel Deep Deterministic Policy Gradient by Conditioned Critic and Coaching (DDPGC3) algorithm. In addition, we propose to shape the reward to provide informative feedback after each fixation to better guide the RL training. We demonstrate the effectiveness of our method on five fine-grained classification benchmark datasets, and show that the proposed approach achieves state-of-the-art performance using an order of magnitude fewer pixels.
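
To make the notion of a foveated observation concrete, the following is a minimal sketch, not the DRIFT implementation. It assumes a grayscale image stored as a list of rows, an integer average-pooling factor for the low-acuity view, and square high-acuity patches cut around a normalized fixation point. The function names `downsample`, `fixate`, and `pixels_used` are hypothetical, and the fixation locations here are chosen by hand rather than by a learned actor network.

```python
def downsample(image, factor):
    """Average-pool a 2D grayscale image (list of rows) by an integer factor.

    This stands in for the initial low-acuity view of the scene (an assumption;
    the abstract does not say how the low-acuity view is produced).
    """
    h, w = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(image[r * factor + i][c * factor + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for c in range(w)
        ]
        for r in range(h)
    ]


def fixate(image, center, size):
    """Cut a size x size full-resolution patch around a fixation point.

    `center` is a normalized (y, x) pair in [0, 1]; the patch is shifted
    inward when it would otherwise cross the image border.
    Assumes size <= image height and width.
    """
    h, w = len(image), len(image[0])
    cy = round(center[0] * (h - 1))
    cx = round(center[1] * (w - 1))
    top = min(max(cy - size // 2, 0), h - size)
    left = min(max(cx - size // 2, 0), w - size)
    return [row[left:left + size] for row in image[top:top + size]]


def pixels_used(image, factor, size, num_fixations):
    """Count pixels in the low-acuity view plus all high-acuity patches."""
    low = downsample(image, factor)
    return len(low) * len(low[0]) + num_fixations * size * size


# Toy 64 x 64 image whose pixel value encodes its position.
image = [[r * 64 + c for c in range(64)] for r in range(64)]
low = downsample(image, 8)                 # 8 x 8 low-acuity view
patch = fixate(image, (0.5, 0.5), 16)      # 16 x 16 high-acuity patch
budget = pixels_used(image, 8, 16, 3)      # low view plus three fixations
```

In this toy setting the low-acuity view plus three fixations covers 832 pixels, against 4096 for the full image. The numbers are illustrative only and are not results from the paper.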