A Computational Focus of Attention Mechanism to Process Shapes Efficiently: Theory

CoRR (2011)


Abstract

Given the ever-increasing bandwidth of the visual sensory information available to autonomous agents and other automatic systems, it is becoming essential to endow them with a sense of what is worth their attention and what can be safely disregarded. This article presents a general mathematical framework to efficiently allocate the available computational resources to the parts of the input that are relevant to solving a perceptual problem of interest. By solving a perceptual problem we mean finding the hypothesis H (i.e., the state of the world) that maximizes a function L(H), referred to as the evidence, representing how well each hypothesis “explains” the input. However, given the large bandwidth of the sensory input, fully evaluating the evidence for each hypothesis is computationally infeasible (e.g., because it would imply checking a large number of pixels). To address this problem we propose a mathematical framework with two key ingredients. The first is a Bounding Mechanism (BM) to compute lower and upper bounds of the evidence of a hypothesis, for a given computational budget. These bounds are much cheaper to compute than the evidence itself, can be refined at any time by increasing the budget allocated to a hypothesis, and are frequently sufficient to discard a hypothesis. The second ingredient is a Focus of Attention Mechanism (FoAM) to select which hypothesis' bounds should be refined next, with the goal of discarding non-optimal hypotheses with the least amount of computation.

The proposed framework has the following desirable characteristics: 1) it is very efficient, since most hypotheses are discarded with minimal computation; 2) it is parallelizable; 3) it is guaranteed to find the globally optimal hypothesis or hypotheses; and 4) its running time depends on the problem at hand, not on the bandwidth of the input. To illustrate the general framework, in this article we instantiate it for the problem of simultaneously estimating the class, pose, and a noiseless version of a 2D shape in a 2D image. To do this, we develop a novel theory of semidiscrete shapes that allows us to compute the bounds required by the BM. We believe that the theory presented in this article (i.e., the algorithmic paradigm and the theory of shapes) has multiple potential applications well beyond the one demonstrated here.

D. Rother · R. Vidal, Johns Hopkins University, Tel.: +1-410-516-6736, E-mail: diroth@gmail.com
S. Schutz, University of Gottingen
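The interplay between the BM and the FoAM described in the abstract can be illustrated with a toy sketch. It is not the authors' implementation, and everything specific in it is an assumption made for illustration: the evidence L(H) is taken to be a sum of per-element terms that each lie in [0, 1], the bounds are obtained by assuming the uninspected terms are all 0 (lower) or all 1 (upper), and the selection rule (refine the surviving hypothesis with the largest upper bound) is a common branch-and-bound heuristic rather than the rule derived in the paper. The names `Hypothesis`, `bounds`, `refine`, and `find_best` are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Hypothesis:
    name: str
    terms: Sequence[float]  # assumed evidence contributions, each in [0, 1]
    evaluated: int = 0      # number of terms inspected so far (the "budget")
    partial: float = 0.0    # sum of the inspected terms

    def bounds(self) -> Tuple[float, float]:
        """Toy Bounding Mechanism: bounds on L(H) for the current budget."""
        remaining = len(self.terms) - self.evaluated
        # Uninspected terms contribute at least 0 and at most 1 each.
        return self.partial, self.partial + remaining

    def refine(self, step: int) -> int:
        """Inspect up to `step` more terms; return how many were inspected."""
        end = min(self.evaluated + step, len(self.terms))
        spent = end - self.evaluated
        self.partial += sum(self.terms[self.evaluated:end])
        self.evaluated = end
        return spent


def find_best(hypotheses: List[Hypothesis], step: int = 1):
    """Toy Focus of Attention loop: returns (optimal hypotheses, total cost)."""
    active = list(hypotheses)
    cost = 0
    while True:
        best_lower = max(h.bounds()[0] for h in active)
        # Discard any hypothesis that provably cannot be optimal.
        active = [h for h in active if h.bounds()[1] >= best_lower]
        refinable = [h for h in active if h.evaluated < len(h.terms)]
        if len(active) == 1 or not refinable:
            return active, cost
        # Assumed heuristic: refine the most promising upper bound next.
        target = max(refinable, key=lambda h: h.bounds()[1])
        cost += target.refine(step)


candidates = [
    Hypothesis("A", [1, 1, 1, 1, 0, 1, 1, 1]),
    Hypothesis("B", [0, 0, 0, 1, 0, 0, 0, 0]),
    Hypothesis("C", [1, 0, 1, 0, 1, 0, 1, 0]),
]
winners, cost = find_best(candidates)
print([h.name for h in winners], "terms inspected:", cost)
```

In this sketch the weaker hypotheses are dropped as soon as their upper bound falls below the best lower bound, so fewer terms are inspected than a full evaluation of every hypothesis would require. If several hypotheses tie for the maximum evidence, all of them are returned, which mirrors the "hypothesis or hypotheses" wording of the abstract.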
