Detecting Common Objects in Context

user-5ebe28934c775eda72abcddd(2017)

Abstract
Visual scene understanding is a basic function of human perception and one of the primary goals of computer vision. Object detection, which involves recognizing and localizing the objects present in an environment, is a fundamental task in scene understanding. In recent years, object detection has been one of the most rapidly developing research areas in computer vision. Progress has come from the combination of large-scale datasets, high-quality annotations, and feature representations learned with novel convolutional neural network architectures. This thesis discusses both the process of dataset creation and the subsequent challenges in algorithm design for object detection. We create a large-scale visual dataset, Common Objects in Context (COCO), that contains objects in everyday scenes with detailed instance segmentation masks. The COCO dataset aims to enable research on detecting objects in unconstrained environments, and it presents the combined challenges of recognizing objects in context and accurately localizing instances in 2D. We then discuss algorithm design that addresses the challenges posed by the COCO dataset. First, we focus on learning multiscale feature representations to improve object detection performance over a wide range of object scales. We show that by leveraging the pyramidal shape of the feature hierarchy in a convolutional neural network (ConvNet), we can learn multiscale pyramidal feature representations that are semantically strong at all levels. The proposed Feature Pyramid Networks (FPN) provide generic feature representations that greatly improve performance in terms of both accuracy and speed for …