Semantic Amodal Segmentation

2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR)

Cited by 162 | Viewed 161
Abstract
Common visual recognition tasks such as classification, object detection, and semantic segmentation are rapidly reaching maturity, and given the recent rate of progress, it is not unreasonable to conjecture that techniques for many of these problems will approach human levels of performance in the next few years. In this paper we look to the future: what is the next frontier in visual recognition? We offer one possible answer to this question. We propose a detailed image annotation that captures information beyond the visible pixels and requires complex reasoning about full scene structure. Specifically, we create an amodal segmentation of each image: the full extent of each region is marked, not just the visible pixels. Annotators outline and name all salient regions in the image and specify a partial depth order. The result is a rich scene structure, including visible and occluded portions of each region, figure-ground edge information, semantic labels, and object overlap. To date, we have labeled 500 images in the BSDS dataset with at least five annotators per image. Critically, the resulting full scene annotation is surprisingly consistent between annotators. For example, for edge detection our annotations have substantially higher human consistency than the original BSDS edges while providing a greater challenge for existing algorithms. We are currently annotating ~5000 images from the MS COCO dataset.
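To make the annotation concrete, here is a minimal sketch of how one region in such an amodal annotation might be represented: a full (amodal) mask, a visible mask, and a depth rank for the partial depth order. The field names and mask layout are illustrative assumptions, not the paper's actual data format; the occluded portion falls out as the amodal extent minus the visible pixels.

```python
import numpy as np

# Hypothetical representation of one annotated region (names are illustrative,
# not the paper's actual annotation format).
region = {
    "name": "car",
    # Full extent of the region, including occluded parts (amodal mask).
    "amodal_mask": np.array([[1, 1, 1],
                             [1, 1, 1]], dtype=bool),
    # Only the pixels actually visible in the image.
    "visible_mask": np.array([[1, 1, 0],
                              [1, 0, 0]], dtype=bool),
    # Rank in the partial depth order (lower = closer to the camera).
    "depth_rank": 2,
}

# The occluded portion is the amodal extent minus the visible pixels.
occluded = region["amodal_mask"] & ~region["visible_mask"]
occlusion_rate = occluded.sum() / region["amodal_mask"].sum()
print(occlusion_rate)  # fraction of the region hidden by closer objects
```

In this toy example half of the region's amodal extent is occluded, so the printed occlusion rate is 0.5.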
Keywords
scene annotation, semantic amodal segmentation, common visual recognition tasks, object detection, visible pixels, partial depth order, visible portions, figure-ground edge information, semantic labels, multiple annotators, human annotations, image annotation, information capture, complex reasoning, BSDS dataset, COCO, depth ordering