Patchwork: A Patch-wise Attention Network for Efficient Object Detection and Segmentation in Video Streams
International Conference on Computer Vision, pp. 3415-3424, 2019.
The lost spatial context presents a fundamental limitation for the hard attention mechanism in deep networks, and we believe it is one of the main reasons why the hard attention idea has not been more popular in real-world applications.
Recent advances in single-frame object detection and segmentation techniques have motivated a wide range of works to extend these methods to process video streams. In this paper, we explore the idea of hard attention aimed at latency-sensitive applications. Instead of reasoning about every frame separately, our method selects and only pr...