Streaming Object Detection for 3-D Point Clouds

European Conference on Computer Vision, pp. 423–441, 2020.


Abstract:

Autonomous vehicles operate in a dynamic environment, where the speed with which a vehicle can perceive and react impacts the safety and efficacy of the system. LiDAR provides a prominent sensory modality that informs many existing perceptual systems including object detection, segmentation, motion estimation, and action recognition. …

Introduction
  • Autonomous driving systems require detection and localization of objects to effectively respond to a dynamic environment [15, 3].

    [Figure: vehicle and pedestrian detection accuracy (mAP) by meta-architecture (baseline, streaming, localized RF, stateful NMS, stateful RNN, larger model); vehicle values include 54.9, 40.1, 52.9, 53.5, and 60.1.]
  • Existing approaches to LiDAR-based perception derive from a family of camera-based approaches [54, 44, 21, 5, 27, 11], requiring a complete 360° scan of the environment
  • This artificial requirement to have the complete scan limits the minimum latency a perception system can achieve, and effectively inserts the LiDAR scan period into the latency (see the latency sketch after this list).
  • Unlike CCD cameras, many LiDAR systems are streaming data sources, where data arrives sequentially as the laser rotates around the z axis [2, 23]
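
To make the latency argument concrete, the following back-of-envelope sketch (our own illustration, not the authors' code) compares the expected per-point latency of a detector that waits for a full scan against one that emits detections per slice; the 10 Hz scan rate and 32 slices are assumed values consistent with the figures quoted elsewhere on this page:

```python
# Back-of-envelope latency sketch (illustrative, not from the paper's code).
# Assumes a 10 Hz scan rate (within the 5-20 Hz range the paper cites) and
# that model compute time is negligible next to the scan period.

scan_period_s = 0.1   # one full 360-degree revolution at an assumed 10 Hz
n_slices = 32         # number of vertical strips per revolution

# Full-scan pipeline: a freshly measured point waits for the rest of the
# revolution before the detector runs; half a period in expectation.
full_scan_expected_s = scan_period_s / 2

# Streaming pipeline: the point only waits out the remainder of its slice.
streaming_expected_s = scan_period_s / n_slices / 2

print(f"full scan: {full_scan_expected_s * 1e3:.1f} ms expected")
print(f"streaming: {streaming_expected_s * 1e3:.2f} ms expected "
      f"({full_scan_expected_s / streaming_expected_s:.0f}x lower)")
```

Under these idealized assumptions the expected latency drops by a factor of n_slices; the paper's reported ~1/15th expected-latency improvement additionally accounts for model compute time.
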
Highlights
  • Autonomous driving systems require detection and localization of objects to effectively respond to a dynamic environment [15, 3]
  • We propose a series of modifications to standard meta-architectures that may generically adapt an object detection system to operate in a streaming manner
  • We explore how the proposed meta-architecture for streaming 3-D object detection compares to standard object detection systems, i.e., PointPillars [34] and StarNet [48]
  • We have described streaming object detection for point clouds for self-driving car perception systems
  • Such a problem offers an opportunity for blending ideas from object detection [54, 16, 44, 53], tracking [42, 43, 13] and sequential modeling [61]
  • We find that simple methods based on restricting the receptive field, adding temporal state to the non-maximum suppression, and learning a perception state across time via recurrence suffice to provide competitive if not superior detection performance on a large-scale self-driving dataset (a stateful NMS sketch follows this list)
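
The stateful NMS mentioned above can be pictured with a short sketch. This is a minimal greedy NMS in 2-D (the paper operates on 3-D boxes inside its detection pipeline), where boxes accepted for earlier slices are carried along as suppression state so that an object straddling a slice boundary is emitted only once; the function and parameter names are ours, not the authors':

```python
# Minimal 2-D sketch of stateful NMS (our illustration, not the paper's code).
import numpy as np

def iou(box, others):
    """IoU between one (x1, y1, x2, y2) box and an [M, 4] array of boxes."""
    x1 = np.maximum(box[0], others[:, 0]); y1 = np.maximum(box[1], others[:, 1])
    x2 = np.minimum(box[2], others[:, 2]); y2 = np.minimum(box[3], others[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area + areas - inter + 1e-9)

def stateful_nms(boxes, scores, state_boxes, iou_thresh=0.5):
    """Greedy NMS over one slice that also suppresses against `state_boxes`,
    the detections kept for earlier slices. Returns (kept indices, kept boxes)."""
    kept, kept_boxes = [], []
    for i in np.argsort(-scores):
        previous = kept_boxes + list(state_boxes)
        if previous and iou(boxes[i], np.array(previous)).max() > iou_thresh:
            continue  # duplicate of a detection from this or an earlier slice
        kept.append(i)
        kept_boxes.append(boxes[i])
    return kept, kept_boxes

# Per-revolution usage: the caller carries the state across slices, e.g.
#   state = []
#   for boxes, scores in slices:
#       idx, new_boxes = stateful_nms(boxes, scores, state)
#       state = new_boxes + state  # optionally expire boxes after a few slices
```
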
Methods
  • The lasers precess around the z axis and make a complete rotation at a 5–20 Hz scan rate.
  • The authors simulate a streaming system with the Waymo Open Dataset [58] by artificially manipulating the point cloud data.
  • The native format of point cloud data is a range image whose height and width correspond to the number of lasers and to the rotation speed and laser pulse rate, respectively [47].
  • The authors artificially slice the input range image into n vertical strips along the image width to provide an experimental setup for streaming detection models (see the slicing sketch after this list)
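
A minimal sketch of this slicing protocol follows (our illustration; the helper name and the [64, 2650, C] example shape are assumptions based on the range-image description above):

```python
# Illustrative slicing of a range image into n vertical strips.
import numpy as np

def slice_range_image(range_image: np.ndarray, n_slices: int):
    """Yield (index, strip) pairs in azimuth order, i.e. the order in which
    a rotating LiDAR would deliver them.

    range_image: [num_lasers, width, channels]; height is the number of
    lasers and width is set by the rotation speed and laser pulse rate.
    """
    width = range_image.shape[1]
    edges = np.linspace(0, width, n_slices + 1, dtype=int)  # handles any width
    for i in range(n_slices):
        yield i, range_image[:, edges[i]:edges[i + 1], :]

# A streaming detector consumes strips one at a time instead of a full scan:
#   for i, strip in slice_range_image(ri, n_slices=32):
#       detections, state = model(strip, state)
```
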
Results
  • The authors present all results on the Waymo Open Dataset [58]. All models are trained with Adam [31] using the Lingvo machine learning framework [56] built on top of TensorFlow [1].
  • All experiments use the first return from the medium-range LiDAR in the Waymo Open Dataset, ignoring the four short-range LiDARs for simplicity (a data-loading sketch follows this list).
  • This results in slightly lower baseline and final accuracy as compared to previous results [48, 58, 68]
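
As a hedged sketch of that data selection (our code, not the paper's pipeline; it assumes the public waymo_open_dataset protos), the first return of the medium-range TOP LiDAR can be decoded into a range image while the four short-range units are skipped:

```python
# Hedged data-loading sketch assuming the public waymo_open_dataset protos.
import zlib
import numpy as np
from waymo_open_dataset import dataset_pb2

def top_lidar_first_return(frame: dataset_pb2.Frame) -> np.ndarray:
    """Return the TOP laser's first-return range image as an [H, W, C] array."""
    laser = next(l for l in frame.lasers if l.name == dataset_pb2.LaserName.TOP)
    ri = dataset_pb2.MatrixFloat()
    ri.ParseFromString(zlib.decompress(laser.ri_return1.range_image_compressed))
    # H = number of lasers, W spans 360 degrees of azimuth, C = point channels.
    return np.array(ri.data, dtype=np.float32).reshape(list(ri.shape.dims))
```
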
Conclusion
  • The authors have described streaming object detection for point clouds for self-driving car perception systems
  • Such a problem offers an opportunity for blending ideas from object detection [54, 16, 44, 53], tracking [42, 43, 13] and sequential modeling [61].
  • The resulting system achieves favorable computational performance (∼ 1/10th) and improved expected latency (∼ 1/15th) with respect to a baseline non-streaming system.
  • Such gains provide headroom to scale up the system to surpass baseline performance (60.1 vs. 54.9 mAP) while maintaining a peak computational budget far below that of a non-streaming model (see the recurrent-state sketch after this list)
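
A minimal sketch of the recurrent variant, in the spirit of the stateful-RNN meta-architecture described above (layer sizes, names, and the 7-number box head are illustrative assumptions, not the paper's architecture):

```python
# Sketch of carrying a learned feature state across slices with an LSTM cell.
import tensorflow as tf

class StreamingDetectorStub(tf.keras.Model):
    """Per-slice featurizer -> recurrent state update -> per-slice head."""

    def __init__(self, feature_dim=128):
        super().__init__()
        self.featurizer = tf.keras.layers.Dense(feature_dim, activation="relu")
        self.cell = tf.keras.layers.LSTMCell(feature_dim)
        self.head = tf.keras.layers.Dense(7)  # e.g. one 7-DOF box regression

    def call(self, strip_features, state):
        x = self.featurizer(strip_features)  # embed the current strip
        x, state = self.cell(x, state)       # fuse with the carried state
        return self.head(x), state           # emit detections immediately

model = StreamingDetectorStub()
state = [tf.zeros([1, 128]), tf.zeros([1, 128])]  # (h, c) carried across slices
for _ in range(32):                               # one step per slice
    strip = tf.random.normal([1, 64])             # stand-in for strip features
    boxes, state = model(strip, state)
```

The design point is that the carried (h, c) state lets each slice's predictions condition on everything seen earlier in the revolution without waiting for the full scan.
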
Summary
  • Objectives:

    The goal of this work is to show how the authors can modify an existing object detection system – with a minimum set of changes and the addition of new operations – to efficiently and accurately emit detections as data arrives (Figure 1).
Tables
  • Table 1: Localized receptive field leads to duplicate detections. Vehicle detection performance (mAP) across subtended angle of ground truth objects for localized receptive field and stateful NMS (n = 32 slices) for [34]. Red indicates the percent drop from baseline
  • Table 2: Stateful NMS achieves performance gains comparable to global NMS
  • Table 3: PointPillars detection baseline [34]
  • Table 4: StarNet detection baseline [48]
Related work
  • 2.1. Object detection in camera images

    Object detection has a long history in computer vision as a central task in the field. Early work focused on framing the problem as a two-step process consisting of an initial search phase followed by a final discrimination of object location and identity [14, 10, 60]. Such strategies proved effective for academic datasets based on camera imagery [12, 40].

    The re-emergence of convolutional neural networks (CNNs) for computer vision [33, 32] inspired the field to harness both the rich image features and the final training objective of a CNN model for object detection [55]. In particular, the features of a CNN trained on an image classification task proved sufficient for providing reasonable candidate locations for objects [17]. Subsequent work demonstrated that a single CNN may be trained end-to-end to serve both stages of an object detection system [54, 16]. The resulting two-stage systems, however, suffered from relatively poor computational performance, as the second stage necessitated performing inference on all candidate locations, leading to trade-offs between thoroughly sampling the scene for candidate locations and predictive performance for localizing objects [26].
References
  • [1] Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. TensorFlow: A system for large-scale machine learning. In 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI), 2016.
  • [2] Evan Ackerman. Lidar that will make self-driving cars affordable [news]. IEEE Spectrum, 53(10):14, 2016.
  • [3] Holger Caesar, Varun Bankiti, Alex H. Lang, Sourabh Vora, Venice Erin Liong, Qiang Xu, Anush Krishnan, Yu Pan, Giancarlo Baldan, and Oscar Beijbom. nuScenes: A multimodal dataset for autonomous driving. arXiv preprint arXiv:1903.11027, 2019.
  • [4] Yuning Chai. Patchwork: A patch-wise attention network for efficient object detection and segmentation in video streams. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
  • [5] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, and Alan L. Yuille. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40(4):834–848, 2017.
  • [6] Chung-Cheng Chiu, Tara N. Sainath, Yonghui Wu, Rohit Prabhavalkar, Patrick Nguyen, Zhifeng Chen, Anjuli Kannan, Ron J. Weiss, Kanishka Rao, Ekaterina Gonina, et al. State-of-the-art speech recognition with sequence-to-sequence models. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4774–4778. IEEE, 2018.
  • [7] Hyunggi Cho, Young-Woo Seo, B. V. K. Vijaya Kumar, and Ragunathan Raj Rajkumar. A multi-sensor fusion system for moving object detection and tracking in urban driving environments. In 2014 IEEE International Conference on Robotics and Automation (ICRA), pages 1836–1843. IEEE, 2014.
  • [8] Dan Claudiu Ciresan, Ueli Meier, Luca Maria Gambardella, and Jürgen Schmidhuber. Deep big simple neural nets excel on handwritten digit recognition. CoRR, abs/1003.0358, 2010.
  • [9] Adam Coates, Honglak Lee, and Andrew Y. Ng. An analysis of single-layer networks in unsupervised feature learning. In AISTATS, 2011.
  • [10] Thomas Dean, Mark A. Ruzon, Mark Segal, Jonathon Shlens, Sudheendra Vijayanarasimhan, and Jay Yagnik. Fast, accurate detection of 100,000 object classes on a single machine. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1814–1821, 2013.
  • [11] Alexey Dosovitskiy, Philipp Fischer, Eddy Ilg, Philip Häusser, et al. FlowNet: Learning optical flow with convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, 2015.
  • [12] Mark Everingham, Luc Van Gool, Christopher K. I. Williams, John Winn, and Andrew Zisserman. The PASCAL visual object classes (VOC) challenge. International Journal of Computer Vision, 88(2):303–338, 2010.
  • [13] Christoph Feichtenhofer, Axel Pinz, and Andrew Zisserman. Detect to track and track to detect. In Proceedings of the IEEE International Conference on Computer Vision, pages 3038–3046, 2017.
  • [14] Pedro F. Felzenszwalb, Ross B. Girshick, David McAllester, and Deva Ramanan. Object detection with discriminatively trained part-based models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 32(9):1627–1645, 2010.
  • [15] Andreas Geiger, Philip Lenz, Christoph Stiller, and Raquel Urtasun. Vision meets robotics: The KITTI dataset. The International Journal of Robotics Research, 32(11):1231–1237, 2013.
  • [16] Ross Girshick. Fast R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 1440–1448, 2015.
  • [17] Ross Girshick, Jeff Donahue, Trevor Darrell, and Jitendra Malik. Rich feature hierarchies for accurate object detection and semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 580–587, 2014.
  • [18] Xavier Glorot and Yoshua Bengio. Understanding the difficulty of training deep feedforward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics, pages 249–256, 2010.
  • [19] Benjamin Graham and Laurens van der Maaten. Submanifold sparse convolutional networks. CoRR, abs/1706.01307, 2017.
  • [20] Alex Graves. Sequence transduction with recurrent neural networks. arXiv preprint arXiv:1211.3711, 2012.
  • [21] Kaiming He, Georgia Gkioxari, Piotr Dollár, and Ross Girshick. Mask R-CNN. In Proceedings of the IEEE International Conference on Computer Vision, pages 2961–2969, 2017.
  • [22] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on ImageNet classification. In Proceedings of the IEEE International Conference on Computer Vision, pages 1026–1034, 2015.
  • [23] Jeff Hecht. Lidar for self-driving cars. Optics and Photonics News, 29(1):26–33, 2018.
  • [24] João F. Henriques and Andrea Vedaldi. MapNet: An allocentric spatial memory for mapping environments. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
  • [25] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
  • [26] Jonathan Huang, Vivek Rathod, Chen Sun, Menglong Zhu, Anoop Korattikara, Alireza Fathi, Ian Fischer, Zbigniew Wojna, Yang Song, Sergio Guadarrama, et al. Speed/accuracy trade-offs for modern convolutional object detectors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7310–7311, 2017.
  • [27] Eddy Ilg, Nikolaus Mayer, Tonmoy Saikia, Margret Keuper, Alexey Dosovitskiy, and Thomas Brox. FlowNet 2.0: Evolution of optical flow estimation with deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2462–2470, 2017.
  • [28] Navdeep Jaitly, David Sussillo, Quoc V. Le, Oriol Vinyals, Ilya Sutskever, and Samy Bengio. A neural transducer. arXiv preprint arXiv:1511.04868, 2015.
  • [29] Hyun Ho Jeon and Yun-Ho Ko. Lidar data interpolation algorithm for visual odometry based on 3D-2D motion estimation. In 2018 International Conference on Electronics, Information, and Communication (ICEIC), pages 1–2. IEEE, 2018.
  • [30] Junsung Kim, Hyoseung Kim, Karthik Lakshmanan, and Ragunathan Raj Rajkumar. Parallel scheduling for cyber-physical systems: Analysis and case study on a self-driving car. In Proceedings of the ACM/IEEE 4th International Conference on Cyber-Physical Systems, pages 31–40. ACM, 2013.
  • [31] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [32] Alex Krizhevsky and Geoffrey Hinton. Learning multiple layers of features from tiny images. Technical report, University of Toronto, 2009.
  • [33] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, 2012.
  • [34] Alex H. Lang, Sourabh Vora, Holger Caesar, Lubing Zhou, Jiong Yang, and Oscar Beijbom. PointPillars: Fast encoders for object detection from point clouds. arXiv preprint arXiv:1812.05784, 2018.
  • [35] Hei Law and Jia Deng. CornerNet: Detecting objects as paired keypoints. In Proceedings of the European Conference on Computer Vision (ECCV), pages 734–750, 2018.
  • [36] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015.
  • [37] Kai Li Lim, Thomas Drage, and Thomas Bräunl. Implementation of semantic segmentation for road and lane detection on an autonomous ground vehicle with lidar. In 2017 IEEE International Conference on Multisensor Fusion and Integration for Intelligent Systems (MFI), pages 429–434. IEEE, 2017.
  • [38] Tsung-Yi Lin, Piotr Dollár, Ross Girshick, Kaiming He, Bharath Hariharan, and Serge Belongie. Feature pyramid networks for object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
  • [39] Tsung-Yi Lin, Priya Goyal, Ross Girshick, Kaiming He, and Piotr Dollár. Focal loss for dense object detection. In Proceedings of the IEEE International Conference on Computer Vision, pages 2980–2988, 2017.
  • [40] Tsung-Yi Lin, Michael Maire, Serge Belongie, James Hays, Pietro Perona, Deva Ramanan, Piotr Dollár, and C. Lawrence Zitnick. Microsoft COCO: Common objects in context. In European Conference on Computer Vision, pages 740–755. Springer, 2014.
  • [41] Philipp Lindner, Eric Richter, Gerd Wanielik, Kiyokazu Takagi, and Akira Isogai. Multi-channel lidar processing for lane detection and estimation. In 2009 12th International IEEE Conference on Intelligent Transportation Systems, pages 1–6. IEEE, 2009.
  • [42] Mason Liu and Menglong Zhu. Mobile video object detection with temporally-aware feature maps. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5686–5695, 2018.
  • [43] Mason Liu, Menglong Zhu, Marie White, Yinxiao Li, and Dmitry Kalenichenko. Looking fast and slow: Memory-guided mobile video object detection. arXiv preprint arXiv:1903.10172, 2019.
  • [44] Wei Liu, Dragomir Anguelov, Dumitru Erhan, Christian Szegedy, Scott Reed, Cheng-Yang Fu, and Alexander C. Berg. SSD: Single shot multibox detector. In European Conference on Computer Vision, pages 21–37. Springer, 2016.
  • [45] Wenjie Luo, Bin Yang, and Raquel Urtasun. Fast and furious: Real time end-to-end 3D detection, tracking and motion forecasting with a single convolutional net. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3569–3577, 2018.
  • [46] Lane McIntosh, Niru Maheswaranathan, David Sussillo, and Jonathon Shlens. Recurrent segmentation for variable computational budgets. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pages 1648–1657, 2018.
  • [47] Gregory P. Meyer, Ankit Laddha, Eric Kee, Carlos Vallespi-Gonzalez, and Carl K. Wellington. LaserNet: An efficient probabilistic 3D object detector for autonomous driving. arXiv preprint arXiv:1903.08701, 2019.
  • [48] Jiquan Ngiam, Benjamin Caine, Wei Han, Brandon Yang, Yuning Chai, Pei Sun, Yin Zhou, Xi Yi, Ouais Alsharif, Patrick Nguyen, et al. StarNet: Targeted computation for object detection in point clouds. arXiv preprint arXiv:1908.11069, 2019.
  • [49] Pedro Pinheiro and Ronan Collobert. Recurrent convolutional neural networks for scene labeling. In Proceedings of the 31st International Conference on Machine Learning, volume 32 of Proceedings of Machine Learning Research, pages 82–90. PMLR, 2014.
  • [50] Charles R. Qi, Wei Liu, Chenxia Wu, Hao Su, and Leonidas J. Guibas. Frustum PointNets for 3D object detection from RGB-D data. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 918–927, 2018.
  • [51] Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017.
  • [52] Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J. Guibas. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In Advances in Neural Information Processing Systems, pages 5099–5108, 2017.
  • [53] Joseph Redmon, Santosh Divvala, Ross Girshick, and Ali Farhadi. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 779–788, 2016.
  • [54] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Advances in Neural Information Processing Systems, pages 91–99, 2015.
  • [55] Pierre Sermanet, David Eigen, Xiang Zhang, Michael Mathieu, Rob Fergus, and Yann LeCun. OverFeat: Integrated recognition, localization and detection using convolutional networks. arXiv preprint arXiv:1312.6229, 2013.
  • [56] Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, et al. Lingvo: A modular and scalable framework for sequence-to-sequence modeling. arXiv preprint arXiv:1902.08295, 2019.
  • [57] Shaoshuai Shi, Xiaogang Wang, and Hongsheng Li. PointRCNN: 3D object proposal generation and detection from point cloud. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 770–779, 2019.
  • [58] Pei Sun, Henrik Kretzschmar, Xerxes Dotiwalla, Aurelien Chouard, Vijaysai Patnaik, Paul Tsui, James Guo, Yin Zhou, Yuning Chai, Benjamin Caine, et al. Scalability in perception for autonomous driving: Waymo Open Dataset. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020.
  • [59] Sebastian Thrun, Mike Montemerlo, Hendrik Dahlkamp, David Stavens, Andrei Aron, James Diebel, Philip Fong, John Gale, Morgan Halpenny, Gabriel Hoffmann, et al. Stanley: The robot that won the DARPA Grand Challenge. Journal of Field Robotics, 23(9):661–692, 2006.
  • [60] Jasper R. R. Uijlings, Koen E. A. Van De Sande, Theo Gevers, and Arnold W. M. Smeulders. Selective search for object recognition. International Journal of Computer Vision, 104(2):154–171, 2013.
  • [61] Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V. Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
  • [63] Yan Yan, Yuxing Mao, and Bo Li. SECOND: Sparsely embedded convolutional detection. Sensors, 18(10):3337, 2018.
  • [64] Bin Yang, Ming Liang, and Raquel Urtasun. HDNET: Exploiting HD maps for 3D object detection. In Conference on Robot Learning, pages 146–155, 2018.
  • [65] Bin Yang, Wenjie Luo, and Raquel Urtasun. PIXOR: Real-time 3D object detection from point clouds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 7652–7660, 2018.
  • [66] Zetong Yang, Yanan Sun, Shu Liu, Xiaoyong Shen, and Jiaya Jia. IPOD: Intensive point-based object detector for point cloud. arXiv preprint arXiv:1812.05276, 2018.
  • [67] Ji Zhang and Sanjiv Singh. Visual-lidar odometry and mapping: Low-drift, robust, and fast. In 2015 IEEE International Conference on Robotics and Automation (ICRA), pages 2174–2181. IEEE, 2015.
  • [68] Yin Zhou, Pei Sun, Yu Zhang, Dragomir Anguelov, Jiyang Gao, Tom Ouyang, James Guo, Jiquan Ngiam, and Vijay Vasudevan. End-to-end multi-view fusion for 3D object detection in lidar point clouds. In Conference on Robot Learning (CoRL), 2019.
  • [69] Yin Zhou and Oncel Tuzel. VoxelNet: End-to-end learning for point cloud based 3D object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4490–4499, 2018.