
RSKDD-Net: Random Sample-based Keypoint Detector and Descriptor

NeurIPS 2020


Abstract

Keypoint detectors and descriptors are two main components of point cloud registration. Previous learning-based keypoint detectors rely on saliency estimation for each point or farthest point sampling (FPS) for candidate point selection, which are inefficient and not applicable in large-scale scenes. This paper proposes a Random Sample-based…
Introduction
  • Point cloud registration is an important problem in 3D computer vision, which aims to estimate the optimal rigid transformation between two point clouds. 3D keypoint detection and description are two fundamental components of point cloud registration.
  • With the rapid development of deep learning, many works have explored learning-based methods for 3D descriptors in point clouds [12, 13, 14, 15].
  • Only a few works explore deep learning-based methods for 3D keypoint detection, due to the lack of ground-truth datasets for keypoint detectors [16].
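The efficiency gap between farthest point sampling and random sampling noted above can be illustrated with a minimal NumPy sketch (hypothetical code, not the authors' implementation): iterative FPS needs O(N·M) distance evaluations to pick M candidates from N points, while random sampling is O(M) and independent of N.

```python
import numpy as np

def farthest_point_sample(points, m):
    """Iterative FPS: O(N*m) distance evaluations."""
    n = points.shape[0]
    selected = [np.random.randint(n)]
    dists = np.full(n, np.inf)
    for _ in range(m - 1):
        # distance of every point to the most recently selected point
        d = np.linalg.norm(points - points[selected[-1]], axis=1)
        dists = np.minimum(dists, d)
        # pick the point farthest from the current selected set
        selected.append(int(np.argmax(dists)))
    return points[selected]

def random_sample(points, m):
    """Uniform random sampling: O(m), independent of point count."""
    idx = np.random.choice(points.shape[0], m, replace=False)
    return points[idx]
```

Both return an (m, 3) array; only the random variant scales to the 16384-point inputs discussed in the Results section.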
Highlights
  • Point cloud registration is an important problem in 3D computer vision, which aims to estimate the optimal rigid transformation between two point clouds. 3D keypoint detection and description are two fundamental components of point cloud registration.
  • Based on the above observation, we propose a network named Random Sample-based Keypoint Detector and Descriptor Network (RSKDD-Net), which jointly generates keypoints and the corresponding descriptors for large-scale point clouds efficiently.
  • If a matched keypoint pair lies within a distance threshold ρ after applying the ground-truth transformation, the correspondence is considered valid, and precision is defined as the ratio of valid correspondences.
  • This paper proposes a learning-based method to jointly detect keypoints and generate descriptors in large-scale point clouds.
  • To overcome the drawbacks of random sampling, we propose a novel random dilation cluster strategy to enlarge the receptive field and an attention mechanism for position and feature aggregation.
  • The results show that our approach achieves state-of-the-art performance with much lower computation time.
  • We propose a soft assignment-based matching loss so that the descriptor network can be trained in a weakly supervised manner.
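The precision metric in the highlights above can be sketched as follows (a hedged reconstruction, not the authors' evaluation code; the threshold name `rho` and the brute-force descriptor matching are my assumptions): source keypoints are matched to target keypoints by nearest descriptor distance, the ground-truth transform is applied, and precision is the fraction of matches closer than the threshold.

```python
import numpy as np

def matching_precision(src_kp, tgt_kp, src_desc, tgt_desc, T, rho=0.5):
    """Fraction of descriptor matches whose keypoints, after applying
    the ground-truth 4x4 transform T, lie within distance rho."""
    # brute-force nearest neighbor in descriptor space
    d = np.linalg.norm(src_desc[:, None, :] - tgt_desc[None, :, :], axis=2)
    nn = np.argmin(d, axis=1)
    # map source keypoints into the target frame with homogeneous coords
    src_h = np.hstack([src_kp, np.ones((src_kp.shape[0], 1))])
    src_t = (T @ src_h.T).T[:, :3]
    valid = np.linalg.norm(src_t - tgt_kp[nn], axis=1) < rho
    return valid.mean()
```

For identical clouds, identical descriptors, and an identity transform, every match is valid and the precision is 1.0.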
Methods
  • The results show that the random dilation cluster significantly improves the repeatability and precision of the network.
  • The random dilation cluster performs similarly to, and in some scenes even slightly better than, DPC.
  • The introduction of the attentive feature map results in an obvious increase in precision.
  • The precision for different numbers of keypoints increases by about 0.1 with the attentive feature map.
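The random dilation cluster discussed above can be sketched in NumPy (my reading of the summary, not the paper's definition; the cluster size `k` and dilation ratio `alpha` are hypothetical parameters): instead of a plain kNN cluster, keep k neighbors drawn at random from the α·k nearest neighbors, so the receptive field grows without enlarging the cluster.

```python
import numpy as np

def random_dilation_cluster(points, center, k=16, alpha=2):
    """Pick k neighbors at random from the alpha*k nearest neighbors of
    `center`, enlarging the receptive field versus plain kNN (sketch)."""
    d = np.linalg.norm(points - center, axis=1)
    nearest = np.argsort(d)[: alpha * k]               # dilated candidate set
    keep = np.random.choice(alpha * k, k, replace=False)
    return points[nearest[keep]]
```

Compared with deterministic dilated kNN (keeping every α-th neighbor), the random selection also acts as a mild data augmentation during training, which may explain the repeatability gains reported above.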
Results
  • Evaluation metric

    The authors follow the same evaluation metrics as in 3DFeat-Net [17] and USIP [16] for the keypoint detector and descriptor, namely Repeatability, Precision and Registration performance.

    Repeatability was introduced in USIP to evaluate the stability of detected keypoints.
  • Note that the computation times of 3DFeat-Net and USIP increase mainly with the number of input points and the number of keypoints, respectively.
  • The authors' method is more than 30× faster than USIP and 3DFeat-Net at detecting 512 keypoints from 16384 input points.
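Repeatability, as summarized from USIP, can be sketched as follows (a hedged reconstruction, not the authors' evaluation script; the threshold name `eps` is my assumption): a source keypoint counts as repeatable if, after the ground-truth transform, some detected keypoint in the other cloud lies within the threshold.

```python
import numpy as np

def repeatability(kp_src, kp_tgt, T, eps=0.5):
    """Fraction of source keypoints that, transformed by the 4x4
    ground-truth matrix T, have a target keypoint within eps (sketch)."""
    src_h = np.hstack([kp_src, np.ones((kp_src.shape[0], 1))])
    src_t = (T @ src_h.T).T[:, :3]
    # distance from each transformed source keypoint to all target keypoints
    d = np.linalg.norm(src_t[:, None, :] - kp_tgt[None, :, :], axis=2)
    return (d.min(axis=1) < eps).mean()
```

Unlike the precision metric, repeatability ignores descriptors entirely; it measures only whether the detector fires at the same physical locations in both clouds.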
Conclusion
  • This paper proposes a learning-based method to jointly detect keypoints and generate descriptors in large-scale point clouds.
  • The proposed RSKDD-Net achieves state-of-the-art performance with much faster inference speed.
  • The authors propose a soft assignment-based matching loss so that the descriptor network can be trained in a weakly supervised manner.
  • Extensive experiments demonstrate that RSKDD-Net outperforms existing methods by a significant margin in repeatability, precision and registration performance.
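A soft assignment-based matching loss of the kind described could look like the following sketch (hypothetical: the softmax temperature `temp` and the exact formulation are my assumptions, not the paper's equations). Each source keypoint is softly assigned to all target keypoints by descriptor similarity, and the loss penalizes the distance between the soft-matched position and the position predicted by the ground-truth transform, requiring only the relative pose as supervision.

```python
import numpy as np

def soft_matching_loss(src_kp, tgt_kp, src_desc, tgt_desc, T, temp=0.1):
    """Weakly supervised matching loss (sketch): soft-assign source
    keypoints to target keypoints via descriptor distances, then penalize
    the gap between the soft-matched position and the ground-truth
    transformed position."""
    d = np.linalg.norm(src_desc[:, None, :] - tgt_desc[None, :, :], axis=2)
    w = np.exp(-d / temp)
    w /= w.sum(axis=1, keepdims=True)      # soft assignment weights (rows sum to 1)
    matched = w @ tgt_kp                   # expected matched position per source keypoint
    src_h = np.hstack([src_kp, np.ones((src_kp.shape[0], 1))])
    src_t = (T @ src_h.T).T[:, :3]
    return np.mean(np.linalg.norm(matched - src_t, axis=1))
```

Because the soft assignment is differentiable, gradients flow back into the descriptor network even though no point-level correspondence labels exist, which is what makes the weakly supervised training possible.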
Tables
  • Table1: Computation time (ms)
  • Table2: Registration performance on KITTI dataset and Ford dataset
Related work
  • Existing approaches of keypoint detector and descriptor for point cloud can be categorized into handcrafted and learning-based approaches.

    Handcrafted approaches The current handcrafted 3D keypoint detectors and descriptors are mainly inspired by numerous handcrafted methods in 2D images. SIFT-3D [5] and Harris-3D [6] are 3D extensions of widely used 2D detectors SIFT [2] and Harris [3]. Intrinsic Shape Signatures (ISS) [4] selects points where the neighbor points in a ball region have large variations along each principal axis. For the description of keypoints, researchers have also developed several 3D descriptors based on the geometric features of points, like Point Feature Histograms (PFH) [7], Fast Point Feature Histograms (FPFH) [8] and Signature of Histograms of Orientations (SHOT) [9]. A comprehensive introduction of handcrafted 3D detectors and descriptors can be found in [20, 21].
Funding
  • Acknowledgments and Disclosure of Funding: This work is funded by the National Natural Science Foundation of China (No. 61906138), the European Union’s Horizon 2020 Framework Programme for Research and Innovation under Specific Grant Agreement No. 945539 (Human Brain Project SGA3), and the Shanghai AI Innovation Development Program 2018.
Study subjects and analysis
Testing pairs: over 100,000

For testing, the authors use the current point cloud together with the five consecutive frames before and after it as test data, yielding over 100,000 testing pairs in the KITTI and Ford datasets. Sequence 08 is dropped because of large errors in the ground-truth vehicle poses of that sequence.

References
  • [1] Ethan Rublee, Vincent Rabaud, Kurt Konolige, and Gary Bradski. ORB: An efficient alternative to SIFT or SURF. In 2011 International Conference on Computer Vision, pages 2564–2571. IEEE, 2011.
  • [2] David G. Lowe. Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision, 60(2):91–110, 2004.
  • [3] Christopher G. Harris, Mike Stephens, et al. A combined corner and edge detector. In Alvey Vision Conference, volume 15, pages 10–5244, 1988.
  • [4] Yu Zhong. Intrinsic shape signatures: A shape descriptor for 3D object recognition. In 2009 IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), pages 689–696. IEEE, 2009.
  • [5] Alex Flint, Anthony Dick, and Anton Van Den Hengel. Thrift: Local 3D structure recognition. In 9th Biennial Conference of the Australian Pattern Recognition Society on Digital Image Computing Techniques and Applications (DICTA 2007), pages 182–188. IEEE, 2007.
  • [6] Ivan Sipiran and Benjamin Bustos. Harris 3D: A robust extension of the Harris operator for interest point detection on 3D meshes. The Visual Computer, 27(11):963, 2011.
  • [7] Radu Bogdan Rusu, Nico Blodow, Zoltan Csaba Marton, and Michael Beetz. Aligning point cloud views using persistent feature histograms. In 2008 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 3384–3391. IEEE, 2008.
  • [8] Radu Bogdan Rusu, Nico Blodow, and Michael Beetz. Fast point feature histograms (FPFH) for 3D registration. In 2009 IEEE International Conference on Robotics and Automation, pages 3212–3217. IEEE, 2009.
  • [9] Federico Tombari, Samuele Salti, and Luigi Di Stefano. Unique signatures of histograms for local surface description. In European Conference on Computer Vision, pages 356–369. Springer, 2010.
  • [10] Andrew E. Johnson and Martial Hebert. Using spin images for efficient object recognition in cluttered 3D scenes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 21(5):433–449, 1999.
  • [11] Bastian Steder, Radu Bogdan Rusu, Kurt Konolige, and Wolfram Burgard. NARF: 3D range image features for object recognition. In Workshop on Defining and Solving Realistic Perception Problems in Personal Robotics at the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), volume 44, 2010.
  • [12] Andy Zeng, Shuran Song, Matthias Nießner, Matthew Fisher, Jianxiong Xiao, and Thomas Funkhouser. 3DMatch: Learning local geometric descriptors from RGB-D reconstructions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1802–1811, 2017.
  • [13] Haowen Deng, Tolga Birdal, and Slobodan Ilic. PPF-FoldNet: Unsupervised learning of rotation invariant 3D local descriptors. In Proceedings of the European Conference on Computer Vision (ECCV), pages 602–618, 2018.
  • [14] Haowen Deng, Tolga Birdal, and Slobodan Ilic. PPFNet: Global context aware local features for robust 3D point matching. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 195–205, 2018.
  • [15] Marc Khoury, Qian-Yi Zhou, and Vladlen Koltun. Learning compact geometric features. In Proceedings of the IEEE International Conference on Computer Vision, pages 153–161, 2017.
  • [16] Jiaxin Li and Gim Hee Lee. USIP: Unsupervised stable interest point detection from 3D point clouds. In Proceedings of the IEEE International Conference on Computer Vision, pages 361–370, 2019.
  • [17] Zi Jian Yew and Gim Hee Lee. 3DFeat-Net: Weakly supervised local 3D features for point cloud registration. In European Conference on Computer Vision, pages 630–646. Springer, 2018.
  • [18] Qingyong Hu, Bo Yang, Linhai Xie, Stefano Rosa, Yulan Guo, Zhihua Wang, Niki Trigoni, and Andrew Markham. RandLA-Net: Efficient semantic segmentation of large-scale point clouds. arXiv preprint arXiv:1911.11236, 2019.
  • [19] Fisher Yu and Vladlen Koltun. Multi-scale context aggregation by dilated convolutions. arXiv preprint arXiv:1511.07122, 2015.
  • [20] Ronny Hänsch, Thomas Weber, and Olaf Hellwich. Comparison of 3D interest point detectors and descriptors for point cloud fusion. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 2(3):57, 2014.
  • [21] Yulan Guo, Mohammed Bennamoun, Ferdous Sohel, Min Lu, Jianwei Wan, and Ngai Ming Kwok. A comprehensive performance evaluation of 3D local feature descriptors. International Journal of Computer Vision, 116(1):66–89, 2016.
  • [22] Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 652–660, 2017.
  • [23] Charles Ruizhongtai Qi, Li Yi, Hao Su, and Leonidas J. Guibas. PointNet++: Deep hierarchical feature learning on point sets in a metric space. In Advances in Neural Information Processing Systems, pages 5099–5108, 2017.
  • [24] Loic Landrieu and Martin Simonovsky. Large-scale point cloud semantic segmentation with superpoint graphs. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4558–4567, 2018.
  • [25] Jiaxin Li, Ben M. Chen, and Gim Hee Lee. SO-Net: Self-organizing network for point cloud analysis. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9397–9406, 2018.
  • [26] Binh-Son Hua, Minh-Khoi Tran, and Sai-Kit Yeung. Pointwise convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 984–993, 2018.
  • [27] Hang Su, Varun Jampani, Deqing Sun, Subhransu Maji, Evangelos Kalogerakis, Ming-Hsuan Yang, and Jan Kautz. SPLATNet: Sparse lattice networks for point cloud processing. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 2530–2539, 2018.
  • [28] Wenxuan Wu, Zhongang Qi, and Li Fuxin. PointConv: Deep convolutional networks on 3D point clouds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 9621–9630, 2019.
  • [29] Weixin Lu, Guowei Wan, Yao Zhou, Xiangyu Fu, Pengfei Yuan, and Shiyu Song. DeepVCP: An end-to-end deep neural network for point cloud registration. In Proceedings of the IEEE International Conference on Computer Vision, pages 12–21, 2019.
  • [30] Guohao Li, Matthias Muller, Ali Thabet, and Bernard Ghanem. DeepGCNs: Can GCNs go as deep as CNNs? In Proceedings of the IEEE International Conference on Computer Vision, pages 9267–9276, 2019.
  • [31] Francis Engelmann, Theodora Kontogianni, and Bastian Leibe. Dilated point convolutions: On the receptive field of point convolutions. arXiv preprint arXiv:1907.12046, 2019.
  • [32] Anshul Paigwar, Ozgur Erkent, Christian Wolf, and Christian Laugier. Attentional PointNet for 3D-object detection in point clouds. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2019.
  • [33] Wenxiao Zhang and Chunxia Xiao. PCAN: 3D attention map learning using contextual information for point cloud based retrieval. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 12436–12445, 2019.
  • [34] Lei Wang, Yuchun Huang, Yaolin Hou, Shenman Zhang, and Jie Shan. Graph attention convolution for point cloud semantic segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 10296–10305, 2019.
  • [35] Xu Wang, Jingming He, and Lin Ma. Exploiting local and global structure for point cloud semantic segmentation with contextual point representations. In Advances in Neural Information Processing Systems, pages 4573–4583, 2019.
  • [36] Jiancheng Yang, Qiang Zhang, Bingbing Ni, Linguo Li, Jinxian Liu, Mengdie Zhou, and Qi Tian. Modeling point clouds with self-attention and Gumbel subset sampling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 3323–3332, 2019.
  • [37] Andreas Geiger, Philip Lenz, and Raquel Urtasun. Are we ready for autonomous driving? The KITTI vision benchmark suite. In Conference on Computer Vision and Pattern Recognition (CVPR), 2012.
  • [38] Gaurav Pandey, James R. McBride, and Ryan M. Eustice. Ford campus vision and lidar data set. The International Journal of Robotics Research, 30(13):1543–1552, 2011.
  • [39] Radu Bogdan Rusu and Steve Cousins. 3D is here: Point Cloud Library (PCL). In IEEE International Conference on Robotics and Automation (ICRA), Shanghai, China, May 9–13, 2011.
  • [40] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, et al. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems, pages 8024–8035, 2019.
Author
Fan Lu
Zhongnan Qu