Local Spectral Graph Convolution For Point Set Feature Learning
COMPUTER VISION - ECCV 2018, PT IV, (2018): 56-71
Abstract
Feature learning on point clouds has shown great promise, with the introduction of effective and generalizable deep learning frameworks such as pointnet++. Thus far, however, point features have been abstracted in an independent and isolated manner, ignoring the relative layout of neighboring points as well as their features. In the prese...
Introduction
- With the present availability of registered depth and appearance images of complex real-world scenes, there is tremendous interest in feature processing algorithms for classic computer vision problems, including object detection, classification and segmentation.
- In their latest incarnation, for example, depth sensors are found in the Apple iPhone X camera, making a whole new range of computer vision technology available to the common user.
- In this approach a network structure is designed to work directly with point cloud data, while
Highlights
- The processing of 3D point clouds from such sensors remains challenging, since the sensed depth points can vary in spatial density, can be incomplete due to occlusion or perspective effects, and can suffer from sensor noise.
- Max pooling does not allow for the preservation of information from disjoint sets of points within the neighborhood, such as the legs of the ant in the example in Fig. 1. To address this limitation, we introduce a recursive spectral clustering and pooling module that yields an improved set activation function for the k nearest neighbors (k-NN), as discussed in Section 4.
- The use of spectral graph convolution on local point neighborhoods, followed by recursive cluster pooling on the derived representations, holds great promise for feature learning from unorganized point sets
- Our method’s ability to capture local structural information and geometric cues from such data presents an advance in deep learning approaches to feature abstraction for applications in computer vision
- The approach is not limited in application to point sets derived from cameras
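The contrast between plain max pooling and cluster-based pooling over a k-NN neighborhood can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Gaussian edge weights, the `sigma` parameter, the helper names, and the choice to max pool inside each cluster and then average across clusters are all assumptions made for illustration. The only idea taken from the text is that the neighborhood is split spectrally (here by the sign of the Fiedler vector) and pooled recursively.

```python
import numpy as np

def gaussian_laplacian(points, sigma=1.0):
    """Unnormalized Laplacian L = D - W of a fully connected neighborhood graph.

    Gaussian edge weights are an assumption of this sketch.
    """
    sq_dist = ((points[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
    weights = np.exp(-sq_dist / (2.0 * sigma ** 2))
    np.fill_diagonal(weights, 0.0)
    return np.diag(weights.sum(axis=1)) - weights

def fiedler_split(points):
    """Boolean mask that bipartitions the points by the sign of the Fiedler vector."""
    _, eigvecs = np.linalg.eigh(gaussian_laplacian(points))  # eigenvalues ascending
    return eigvecs[:, 1] >= 0.0

def cluster_pool(points, features, depth=1):
    """Hypothetical recursive pooling: max inside clusters, mean across clusters."""
    if depth == 0 or len(points) < 2:
        return features.max(axis=0)
    mask = fiedler_split(points)
    if mask.all() or not mask.any():  # degenerate split, fall back to max pooling
        return features.max(axis=0)
    pooled = [cluster_pool(points[m], features[m], depth - 1) for m in (mask, ~mask)]
    return np.mean(pooled, axis=0)

# Toy neighborhood with two spatially disjoint groups of points.
points = np.array([[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, 0.1, 0.0],
                   [3.0, 0.0, 0.0], [3.1, 0.0, 0.0], [3.0, 0.1, 0.0]])
features = np.array([[1.0, 0.0], [0.5, 0.0], [0.2, 0.0],
                     [0.0, 3.0], [0.0, 1.0], [0.0, 2.0]])

max_pooled = features.max(axis=0)                     # one maximum per channel
cluster_pooled = cluster_pool(points, features, depth=1)
```

In this toy case the plain max keeps one value per channel regardless of which group produced it, while the cluster-based result is computed from one pooled vector per group, so both groups contribute to the output.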
Methods
- 2048 xyz points and their surface normals are used as input features and the network structure follows that of the 2k configurations in Table 1.
- ScanNet dataset: ScanNet is a large-scale semantic segmentation dataset constructed from real-world 3D scans of indoor scenes, and as such is more challenging than the synthesized 3D models in ShapeNet. Following [1][16], the authors remove RGB information in the experiments in Table 6 and use semantic voxel label prediction accuracy for evaluation.
- The 4l-pointnet++ model is used for pointnet++ and the 4l-spec-cp model is used for the authors' method.
Conclusion
- The use of spectral graph convolution on local point neighborhoods, followed by recursive cluster pooling on the derived representations, holds great promise for feature learning from unorganized point sets.
- The method’s ability to capture local structural information and geometric cues from such data represents an advance in deep learning approaches to feature abstraction for applications in computer vision.
- The approach is not limited in application to point sets derived from cameras.
- It can be applied in settings where the vertices carry a more abstract interpretation, such as nodes in a graph representing a social network, where local feature attributes could play an important role
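Because spectral graph convolution only needs a weighted graph and per-vertex features, the same operation applies whether the vertices are 3D points or nodes of an abstract network. The Python sketch below shows generic spectral filtering under stated assumptions: the unnormalized Laplacian, the function names, and the fixed heat-kernel response `exp(-2 * lambda)` are illustrative choices, whereas a trained network would learn its filter. It should be read as a sketch of the general technique, not as the paper's layer.

```python
import numpy as np

def graph_laplacian(weights):
    """Unnormalized graph Laplacian L = D - W for a symmetric weight matrix W."""
    weights = np.asarray(weights, dtype=float)
    return np.diag(weights.sum(axis=1)) - weights

def spectral_filter(features, weights, response):
    """Filter per-vertex features in the eigenbasis of the graph Laplacian.

    `response` maps the eigenvalues to per-frequency gains; it is a fixed
    function here, standing in for a learned filter.
    """
    eigvals, eigvecs = np.linalg.eigh(graph_laplacian(weights))
    spectral = eigvecs.T @ features                  # graph Fourier transform
    filtered = response(eigvals)[:, None] * spectral
    return eigvecs @ filtered                        # inverse transform

# Toy 4-vertex path graph with 2 feature channels per vertex.
weights = np.array([[0, 1, 0, 0],
                    [1, 0, 1, 0],
                    [0, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
features = np.array([[1.0, 0.0], [0.0, 1.0], [2.0, 0.0], [0.0, 3.0]])

unchanged = spectral_filter(features, weights, lambda lam: np.ones_like(lam))
smooth = spectral_filter(features, weights, lambda lam: np.exp(-2.0 * lam))
```

An all-ones response returns the input features, and the low-pass response smooths each channel across neighboring vertices while preserving its sum over the graph, since the constant eigenvector has eigenvalue zero.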
Tables
- Table 1: Network architectures for the 1k experiments (top) and the 2k experiments
- Table 2: Model Ablation Study on ModelNet40 (classification) and ShapeNet (segmentation). Acc stands for classification accuracy, 1k/2k refers to the number of points used and “+N” indicates the addition of surface normal features to xyz spatial coordinates.
- Table 3: McGill Shape Benchmark classification results. We report the instance and category level accuracy on both the entire database and on subsets (see Table 1 for network structures).
- Table 4: MNIST classification results. To obtain the pointnet++ results we reproduced the experiments discussed in [1].
- Table 5: ModelNet40 results. “Acc” stands for 1k experiments with only xyz points as input features. “Acc + N” stands for 2k experiments with xyz points along with their surface normals as input features. “graph-cp” stands for recursive cluster pooling.
- Table 6: Segmentation Results. We compare our method with the state-of-the-art approaches, as well as with the results from pointnet++, which we have been able to reproduce experimentally. For ShapeNet, mIOU stands for mean intersection over union on points, and for ScanNet, Acc stands for voxel label prediction accuracy.
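The two segmentation metrics named in the Table 6 caption can be sketched in a few lines of Python. This is a simplified illustration, not the benchmarks' official evaluation code: the function names, the rule of skipping classes absent from both prediction and ground truth, the `voxel_size` parameter, and the majority vote used to assign one label per voxel are all assumptions.

```python
from collections import Counter, defaultdict

def mean_iou(pred, truth, num_classes):
    """Mean intersection over union on points, averaged over classes present."""
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, truth) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, truth) if p == c or t == c)
        if union:  # assumption: skip classes absent from both label sets
            ious.append(inter / union)
    return sum(ious) / len(ious)

def voxel_accuracy(points, pred, truth, voxel_size):
    """Fraction of occupied voxels whose majority predicted label is correct."""
    votes_pred = defaultdict(Counter)
    votes_true = defaultdict(Counter)
    for (x, y, z), p, t in zip(points, pred, truth):
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        votes_pred[key][p] += 1
        votes_true[key][t] += 1
    correct = sum(
        1 for key in votes_true
        if votes_pred[key].most_common(1)[0][0] == votes_true[key].most_common(1)[0][0]
    )
    return correct / len(votes_true)

# Toy point labels for the mIoU example.
miou = mean_iou(pred=[0, 0, 1, 1, 2, 2], truth=[0, 1, 1, 1, 2, 0], num_classes=3)

# Toy labelled points falling into three unit voxels.
pts = [(0.1, 0.1, 0.1), (0.2, 0.2, 0.2), (0.3, 0.1, 0.1),
       (1.5, 0.1, 0.1), (2.5, 0.5, 0.5)]
acc = voxel_accuracy(pts, pred=[0, 0, 1, 1, 2], truth=[0, 0, 1, 1, 0], voxel_size=1.0)
```

In the toy data the per-class IoUs are 1/3, 2/3 and 1/2, and two of the three occupied voxels receive the correct majority label.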
Funding
- We are also grateful to the Natural Sciences and Engineering Research Council of Canada for research funding
References
- Qi, C.R., Yi, L., Su, H., Guibas, L.J.: Pointnet++: Deep hierarchical feature learning on point sets in a metric space. NIPS (2017)
- Qi, C.R., Su, H., Mo, K., Guibas, L.J.: Pointnet: Deep learning on point sets for 3d classification and segmentation. CVPR (2016)
- Defferrard, M., Bresson, X., Vandergheynst, P.: Convolutional neural networks on graphs with fast localized spectral filtering. NIPS (2016)
- Kipf, T.N., Welling, M.: Semi-supervised classification with graph convolutional networks. ICLR (2017)
- Shuman, D.I., Narang, S.K., Frossard, P., Ortega, A., Vandergheynst, P.: The emerging field of signal processing on graphs: Extending high-dimensional data analysis to networks and other irregular domains. IEEE Signal Processing Magazine 30(3) (2013) 83–98
- Wang, C., Pelillo, M., Siddiqi, K.: Dominant set clustering and pooling for multi-view 3d object recognition. BMVC (2017)
- Lombaert, H., Grady, L., Cheriet, F.: Focusr: feature oriented correspondence using spectral regularization–a method for precise surface matching. IEEE Transactions on Pattern Analysis and Machine Intelligence (2013)
- Bronstein, A.M., Bronstein, M.M., Kimmel, R.: Numerical Geometry of Non-rigid Shapes. Monographs in Computer Science. Springer (2008)
- Chung, F.R.: Spectral Graph Theory. Number 92 in Regional Conference Series in Mathematics. American Mathematical Soc. (1997)
- Shi, J., Malik, J.: Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(8) (2000) 888–905
- Monti, F., Boscaini, D., Masci, J., Rodola, E., Svoboda, J., Bronstein, M.M.: Geometric deep learning on graphs and manifolds using mixture model cnns. CVPR (2017)
- Wu, Z., Song, S., Khosla, A., Yu, F., Zhang, L., Tang, X., Xiao, J.: 3D shapenets: A deep representation for volumetric shapes. CVPR (2015)
- Siddiqi, K., Zhang, J., Macrini, D., Shokoufandeh, A., Bouix, S., Dickinson, S.: Retrieving articulated 3d models using medial surfaces. Machine Vision and Applications 19(4) (2008) 261–274
- Yi, L., Kim, V.G., Ceylan, D., Shen, I., Yan, M., Su, H., Lu, A., Huang, Q., Sheffer, A., Guibas, L., et al.: A scalable active framework for region annotation in 3d shape collections. ACM Transactions on Graphics (TOG) (2016)
- Yi, L., Su, H., Guo, X., Guibas, L.: Syncspeccnn: Synchronized spectral cnn for 3d shape segmentation. CVPR (2017)
- Dai, A., Chang, A.X., Savva, M., Halber, M., Funkhouser, T., Nießner, M.: ScanNet: Richly-annotated 3d reconstructions of indoor scenes. CVPR (2017)
- Simard, P.Y., Steinkraus, D., Platt, J.C., et al.: Best practices for convolutional neural networks applied to visual document analysis. ICDAR (2003)
- LeCun, Y., Bottou, L., Bengio, Y., Haffner, P.: Gradient-based learning applied to document recognition. Proceedings of the IEEE 86(11) (1998) 2278–2324
- Lin, M., Chen, Q., Yan, S.: Network in network. ICLR (2014)
- Qi, C.R., Su, H., Niessner, M., Dai, A., Yan, M., Guibas, L.J.: Volumetric and multi-view cnns for object classification on 3d data. CVPR (2016)
- Su, H., Maji, S., Kalogerakis, E., Learned-Miller, E.: Multi-view convolutional neural networks for 3D shape recognition. ICCV (2015)