3D Shape Reconstruction from Vision and Touch

NeurIPS 2020.

TL;DR: We study this problem and present an effective chart-based approach to fusing vision and touch, which leverages advances in graph convolutional networks.

Abstract:

When a toddler is presented with a new toy, their instinctual behaviour is to pick it up and inspect it with their hand and eyes in tandem, clearly searching over its surface to properly understand what they are playing with. Here, touch provides high-fidelity localized information while vision provides complementary global context. However, […]
Introduction
  • From an early age, children clearly (and often loudly) demonstrate that they need to both look at and touch any new object that has piqued their interest.
  • Touch provides localized 3D shape information, including the point of contact in space as well as high spatial resolution of the shape, but fails quickly when extrapolating without global context or strong priors.
  • Combining both modalities should lead to richer information and better models for 3D understanding.
  • An overview of 3D shape reconstruction from vision and touch is displayed in Figure 1.
Highlights
  • From an early age, children clearly (and often loudly) demonstrate that they need to both look at and touch any new object that has piqued their interest.
  • Inspired by the papier-mâché technique of [20] and leveraging recent advances in graph convolutional networks (GCNs) [37], we aim to represent a 3D object with a collection of disjoint mesh surface elements, which we call charts, where some charts are reserved for tactile signals and others are used to represent visual information (see the GCN sketch after this list).
  • We demonstrate the intuitive property that learning from touch alone degrades performance, as the reconstruction lacks global context, while learning from vision alone suffers from occlusions and yields lower local reconstruction accuracy.
  • Our main contributions can be summarized as: (1) we introduce a chart-based approach to 3D object reconstruction, leveraging GCNs to combine visual and haptic signals; (2) we build a dataset of simulated haptic object interactions to benchmark 3D shape reconstruction algorithms in this setting; and (3) through an extensive evaluation, we highlight the benefits of the proposed approach, which effectively exploits the complementarity of both modalities.
  • We compare our approach to three other modality fusion strategies on the validation set: (1) Sphere-based, where the chart-based initialization is replaced with a sphere-based one, and the sphere vertices contain a concatenation of projected vision features and touch features extracted from a simple CNN; (2) a chart-based ablation without the hard copying of local touch charts into the prediction; and (3) a chart-based ablation without the sharing of local chart information in the GCN, where touch charts are only copied into the final prediction.
  • The benefit of fusing vision and touch is further emphasized by the ability of our model to gracefully extrapolate around touch sites, and by the improved reconstruction accuracy when providing an increasing number of grasps, which suggests that active sensing of visual and touch signals is a promising avenue for improving 3D shape reconstruction.
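To make the chart-based deformation concrete, below is a minimal PyTorch sketch of GCN layers predicting per-vertex offsets for chart vertices. It is an illustration under simplifying assumptions (a dense row-normalized adjacency, made-up feature sizes), not the authors' released implementation.

```python
# Minimal sketch: GCN layers deform chart vertices by predicting offsets.
# Assumptions: dense row-normalized adjacency with self-loops; arbitrary
# feature sizes. Illustrative only, not the paper's released code.
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (V, in_dim) per-vertex features; adj: (V, V) normalized adjacency.
        return torch.relu(self.linear(adj @ x))

class ChartDeformer(nn.Module):
    def __init__(self, feat_dim, hidden_dim=128):
        super().__init__()
        self.gcn1 = GCNLayer(3 + feat_dim, hidden_dim)
        self.gcn2 = GCNLayer(hidden_dim, hidden_dim)
        self.offset = nn.Linear(hidden_dim, 3)  # per-vertex 3D displacement

    def forward(self, verts, feats, adj):
        h = self.gcn1(torch.cat([verts, feats], dim=-1), adj)
        h = self.gcn2(h, adj)
        return verts + self.offset(h)  # deformed chart vertices

# Toy usage: one chart of 200 vertices with 64-dim features (values made up).
verts, feats = torch.randn(200, 3), torch.randn(200, 64)
adj = torch.eye(200)  # placeholder adjacency; real charts use mesh edges
deformed = ChartDeformer(feat_dim=64)(verts, feats, adj)
```

In the paper's setting, vision charts would carry projected image features and touch charts tactile features, with touch-chart vertices copied back after each refinement step.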
Results
  • The authors describe the experiments designed to validate the approach to 3D reconstruction that leverages both visual and haptic sensory information.
  • The authors compare the approach to three other modality fusion strategies on the validation set: (1) Sphere-based, where the chart-based initialization is replaced with a sphere-based one, and the sphere vertices contain a concatenation of projected vision features and touch features extracted from a simple CNN; (2) a chart-based ablation without the hard copying of local touch charts into the prediction; and (3) a chart-based ablation without the sharing of local chart information in the GCN, where touch charts are only copied into the final prediction (see the fusion sketch after this list).
  • For all comparisons, the authors consider both the occluded and unoccluded vision signals.
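To illustrate the "concatenation of projected vision features and touch features" used by the sphere-based baseline, the sketch below projects vertices through an assumed pinhole camera and bilinearly samples an image feature map; the camera model, tensor shapes, and function names are hypothetical simplifications, not the paper's exact pipeline.

```python
# Sketch of per-vertex vision/touch feature fusion (hypothetical names and
# camera model; the paper's exact projection and feature extractor differ).
import torch
import torch.nn.functional as F

def fuse_vertex_features(verts, vision_feats, touch_feats, focal=1.0):
    # verts: (V, 3) in camera coordinates with z > 0.
    # vision_feats: (1, C, H, W) feature map from an image CNN.
    # touch_feats: (V, T) touch features (e.g., zeros away from touch sites).
    uv = focal * verts[:, :2] / verts[:, 2:3]      # pinhole projection
    grid = uv.view(1, 1, -1, 2).clamp(-1.0, 1.0)   # grid_sample expects [-1, 1]
    sampled = F.grid_sample(vision_feats, grid, align_corners=True)
    sampled = sampled.view(vision_feats.shape[1], -1).t()  # (V, C)
    return torch.cat([sampled, touch_feats], dim=-1)       # (V, C + T)
```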
Conclusion
  • The authors explored the problem of 3D shape reconstruction from vision and touch.
  • The authors' results consistently highlight the benefit of combining both modalities to improve upon single-modality baselines, and show the potential of using a chart-based approach to combine vision and touch signals in a principled way.
  • The benefit of fusing vision and touch is further emphasized by the ability of the model to gracefully extrapolate around touch sites, and by the improved reconstruction accuracy when providing an increasing number of grasps, which suggests that active sensing of visual and touch signals is a promising avenue for improving 3D shape reconstruction.
Summary
  • Introduction:

    From an early age, children clearly (and often loudly) demonstrate that they need to both look at and touch any new object that has piqued their interest.
  • Touch provides localized 3D shape information, including the point of contact in space as well as high spatial resolution of the shape, but fails quickly when extrapolating without global context or strong priors.
  • Combining both modalities should lead to richer information and better models for 3D understanding.
  • An overview of 3D shape reconstruction from vision and touch is displayed in Figure 1.
  • Objectives:

    Inspired by the papier-mâché technique of [20] and leveraging recent advances in graph convolutional networks (GCN) [37], the authors aim to represent a 3D object with a collection of disjoint mesh surface elements, which the authors call charts, where some charts are reserved for tactile signals and others are used to represent visual information.
  • Results:

    The authors describe the experiments designed to validate the approach to 3D reconstruction that leverages both visual and haptic sensory information.
  • The authors compare the approach to three other modality fusion strategies on the validation set: (1) Sphere-based, where the chart-based initialization is replaced with a sphere-based one, and the sphere vertices contain a concatenation of projected vision features and touch features extracted from a simple CNN; (2) a chart-based ablation without the hard copying of local touch charts into the prediction; and (3) a chart-based ablation without the sharing of local chart information in the GCN, where touch charts are only copied into the final prediction.
  • For all comparisons, the authors consider both the occluded and unoccluded vision signals.
  • Conclusion:

    The authors explored the problem of 3D shape reconstruction from vision and touch.
  • The authors' results consistently highlight the benefit of combining both modalities to improve upon single-modality baselines, and show the potential of using a chart-based approach to combine vision and touch signals in a principled way.
  • The benefit of fusing vision and touch is further emphasized by the ability of the model to gracefully extrapolate around touch sites, and by the improved reconstruction accuracy when providing an increasing number of grasps, which suggests that active sensing of visual and touch signals is a promising avenue for improving 3D shape reconstruction.
Tables
  • Table 1: Model selection. We report the per-class Chamfer distance for the validation set together with the average value (a minimal Chamfer sketch follows this list). Note that O stands for occluded and U for unoccluded.
  • Table 2: Test set results for 3D reconstruction tasks.
  • Table 3: Chamfer distance per class for different input modalities: combinations of touch readings and an occluded or unoccluded vision signal.
  • Table 4: Per-class dataset statistics: the number of objects, grasps, and touches, and the percentage of successful touches in each class.
  • Table 5: Local Chamfer distance in increasingly large square rings around each touch site.
  • Table 6: Chamfer distance when increasing the number of grasps provided to the models.
  • Table 7: Single-image 3D shape reconstruction results on the ShapeNet dataset. This evaluation is performed using the evaluation standard from [18] and [65].
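The reconstruction tables report Chamfer distance. As a reference point, here is one common symmetric formulation between two sampled point clouds; whether distances are squared and how densely surfaces are sampled vary between papers, so treat this as a sketch rather than the exact evaluation code.

```python
# Symmetric Chamfer distance between two point clouds (one common variant).
import torch

def chamfer_distance(p1, p2):
    # p1: (N, 3) and p2: (M, 3) points sampled from the two surfaces.
    d = torch.cdist(p1, p2)  # (N, M) pairwise Euclidean distances
    return d.min(dim=1).values.mean() + d.min(dim=0).values.mean()
```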
Related work
[Figure: overview of the model. Vision and touch signals are encoded into image features; local charts are inferred from touch readings; vision charts and touch charts are iteratively refined via chart deformation, with touch charts replaced at each step, to produce the global prediction.]
Voxel-based representations [52, 61, 22] have long dominated the deep learning-based 3D reconstruction literature. However, recent advances in graph neural networks [6, 13, 37, 64, 21] have enabled the effective processing, and increasing use, of surface meshes [34, 65, 32, 29, 24, 57, 8] and hybrid representations [19, 18]. While more complex to encode, mesh-based representations benefit greatly from their arbitrary resolution compared with more naive representations. Our chosen representation most closely relates to that of [19], which combines deformed sheets of points to form 3D shapes; however, unlike [19], our proposed approach exploits the neighborhood connectivity of meshes. Finally, 3D reconstruction has also been posed as a shape completion problem [58, 72], where the input is a partial point cloud obtained from depth information and the prediction is its completed version.
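Since the approach, unlike [19], exploits the neighborhood connectivity of meshes, the following small sketch (a hypothetical helper, using dense tensors for clarity) shows how triangle faces induce the normalized adjacency a GCN consumes.

```python
# Build a row-normalized adjacency (with self-loops) from triangle faces.
# Dense tensors for clarity; practical pipelines use sparse operations.
import torch

def adjacency_from_faces(num_verts, faces):
    # faces: (F, 3) long tensor of vertex indices per triangle.
    adj = torch.eye(num_verts)
    for a, b in ((0, 1), (1, 2), (2, 0)):  # the three edges of each triangle
        adj[faces[:, a], faces[:, b]] = 1.0
        adj[faces[:, b], faces[:, a]] = 1.0
    return adj / adj.sum(dim=1, keepdim=True)  # row-normalize
```

An adjacency like this is what the GCN sketch in the Highlights section would take as its adj argument.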
Funding
  • We would like to acknowledge the NSERC Canadian Robotics Network, the Natural Sciences and Engineering Research Council, and the Fonds de recherche du Québec – Nature et Technologies for their funding support, as granted to the McGill University authors.
References
  • [1] Peter Allen. Surface descriptions from vision and touch. In IEEE International Conference on Robotics and Automation (ICRA), volume 1, pages 394–397, 1984.
  • [2] P. N. Belhumeur, D. J. Kriegman, and A. L. Yuille. The bas-relief ambiguity. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1060–1066, 1997.
  • [3] A. Bierbaum, I. Gubarev, and R. Dillmann. Robust shape recovery for sparse contact location and normal data from haptic exploration. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 3200–3205, 2008.
  • [4] Alexander Bierbaum, Matthias Rambow, Tamim Asfour, and Rüdiger Dillmann. A potential field approach to dexterous tactile exploration of unknown objects. In IEEE-RAS International Conference on Humanoid Robots (Humanoids), pages 360–366, 2008.
  • [5] Mårten Björkman, Yasemin Bekiroglu, Virgile Högman, and Danica Kragic. Enhancing visual perception of shape through tactile glances. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2013.
  • [6] Joan Bruna, Wojciech Zaremba, Arthur Szlam, and Yann LeCun. Spectral networks and locally connected networks on graphs. In International Conference on Learning Representations (ICLR), 2014.
  • [7] Angel X. Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012, 2015.
  • [8] Wenzheng Chen, Huan Ling, Jun Gao, Edward Smith, Jaakko Lehtinen, Alec Jacobson, and Sanja Fidler. Learning to predict 3D objects with an interpolation-based differentiable renderer. In Advances in Neural Information Processing Systems (NeurIPS), pages 9609–9619, 2019.
  • [9] Christopher B. Choy, Danfei Xu, JunYoung Gwak, Kevin Chen, and Silvio Savarese. 3D-R2N2: A unified approach for single and multi-view 3D object reconstruction. In Proceedings of the European Conference on Computer Vision (ECCV), pages 628–644, 2016.
  • [10] Blender Online Community. Blender - a 3D modelling and rendering package. Blender Foundation, Amsterdam, 2018.
  • [11] Erwin Coumans and Yunfei Bai. PyBullet, a Python module for physics simulation for games, robotics and machine learning. GitHub repository, 2016.
  • [12] A. Dame, V. A. Prisacariu, C. Y. Ren, and I. Reid. Dense reconstruction using 3D object shape priors. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 1288–1295, 2013.
  • [13] Michaël Defferrard, Xavier Bresson, and Pierre Vandergheynst. Convolutional neural networks on graphs with fast localized spectral filtering. In Advances in Neural Information Processing Systems (NIPS), pages 3844–3852, 2016.
  • [14] Danny Driess, Peter Englert, and Marc Toussaint. Active learning with query paths for tactile object shape exploration. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2017.
  • [15] Haoqiang Fan, Hao Su, and Leonidas Guibas. A point set generation network for 3D object reconstruction from a single image. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [16] D. F. Fouhey, A. Gupta, and M. Hebert. Data-driven 3D primitives for single image understanding. In IEEE International Conference on Computer Vision (ICCV), pages 3392–3399, 2013.
  • [17] Gabriela Zarzar Gandler, Carl Henrik Ek, Mårten Björkman, Rustam Stolkin, and Yasemin Bekiroglu. Object shape estimation and modeling, based on sparse Gaussian process implicit surfaces, combining visual data and tactile exploration. Robotics and Autonomous Systems, 126:103433, 2020.
  • [18] Georgia Gkioxari, Jitendra Malik, and Justin Johnson. Mesh R-CNN. In IEEE International Conference on Computer Vision (ICCV), 2019.
  • [19] Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan C. Russell, and Mathieu Aubry. 3D-CODED: 3D correspondences by deep deformation. In Proceedings of the European Conference on Computer Vision (ECCV), pages 230–246, 2018.
  • [20] Thibault Groueix, Matthew Fisher, Vladimir G. Kim, Bryan C. Russell, and Mathieu Aubry. A papier-mâché approach to learning 3D surface generation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 216–224, 2018.
  • [21] Will Hamilton, Zhitao Ying, and Jure Leskovec. Inductive representation learning on large graphs. In Advances in Neural Information Processing Systems (NeurIPS), pages 1024–1034, 2017.
  • [22] Christian Häne, Shubham Tulsiani, and Jitendra Malik. Hierarchical surface prediction for 3D object reconstruction. arXiv preprint arXiv:1704.00710, 2017.
  • [23] John C. Hart. Sphere tracing: A geometric method for the antialiased ray tracing of implicit surfaces. The Visual Computer, 12(10):527–545, 1996.
  • [24] Paul Henderson and Vittorio Ferrari. Learning to generate and reconstruct 3D meshes with only 2D supervision. arXiv preprint arXiv:1807.09259, 2018.
  • [25] D. Hoiem, A. A. Efros, and M. Hebert. Geometric context from a single image. In IEEE International Conference on Computer Vision (ICCV), volume 1, pages 654–661, 2005.
  • [26] C. Häne, N. Savinov, and M. Pollefeys. Class specific 3D object shape priors using surface normals. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 652–659, 2014.
  • [27] Jarmo Ilonen, Jeannette Bohg, and Ville Kyrki. Three-dimensional object reconstruction of symmetric objects by fusing visual and tactile sensing. The International Journal of Robotics Research, 33(2):321–341, 2014.
  • [28] Eldar Insafutdinov and Alexey Dosovitskiy. Unsupervised learning of shape and pose with differentiable point clouds. arXiv preprint arXiv:1810.09381, 2018.
  • [29] Dominic Jack, Jhony K. Pontes, Sridha Sridharan, Clinton Fookes, Sareh Shirazi, Frederic Maire, and Anders Eriksson. Learning free-form deformations for 3D object reconstruction. arXiv preprint arXiv:1803.10932, 2018.
  • [30] N. Jamali, C. Ciliberto, L. Rosasco, and L. Natale. Active perception: Building objects' models using tactile exploration. In IEEE-RAS International Conference on Humanoid Robots (Humanoids), pages 179–185, 2016.
  • [31] Krishna Murthy Jatavallabhula, Edward Smith, Jean-Francois Lafleche, Clement Fuji Tsang, Artem Rozantsev, Wenzheng Chen, and Tommy Xiang. Kaolin: A PyTorch library for accelerating 3D deep learning research. arXiv preprint arXiv:1911.05063, 2019.
  • [32] Angjoo Kanazawa, Shubham Tulsiani, Alexei A. Efros, and Jitendra Malik. Learning category-specific mesh reconstruction from image collections. arXiv preprint arXiv:1803.07549, 2018.
  • [33] Abhishek Kar, Christian Häne, and Jitendra Malik. Learning a multi-view stereo machine. In Advances in Neural Information Processing Systems (NeurIPS), pages 365–376, 2017.
  • [34] Hiroharu Kato, Yoshitaka Ushiku, and Tatsuya Harada. Neural 3D mesh renderer. arXiv preprint arXiv:1711.07566, 2017.
  • [35] Alex Kendall, Hayk Martirosyan, Saumitro Dasgupta, and Peter Henry. End-to-end learning of geometry and context for deep stereo regression. In IEEE International Conference on Computer Vision (ICCV), pages 66–75, 2017.
  • [36] Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [37] Thomas N. Kipf and Max Welling. Semi-supervised classification with graph convolutional networks. In International Conference on Learning Representations (ICLR), 2017.
  • [38] Jan Koenderink, Andrea van Doorn, and Astrid Kappers. Surface perception in pictures. Perception & Psychophysics, 52:487–496, 1992.
  • [39] Mike Lambeta, Po-Wei Chou, Stephen Tian, Brian Yang, Benjamin Maloon, Victoria Rose Most, Dave Stroud, Raymond Santos, Ahmad Byagowi, Gregg Kammerer, Dinesh Jayaraman, and Roberto Calandra. DIGIT: A novel design for a low-cost compact high-resolution tactile sensor with application to in-hand manipulation. IEEE Robotics and Automation Letters (RA-L), 5(3):3838–3845, 2020.
  • [40] M. A. Lee, Y. Zhu, K. Srinivasan, P. Shah, S. Savarese, L. Fei-Fei, A. Garg, and J. Bohg. Making sense of vision and touch: Self-supervised learning of multimodal representations for contact-rich tasks. In IEEE International Conference on Robotics and Automation (ICRA), pages 8943–8950, 2019.
  • [41] Jae Hyun Lim, Pedro O. Pinheiro, Negar Rostamzadeh, Chris Pal, and Sungjin Ahn. Neural multisensory scene inference. In Advances in Neural Information Processing Systems (NeurIPS), pages 8994–9004, 2019.
  • [42] Shan Luo, Joao Bimbo, Ravinder Dahiya, and Hongbin Liu. Robotic tactile perception of object properties: A review. arXiv preprint arXiv:1711.03810, 2017.
  • [43] Uriel Martinez-Hernandez, Tony Dodd, Lorenzo Natale, Giorgio Metta, Tony Prescott, and Nathan Lepora. Active contour following to explore object shape with robot touch. In World Haptics Conference (WHC), 2013.
  • [44] Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. Occupancy networks: Learning 3D reconstruction in function space. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2019.
  • [45] J. Krishna Murthy, G. V. Sai Krishna, Falak Chhaya, and K. Madhava Krishna. Reconstructing vehicles from a single image: Shape priors for road scene understanding. In IEEE International Conference on Robotics and Automation (ICRA), pages 724–731, 2017.
  • [46] David Novotny, Diane Larlus, and Andrea Vedaldi. Learning 3D object categories by looking around them. In IEEE International Conference on Computer Vision (ICCV), pages 5228–5237, 2017.
  • [47] Simon Ottenhaus, Martin Miller, David Schiebener, Nikolaus Vahrenkamp, and Tamim Asfour. Local implicit surface estimation for haptic exploration. In IEEE-RAS International Conference on Humanoid Robots (Humanoids), pages 850–856, 2016.
  • [48] Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. PyTorch: An imperative style, high-performance deep learning library. In Advances in Neural Information Processing Systems (NeurIPS), pages 8024–8035, 2019.
  • [49] Z. Pezzementi, C. Reyda, and G. D. Hager. Object mapping, recognition, and localization from tactile geometry. In IEEE International Conference on Robotics and Automation (ICRA), pages 5942–5948, 2011.
  • [50] Bui Tuong Phong. Illumination for computer generated pictures. Communications of the ACM, 18(6):311–317, 1975.
  • [51] Charles R. Qi, Hao Su, Kaichun Mo, and Leonidas J. Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2017.
  • [52] Gernot Riegler, Ali Osman Ulusoy, and Andreas Geiger. OctNet: Learning deep 3D representations at high resolutions. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 6620–6629, 2017.
  • [53] J. Rock, T. Gupta, J. Thorsen, J. Gwak, D. Shin, and D. Hoiem. Completing 3D object shape from one depth image. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2484–2493, 2015.
  • [54] Olaf Ronneberger, Philipp Fischer, and Thomas Brox. U-Net: Convolutional networks for biomedical image segmentation. In International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), pages 234–241, 2015.
  • [55] SimLab. Allegro Hand overview, 2016. [Online; accessed 25-May-2020].
  • [56] Edward J. Smith, Scott Fujimoto, and David Meger. Multi-view silhouette and depth decomposition for high resolution 3D object representation. In Advances in Neural Information Processing Systems (NeurIPS), pages 6479–6489, 2018.
  • [57] Edward J. Smith, Scott Fujimoto, Adriana Romero, and David Meger. GEOMetrics: Exploiting geometric structure for graph-encoded objects. In Proceedings of the 36th International Conference on Machine Learning (ICML), pages 5866–5876, 2019.
  • [58] Edward J. Smith and David Meger. Improved adversarial systems for 3D object generation and reconstruction. In Conference on Robot Learning (CoRL), pages 87–96, 2017.
  • [59] Nicolas Sommer, Miao Li, and Aude Billard. Bimanual compliant tactile exploration for grasping unknown objects. In IEEE International Conference on Robotics and Automation (ICRA), pages 6400–6407, 2014.
  • [60] Xingyuan Sun, Jiajun Wu, Xiuming Zhang, Zhoutong Zhang, Chengkai Zhang, Tianfan Xue, Joshua B. Tenenbaum, and William T. Freeman. Pix3D: Dataset and methods for single-image 3D shape modeling. arXiv preprint arXiv:1804.04610, 2018.
  • [61] Maxim Tatarchenko, Alexey Dosovitskiy, and Thomas Brox. Octree generating networks: Efficient convolutional architectures for high-resolution 3D outputs. In IEEE International Conference on Computer Vision (ICCV), pages 2088–2096, 2017.
  • [62] Shubham Tulsiani, Tinghui Zhou, Alexei A. Efros, and Jitendra Malik. Multi-view supervision for single-view reconstruction via differentiable ray consistency. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 209–217, 2017.
  • [63] Dmitry Ulyanov, Andrea Vedaldi, and Victor Lempitsky. Instance normalization: The missing ingredient for fast stylization. arXiv preprint arXiv:1607.08022, 2016.
  • [64] Petar Velickovic, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Liò, and Yoshua Bengio. Graph attention networks. In International Conference on Learning Representations (ICLR), 2018.
  • [65] Nanyang Wang, Yinda Zhang, Zhuwen Li, Yanwei Fu, Wei Liu, and Yu-Gang Jiang. Pixel2Mesh: Generating 3D mesh models from single RGB images. arXiv preprint arXiv:1804.01654, 2018.
  • [66] S. Wang, J. Wu, X. Sun, W. Yuan, W. T. Freeman, J. B. Tenenbaum, and E. H. Adelson. 3D shape perception from monocular vision, touch, and shape priors. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 1606–1613, 2018.
  • [67] David Watkins-Valls, Jacob Varley, and Peter Allen. Multi-modal geometric learning for grasping and manipulation. In IEEE International Conference on Robotics and Automation (ICRA), pages 7339–7345, 2019.
  • [68] Jiajun Wu, Yifan Wang, Tianfan Xue, Xingyuan Sun, William T. Freeman, and Joshua B. Tenenbaum. MarrNet: 3D shape reconstruction via 2.5D sketches. In Advances in Neural Information Processing Systems (NeurIPS), 2017.
  • [69] Jiajun Wu, Chengkai Zhang, Tianfan Xue, William T. Freeman, and Joshua B. Tenenbaum. Learning a probabilistic latent space of object shapes via 3D generative-adversarial modeling. In Advances in Neural Information Processing Systems (NeurIPS), pages 82–90, 2016.
  • [70] Jiajun Wu, Chengkai Zhang, Xiuming Zhang, Zhoutong Zhang, William T. Freeman, and Joshua B. Tenenbaum. Learning shape priors for single-view 3D completion and reconstruction. In Proceedings of the European Conference on Computer Vision (ECCV), pages 673–691, 2018.
  • [71] Zhengkun Yi, Roberto Calandra, Filipe Fernandes Veiga, Herke van Hoof, Tucker Hermans, Yilei Zhang, and Jan Peters. Active tactile object exploration with Gaussian processes. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4925–4930, 2016.
  • [72] Wentao Yuan, Tejas Khot, David Held, Christoph Mertz, and Martial Hebert. PCN: Point completion network. In International Conference on 3D Vision (3DV), pages 728–737, 2018.