Occupancy Anticipation for Efficient Exploration and Navigation

Santhosh K. Ramakrishnan

European Conference on Computer Vision (ECCV), pp. 400–418, 2020.


Abstract:

State-of-the-art navigation methods leverage a spatial memory to generalize to new environments, but their occupancy maps are limited to capturing the geometric structures directly observed by the agent. We propose occupancy anticipation, where the agent uses its egocentric RGB-D observations to infer the occupancy state beyond the visible regions.

Introduction
  • An agent must move intelligently through a 3D environment in order to reach a goal.
  • One of the key factors for success in navigation has been the movement towards complex map-based architectures [20,46,11,10] that capture both geometry [20,11,10] and semantics [20,46,19,24], thereby facilitating efficient policy learning and planning
  • These learned maps allow an agent to exploit prior knowledge from training scenes when navigating in novel test environments
Highlights
  • In visual navigation, an agent must move intelligently through a 3D environment in order to reach a goal
  • We have presented our occupancy anticipation approach, assuming a single RGB-D observation as input (a minimal model sketch follows this list)
  • Having defined the core occupancy anticipation components, we demonstrate how our model can be used to benefit embodied navigation in 3D environments
  • We introduced the idea of occupancy anticipation from egocentric views in 3D environments
  • We demonstrate our idea both for individual local maps and integrated within sequential models for exploration and navigation, where the agent continually refines its map of the world
  • Not only does occupancy anticipation successfully map the environment, it also allows the agent to move to a specified goal more quickly by modeling the navigable spaces. This apples-to-apples comparison shows that our idea improves the state of the art for PointNav
  • Our results clearly demonstrate the advantages on multiple datasets, including improvements to the state-of-the-art embodied AI model for exploration and navigation
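
To make the anticipation step concrete, the following is a minimal PyTorch sketch of an encoder-decoder that turns a single egocentric RGB-D frame into a local top-down occupancy estimate covering regions beyond what is directly visible. It is an illustrative stand-in under stated assumptions, not the authors' exact architecture (the paper builds on a U-Net-style model [54]); the class name, channel layout, and map resolution below are assumptions.

    import torch
    import torch.nn as nn

    class OccupancyAnticipator(nn.Module):
        """Illustrative encoder-decoder: one 128x128 RGB-D frame -> a 64x64 local map
        with 2 channels: channel 0 = P(cell occupied), channel 1 = P(cell explored)."""
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(  # 4 input channels: RGB + depth
                nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(inplace=True),
            )
            self.decoder = nn.Sequential(  # upsample back to map resolution
                nn.Upsample(scale_factor=2), nn.Conv2d(128, 64, 3, padding=1), nn.ReLU(inplace=True),
                nn.Upsample(scale_factor=2), nn.Conv2d(64, 32, 3, padding=1), nn.ReLU(inplace=True),
                nn.Conv2d(32, 2, 3, padding=1),
            )

        def forward(self, rgb, depth):
            x = torch.cat([rgb, depth], dim=1)       # (B, 4, 128, 128)
            logits = self.decoder(self.encoder(x))   # (B, 2, 64, 64)
            return torch.sigmoid(logits)             # per-cell probabilities

    # One 128x128 RGB-D observation -> one anticipated 64x64 local map.
    rgb = torch.rand(1, 3, 128, 128)
    depth = torch.rand(1, 1, 128, 128)
    local_map = OccupancyAnticipator()(rgb, depth)   # shape (1, 2, 64, 64)

In the full system, such anticipated local maps are registered into a global map and consumed by the exploration and navigation policies.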
Methods
  • Table 1 reports IoU % and F1 score % for the free, occupied, and mean classes, comparing all-free, all-occupied, ANS(rgb), ANS(depth), View-extrap., OccAnt(rgb), OccAnt(depth), and OccAnt(rgbd); see the metric sketch after this list.
  • The authors' models, OccAnt(·), substantially improve the map quality and extent, showing the advantage of learning to anticipate 3D structures beyond those directly observed.
  • The authors use the Adam optimizer and train on episodes of length 1000 for 1.5–2 million frames of experience.
  • – ANS(rgb) [10]: This is the state-of-the-art Active Neural SLAM approach for exploration and navigation.
  • The authors use the original mapper architecture [10], which infers the visible occupancy from RGB.
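
As a rough guide to how the per-class IoU and F1 numbers summarized above (Table 1) can be computed for a predicted versus ground-truth occupancy grid, here is a small sketch; the {0: free, 1: occupied} label encoding and the random demo maps are assumptions, not the paper's exact evaluation code.

    import numpy as np

    def iou_and_f1(pred, gt, cls):
        """Per-class IoU and F1 between predicted and ground-truth label maps
        (assumed encoding: 0 = free, 1 = occupied)."""
        p, g = (pred == cls), (gt == cls)
        inter = np.logical_and(p, g).sum()
        union = np.logical_or(p, g).sum()
        iou = inter / union if union > 0 else 1.0
        prec = inter / p.sum() if p.sum() > 0 else 1.0
        rec = inter / g.sum() if g.sum() > 0 else 1.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) > 0 else 0.0
        return iou, f1

    # Report free, occupied, and their mean, as in Table 1 (random maps for demo).
    pred = np.random.randint(0, 2, (64, 64))
    gt = np.random.randint(0, 2, (64, 64))
    per_class = {c: iou_and_f1(pred, gt, c) for c in (0, 1)}
    mean_iou = sum(v[0] for v in per_class.values()) / 2
    mean_f1 = sum(v[1] for v in per_class.values()) / 2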
Results
  • By exploiting context in both the egocentric views and top-down maps, the model successfully anticipates a broader map of the environment, with performance significantly better than strong baselines.
  • Not only does occupancy anticipation successfully map the environment, it also allows the agent to move to a specified goal more quickly by modeling the navigable spaces (a toy planning sketch follows this list).
  • This apples-to-apples comparison shows that the idea improves the state of the art for PointNav
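
To illustrate why anticipating navigable space can shorten point-goal routes, the toy search below plans over a binary occupancy grid. The actual system plans with standard analytical planners over the predicted map (e.g., A* [22] or fast marching [63]); this breadth-first search is only a stand-in for illustration.

    from collections import deque

    def shortest_path_length(occ, start, goal):
        """BFS over a binary occupancy grid (0 = free, 1 = occupied).
        Returns the number of grid steps from start to goal, or None if unreachable."""
        h, w = len(occ), len(occ[0])
        dist = {start: 0}
        q = deque([start])
        while q:
            r, c = q.popleft()
            if (r, c) == goal:
                return dist[(r, c)]
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if 0 <= nr < h and 0 <= nc < w and occ[nr][nc] == 0 and (nr, nc) not in dist:
                    dist[(nr, nc)] = dist[(r, c)] + 1
                    q.append((nr, nc))
        return None

    # With anticipated occupancy, cells beyond the directly observed region can
    # already be marked free, so a path to the goal is often available earlier.
    grid = [[0, 0, 0, 1],
            [1, 1, 0, 1],
            [0, 0, 0, 0]]
    print(shortest_path_length(grid, (0, 0), (2, 3)))  # -> 5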
Conclusion
  • The authors introduced the idea of occupancy anticipation from egocentric views in 3D environments.
  • By learning to anticipate the navigable areas beyond the agent’s actual field of view, the authors obtain more accurate maps more efficiently in novel environments.
  • The authors demonstrate the idea both for individual local maps and integrated within sequential models for exploration and navigation, where the agent continually refines its map of the world (a registration sketch follows this list).
  • The authors' results clearly demonstrate the advantages on multiple datasets, including improvements to the state-of-the-art embodied AI model for exploration and navigation.
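
For the sequential setting mentioned above, here is a minimal sketch of registering an egocentric local map into an allocentric global map using the agent pose (x, y, θ) from odometry. The bottom-center agent placement, 5 cm cells, and element-wise max aggregation are illustrative assumptions, not necessarily the authors' exact registration rule.

    import numpy as np

    def register_local_map(global_map, local_map, pose, cell_size=0.05):
        """Paste a V x V egocentric local map into the global map at pose (x, y, theta)."""
        V = local_map.shape[0]
        cos_t, sin_t = np.cos(pose[2]), np.sin(pose[2])
        for i in range(V):
            for j in range(V):
                # Local cell (i, j) relative to the agent (agent at bottom-center).
                dx = (j - V // 2) * cell_size
                dy = (V - 1 - i) * cell_size
                # Rotate into the world frame and translate by the agent position.
                wx = pose[0] + cos_t * dx - sin_t * dy
                wy = pose[1] + sin_t * dx + cos_t * dy
                gi = int(wy / cell_size) + global_map.shape[0] // 2
                gj = int(wx / cell_size) + global_map.shape[1] // 2
                if 0 <= gi < global_map.shape[0] and 0 <= gj < global_map.shape[1]:
                    global_map[gi, gj] = max(global_map[gi, gj], local_map[i, j])
        return global_map

    global_map = np.zeros((480, 480))   # 24m x 24m at 5cm cells
    local_map = np.random.rand(64, 64)  # anticipated local occupancy probabilities
    global_map = register_local_map(global_map, local_map, pose=(1.0, 2.0, np.pi / 4))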
Summary
  • Introduction:

    An agent must move intelligently through a 3D environment in order to reach a goal.
  • One of the key factors for success in navigation has been the movement towards complex map-based architectures [20,46,11,10] that capture both geometry [20,11,10] and semantics [20,46,19,24], thereby facilitating efficient policy learning and planning
  • These learned maps allow an agent to exploit prior knowledge from training scenes when navigating in novel test environments
  • Objectives:

    The authors' goal is to accelerate navigation and map creation, and to show the impact of the occupancy model while fixing the backbone navigation architecture and policy learning approach across methods for a fair comparison.
  • Methods:

    Table 1 reports IoU % and F1 score % for the free, occupied, and mean classes, comparing all-free, all-occupied, ANS(rgb), ANS(depth), View-extrap., OccAnt(rgb), OccAnt(depth), and OccAnt(rgbd).
  • The authors' models, OccAnt(·), substantially improve the map quality and extent, showing the advantage of learning to anticipate 3D structures beyond those directly observed.
  • The authors use the Adam optimizer and train on episodes of length 1000 for 1.5–2 million frames of experience.
  • – ANS(rgb) [10]: This is the state-of-the-art Active Neural SLAM approach for exploration and navigation.
  • The authors use the original mapper architecture [10], which infers the visible occupancy from RGB.
  • Results:

    By exploiting context in both the egocentric views and top-down maps, the model successfully anticipates a broader map of the environment, with performance significantly better than strong baselines.
  • Not only does occupancy anticipation successfully map the environment, it also allows the agent to move to a specified goal more quickly by modeling the navigable spaces.
  • This apples-to-apples comparison shows that the idea improves the state of the art for PointNav
  • Conclusion:

    The authors introduced the idea of occupancy anticipation from egocentric views in 3D environments.
  • By learning to anticipate the navigable areas beyond the agent’s actual field of view, the authors obtain more accurate maps more efficiently in novel environments.
  • The authors demonstrate the idea both for individual local maps and integrated within sequential models for exploration and navigation, where the agent continually refines its map of the world.
  • The authors' results clearly demonstrate the advantages on multiple datasets, including improvements to the state-of-the-art embodied AI model for exploration and navigation.
Tables
  • Table 1: Occupancy anticipation results on the Gibson validation set. Our models, OccAnt(·), substantially improve map quality and extent over the baselines
  • Table 2: Timed exploration results: Map quality at T = 500 for all models and datasets. See text for details
  • Table 3: PointNav results: Our approach provides more efficient navigation
  • Table 4: Habitat Challenge 2020 results: Our approach is the winning entry
  • Table 5: Per-frame occupancy anticipation ablation study
  • Table 6: Timed exploration ablation: Map quality at T = 500 for all models and datasets
  • Table 7: PointGoal navigation ablation: Time taken refers to the average number of agent actions required; the maximum time budget is T = 1000
  • Table 8: Policy and mapper hyperparameters used to train our models
  • Table 9: Comparing model capacity of different approaches
Related work
  • Navigation Classical approaches to visual navigation perform passive or active SLAM to reconstruct geometric point-clouds [71,23] or semantic maps [5,55], facilitated by loop closures or learned odometry [7,39,8]. More recent work uses deep learning to learn navigation [79,20,56,41,77,75,59,64] or exploration [48,6,57,28,51] policies in an end-to-end fashion. Explicit map-based navigation models [21,46,19,11] usually outperform their implicit counterparts by being more sample-efficient, generalizing well to unseen environments, and even transferring from simulation to real robots [20,10]. However, existing approaches only encode visible regions for mapping (i.e., the ground plane projection of the observed or inferred depth). In contrast, our model goes beyond the visible cues and anticipates maps for unseen regions to accelerate navigation.

    Layout estimation Recent work predicts 3D Manhattan layouts of indoor scenes given 360 panoramas [80,76,70,73,15]. These methods predict structured outputs such as layout boundaries [80,70], corners [80], and floor/ceiling probability maps [76]. However, they do not extrapolate to unseen regions. FloorNet [36] and Floor-SP [29] use walkthroughs of previously scanned buildings to reconstruct detailed floorplans that may include predictions for the room type, doors, objects, etc. However, they assume that the layouts are polygonal, the scene is fully explored, and that detailed human annotations are available. Our occupancy map representation can be seen as a new way for the agent to infer the layout of its surroundings. Unlike any of the above approaches, our model does not make strict assumptions on the scene structure, nor does it require detailed semantic annotations. Furthermore, the proposed anticipation model is learned jointly with the exploration policy and without human guidance. Finally, unlike prior work, our goal is to accelerate navigation and map creation.
Funding
  • UT Austin is supported in part by DARPA Lifelong Learning Machines and the GCP Research Credits Program
Study subjects and analysis
On average, the Matterport3D environments are larger than the Gibson environments. Our observation space consists of 128 × 128 RGB-D observations and odometry sensor readings that denote the change in the agent's pose (x, y, θ). Our action space consists of three actions: move-forward by 25 cm, turn-left by 10°, and turn-right by 10°. A minimal sketch of these spaces follows.
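
For concreteness, a minimal plain-Python sketch of this observation and action space (the paper runs in the Habitat simulator [38]; the helper names below are illustrative, not Habitat's configuration API):

    import numpy as np

    # Observation: a 128 x 128 RGB-D frame plus an odometry reading giving the
    # change in agent pose (dx, dy, dtheta) since the previous step.
    def make_observation(rgb, depth, pose_delta):
        assert rgb.shape == (128, 128, 3) and depth.shape == (128, 128, 1)
        assert len(pose_delta) == 3  # (dx, dy, dtheta)
        return {"rgb": rgb, "depth": depth, "pose_delta": np.asarray(pose_delta)}

    # Action space: three discrete actions.
    ACTIONS = {
        0: ("move-forward", 0.25),           # metres
        1: ("turn-left", np.deg2rad(10.0)),  # radians
        2: ("turn-right", np.deg2rad(10.0)),
    }

    obs = make_observation(np.zeros((128, 128, 3), np.uint8),
                           np.zeros((128, 128, 1), np.float32),
                           (0.25, 0.0, 0.0))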

Reference
  • 1. The Habitat Challenge 2020. https://aihabitat.org/challenge/2020/
  • 2. Anderson, P., Chang, A., Chaplot, D.S., Dosovitskiy, A., Gupta, S., Koltun, V., Kosecka, J., Malik, J., Mottaghi, R., Savva, M., et al.: On evaluation of embodied navigation agents. arXiv preprint arXiv:1807.06757 (2018)
  • 3. Anderson, P., Wu, Q., Teney, D., Bruce, J., Johnson, M., Sunderhauf, N., Reid, I., Gould, S., van den Hengel, A.: Vision-and-language navigation: Interpreting visually-grounded navigation instructions in real environments. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2018)
  • 4. Armeni, I., Sax, A., Zamir, A.R., Savarese, S.: Joint 2D-3D-Semantic Data for Indoor Scene Understanding. arXiv e-prints (Feb 2017)
  • 5. Bao, S.Y., Bagra, M., Chao, Y.W., Savarese, S.: Semantic structure from motion with points, regions, and objects. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition. pp. 2703–2710. IEEE (2012)
  • 6. Burda, Y., Edwards, H., Pathak, D., Storkey, A., Darrell, T., Efros, A.A.: Large-scale study of curiosity-driven learning. In: arXiv:1808.04355 (2018)
  • 7. Cadena, C., Carlone, L., Carrillo, H., Latif, Y., Scaramuzza, D., Neira, J., Reid, I., Leonard, J.J.: Past, present, and future of simultaneous localization and mapping: Toward the robust-perception age. IEEE Transactions on Robotics 32(6), 1309–1332 (2016)
  • 8. Carrillo, H., Reid, I., Castellanos, J.A.: On the comparison of uncertainty criteria for active slam. In: 2012 IEEE International Conference on Robotics and Automation. pp. 2080–208IEEE (2012)
  • 9. Chang, A., Dai, A., Funkhouser, T., Nießner, M., Savva, M., Song, S., Zeng, A., Zhang, Y.: Matterport3D: Learning from RGB-D data in indoor environments. In: Proceedings of the International Conference on 3D Vision (3DV) (2017). Matterport3D dataset license available at: http://kaldir.vc.in.tum.de/matterport/MP_TOS.pdf
  • 10. Chaplot, D.S., Gupta, S., Gandhi, D., Gupta, A., Salakhutdinov, R.: Learning to explore using active neural mapping. 8th International Conference on Learning Representations, ICLR 2020 (2020)
  • 11. Chen, T., Gupta, S., Gupta, A.: Learning exploration policies for navigation. In: 7th International Conference on Learning Representations, ICLR 2019 (2019)
  • 12. Choi, S., Zhou, Q.Y., Koltun, V.: Robust reconstruction of indoor scenes. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015)
  • 13. Das, A., Datta, S., Gkioxari, G., Lee, S., Parikh, D., Batra, D.: Embodied question answering. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops. pp. 2054–2063 (2018)
  • 14. Datta, S., Maksymets, O., Hoffman, J., Lee, S., Batra, D., Parikh, D.: Integrating egocentric localization for more realistic pointgoal navigation agents. CVPR 2020 Embodied AI Workshop (2020)
  • 15. Dhamo, H., Navab, N., Tombari, F.: Object-driven multi-layer scene decomposition from a single image. In: The IEEE International Conference on Computer Vision (ICCV) (October 2019)
  • 16. Elhafsi, A., Ivanovic, B., Janson, L., Pavone, M.: Map-predictive motion planning in unknown environments. arXiv preprint arXiv:1910.08184 (2019)
  • 17. Fang, K., Toshev, A., Fei-Fei, L., Savarese, S.: Scene memory transformer for embodied agents in long-horizon tasks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 538–547 (2019)
  • 18. Gan, C., Zhang, Y., Wu, J., Gong, B., Tenenbaum, J.B.: Look, listen, and act: Towards audio-visual embodied navigation. arXiv preprint arXiv:1912.11684 (2019)
  • 19. Gordon, D., Kembhavi, A., Rastegari, M., Redmon, J., Fox, D., Farhadi, A.: Iqa: Visual question answering in interactive environments. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 4089–4098 (2018)
  • 20. Gupta, S., Davidson, J., Levine, S., Sukthankar, R., Malik, J.: Cognitive mapping and planning for visual navigation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2616–2625 (2017)
  • 21. Gupta, S., Fouhey, D., Levine, S., Malik, J.: Unifying map and landmark based representations for visual navigation. arXiv preprint arXiv:1712.08125 (2017)
  • 22. Hart, P., Nilsson, N., Raphael, B.: A formal basis for the heuristic determination of minimum cost paths. IEEE Transactions on Systems Science and Cybernetics 4(2), 100–107 (1968). https://doi.org/10.1109/tssc.1968.300136, https://doi.org/10.1109/tssc.1968.300136
  • 23. Hartley, R., Zisserman, A.: Multiple view geometry in computer vision. Cambridge university press (2003)
  • 24. Henriques, J.F., Vedaldi, A.: Mapnet: An allocentric spatial memory for mapping environments. In: proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 8476–8484 (2018)
  • 25. Hoermann, S., Bach, M., Dietmayer, K.: Dynamic occupancy grid prediction for urban autonomous driving: A deep learning approach with fully automatic labeling. In: 2018 IEEE International Conference on Robotics and Automation (ICRA). pp. 2056–2063. IEEE (2018)
  • 26. Iizuka, S., Simo-Serra, E., Ishikawa, H.: Globally and locally consistent image completion. ACM Transactions on Graphics (ToG) 36(4), 1–14 (2017)
  • 27. Jayaraman, D., Gao, R., Grauman, K.: Shapecodes: self-supervised feature learning by lifting views to viewgrids. In: Proceedings of the European Conference on Computer Vision (ECCV). pp. 120–136 (2018)
  • 28. Jayaraman, D., Grauman, K.: Learning to look around: Intelligently exploring unseen environments for unknown tasks. In: Computer Vision and Pattern Recognition, 2018 IEEE Conference on (2018)
  • 29. Jiacheng Chen, Chen Liu, J.W., Furukawa, Y.: Floor-sp: Inverse cad for floorplans by sequential room-wise shortest path. In: The IEEE International Conference on Computer Vision (ICCV) (2019)
  • 30. Karkus, P., Ma, X., Hsu, D., Kaelbling, L.P., Lee, W.S., Lozano-Perez, T.: Differentiable algorithm networks for composable robot learning. arXiv preprint arXiv:1905.11602 (2019)
  • 31. Katyal, K., Popek, K., Paxton, C., Burlina, P., Hager, G.D.: Uncertainty-aware occupancy map prediction using generative networks for robot navigation. In: 2019 International Conference on Robotics and Automation (ICRA). pp. 5453–5459. IEEE (2019)
  • 32. Katyal, K., Popek, K., Paxton, C., Moore, J., Wolfe, K., Burlina, P., Hager, G.D.: Occupancy map prediction using generative and fully convolutional networks for vehicle navigation. arXiv preprint arXiv:1803.02007 (2018)
  • 33. Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
  • 34. Kolve, E., Mottaghi, R., Han, W., VanderBilt, E., Weihs, L., Herrasti, A., Gordon, D., Zhu, Y., Gupta, A., Farhadi, A.: AI2-THOR: An Interactive 3D Environment for Visual AI. arXiv (2017)
  • 35. Li, Y., Liu, S., Yang, J., Yang, M.H.: Generative face completion. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3911–3919 (2017)
  • 36. Liu, C., Wu, J., Furukawa, Y.: Floornet: A unified framework for floorplan reconstruction from 3d scans. In: Proceedings of the European Conference on Computer Vision (ECCV). pp. 201–217 (2018)
  • 37. Lu, C., Dubbelman, G.: Hallucinating beyond observation: Learning to complete with partial observation and unpaired prior knowledge (2019)
  • 38. Savva, M.*, Kadian, A.*, Maksymets, O.*, Zhao, Y., Wijmans, E., Jain, B., Straub, J., Liu, J., Koltun, V., Malik, J., Parikh, D., Batra, D.: Habitat: A Platform for Embodied AI Research. In: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) (2019)
  • 39. Martinez-Cantin, R., De Freitas, N., Brochu, E., Castellanos, J., Doucet, A.: A bayesian exploration-exploitation approach for optimal online sensing and planning with a visually guided mobile robot. Autonomous Robots 27(2), 93–103 (2009)
  • 40. Mohajerin, N., Rohani, M.: Multi-step prediction of occupancy grid maps with recurrent neural networks. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 10600–10608 (2019)
  • 41. Mousavian, A., Toshev, A., Fiser, M., Kosecka, J., Wahid, A., Davidson, J.: Visual representations for semantic target driven navigation. In: 2019 International Conference on Robotics and Automation (ICRA). pp. 8846–8852. IEEE (2019)
  • 42. Muller, M., Dosovitskiy, A., Ghanem, B., Koltun, V.: Driving policy transfer via modularity and abstraction. arXiv preprint arXiv:1804.09364 (2018)
  • 43. Murali, A., Chen, T., Alwala, K.V., Gandhi, D., Pinto, L., Gupta, S., Gupta, A.: Pyrobot: An open-source robotics framework for research and benchmarking. arXiv preprint arXiv:1906.08236 (2019)
  • 44. Odena, A., Dumoulin, V., Olah, C.: Deconvolution and checkerboard artifacts. Distill 1(10), e3 (2016)
  • 45. O'Callaghan, S.T., Ramos, F.T.: Gaussian process occupancy maps. The International Journal of Robotics Research 31(1), 42–62 (2012)
  • 46. Parisotto, E., Salakhutdinov, R.: Neural map: Structured memory for deep reinforcement learning. arXiv preprint arXiv:1702.08360 (2017)
  • 47. Paszke, A., Gross, S., Massa, F., Lerer, A., Bradbury, J., Chanan, G., Killeen, T., Lin, Z., Gimelshein, N., Antiga, L., Desmaison, A., Kopf, A., Yang, E., DeVito, Z., Raison, M., Tejani, A., Chilamkurthy, S., Steiner, B., Fang, L., Bai, J., Chintala, S.: Pytorch: An imperative style, high-performance deep learning library. In: Advances in Neural Information Processing Systems 32, pp. 8024–8035. Curran Associates, Inc. (2019), http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
  • 48. Pathak, D., Agrawal, P., Efros, A.A., Darrell, T.: Curiosity-driven exploration by self-supervised prediction. In: International Conference on Machine Learning (2017)
  • 49. Pathak, D., Krahenbuhl, P., Donahue, J., Darrell, T., Efros, A.A.: Context encoders: Feature learning by inpainting. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2016)
  • 50. Ramakrishnan, S.K., Grauman, K.: Sidekick policy learning for active visual exploration. In: Proceedings of the European Conference on Computer Vision (ECCV). pp. 413–430 (2018)
  • 51. Ramakrishnan, S.K., Jayaraman, D., Grauman, K.: Emergence of exploratory look-around behaviors through active observation completion. Science Robotics 4(30) (2019). https://doi.org/10.1126/scirobotics.aaw6326, https://robotics.sciencemag.org/content/4/30/eaaw6326
  • 52. Ramakrishnan, S.K., Jayaraman, D., Grauman, K.: An exploration of embodied visual exploration. arXiv preprint arXiv:2001.02192 (2020)
  • 53. Ramos, F., Ott, L.: Hilbert maps: scalable continuous occupancy mapping with stochastic gradient descent. The International Journal of Robotics Research 35(14), 1717–1730 (2016)
  • 54. Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. pp. 234–241. Springer (2015)
  • 55. Salas-Moreno, R.F., Newcombe, R.A., Strasdat, H., Kelly, P.H., Davison, A.J.: Slam++: Simultaneous localisation and mapping at the level of objects. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 1352–1359 (2013)
  • 56. Savinov, N., Dosovitskiy, A., Koltun, V.: Semi-parametric topological memory for navigation. arXiv preprint arXiv:1803.00653 (2018)
  • 57. Savinov, N., Raichuk, A., Marinier, R., Vincent, D., Pollefeys, M., Lillicrap, T., Gelly, S.: Episodic curiosity through reachability. arXiv preprint arXiv:1810.02274 (2018)
  • 58. Savva, M., Chang, A.X., Dosovitskiy, A., Funkhouser, T., Koltun, V.: Minos: Multimodal indoor simulator for navigation in complex environments. arXiv preprint arXiv:1712.03931 (2017)
  • 59. Sax, A., Emi, B., Zamir, A.R., Guibas, L., Savarese, S., Malik, J.: Mid-level visual representations improve generalization and sample efficiency for learning visuomotor policies. arXiv preprint arXiv:1812.11971 (2018)
  • 60. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., Klimov, O.: Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347 (2017)
  • 61. Seifi, S., Tuytelaars, T.: Where to look next: Unsupervised active visual exploration on 360° input. arXiv preprint arXiv:1909.10304 (2019)
  • 62. Senanayake, R., Ganegedara, T., Ramos, F.: Deep occupancy maps: a continuous mapping technique for dynamic environments (2017)
  • 63. Sethian, J.A.: A fast marching level set method for monotonically advancing fronts. Proceedings of the National Academy of Sciences 93(4), 1591–1595 (1996)
  • 64. Shen, W.B., Xu, D., Zhu, Y., Guibas, L.J., Fei-Fei, L., Savarese, S.: Situational fusion of visual representation for visual navigation. In: Proceedings of the IEEE International Conference on Computer Vision. pp. 2881–2890 (2019)
  • 65. Shrestha, R., Tian, F.P., Feng, W., Tan, P., Vaughan, R.: Learned map prediction for enhanced mobile robot exploration. In: 2019 International Conference on Robotics and Automation (ICRA). pp. 1197–1204. IEEE (2019)
  • 66. Sless, L., Cohen, G., Shlomo, B.E., Oron, S.: Self supervised occupancy grid learning from sparse radar for autonomous driving. arXiv preprint arXiv:1904.00415 (2019)
  • 67. Song, S., Yu, F., Zeng, A., Chang, A.X., Savva, M., Funkhouser, T.: Semantic scene completion from a single depth image. Proceedings of 30th IEEE Conference on Computer Vision and Pattern Recognition (2017)
  • 68. Song, S., Zeng, A., Chang, A.X., Savva, M., Savarese, S., Funkhouser, T.: Im2pano3d: Extrapolating 360 structure and semantics beyond the field of view. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3847–3856 (2018)
  • 69. Straub, J., Whelan, T., Ma, L., Chen, Y., Wijmans, E., Green, S., Engel, J.J., Mur-Artal, R., Ren, C., Verma, S., Clarkson, A., Yan, M., Budge, B., Yan, Y., Pan, X., Yon, J., Zou, Y., Leon, K., Carter, N., Briales, J., Gillingham, T., Mueggler, E., Pesqueira, L., Savva, M., Batra, D., Strasdat, H.M., Nardi, R.D., Goesele, M., Lovegrove, S., Newcombe, R.: The Replica dataset: A digital replica of indoor spaces. arXiv preprint arXiv:1906.05797 (2019)
  • 70. Sun, C., Hsiao, C.W., Sun, M., Chen, H.T.: Horizonnet: Learning room layout with 1d representation and pano stretch data augmentation. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2019)
  • 71. Thrun, S.: Probabilistic robotics. Communications of the ACM 45(3), 52–57 (2002)
  • 72. Wijmans, E., Kadian, A., Morcos, A., Lee, S., Essa, I., Parikh, D., Savva, M., Batra, D.: Dd-ppo: Learning near-perfect pointgoal navigators from 2.5 billion frames (2020)
  • 73. Wu, W., Fu, X.M., Tang, R., Wang, Y., Qi, Y.H., Liu, L.: Data-driven interior plan generation for residential buildings. ACM Trans. Graph. 38(6) (Nov 2019). https://doi.org/10.1145/3355089.3356556
  • 74. Xia, F., Zamir, A.R., He, Z., Sax, A., Malik, J., Savarese, S.: Gibson Env: Real-world perception for embodied agents. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 9068–9079 (2018). Gibson dataset license agreement available at https://storage.googleapis.com/gibson_material/Agreement%20GDS%2006-04-18.pdf
  • 75. Yang, J., Ren, Z., Xu, M., Chen, X., Crandall, D., Parikh, D., Batra, D.: Embodied amodal recognition: Learning to move to perceive objects. In: ICCV (2019)
  • 76. Yang, S.T., Wang, F.E., Peng, C.H., Wonka, P., Sun, M., Chu, H.K.: Dula-net: A dual-projection network for estimating room layouts from a single rgb panorama. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 3363–3372 (2019)
  • 77. Yang, W., Wang, X., Farhadi, A., Gupta, A., Mottaghi, R.: Visual semantic navigation using scene priors. arXiv preprint arXiv:1810.06543 (2018)
  • 78. Yang, Z., Pan, J.Z., Luo, L., Zhou, X., Grauman, K., Huang, Q.: Extreme relative pose estimation for RGB-D scans via scene completion. In: The IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (June 2019)
  • 79. Zhu, Y., Gordon, D., Kolve, E., Fox, D., Fei-Fei, L., Gupta, A., Mottaghi, R., Farhadi, A.: Visual Semantic Planning using Deep Successor Representations. In: Computer Vision, 2017 IEEE International Conference on (2017)
  • 80. Zou, C., Colburn, A., Shan, Q., Hoiem, D.: Layoutnet: Reconstructing the 3d room layout from a single rgb image. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. pp. 2051–2059 (2018)