Program Guided Agent

ICLR, 2020.

Keywords:
Program Execution, Program Executor, Program Understanding, Program Guided Agent, Learning to Execute

Abstract:

Developing agents that can learn to follow natural language instructions has been an emerging research direction. While being accessible and flexible, natural language instructions can sometimes be ambiguous even to humans. To address this, we propose to utilize programs, structured in a formal language, as a precise and expressive way to specify tasks.

Introduction
  • Humans are capable of leveraging instructions to accomplish complex tasks. A comprehensive instruction usually comprises a set of descriptions detailing a variety of situations and the corresponding subtasks that are required to be fulfilled.
  • Andreas et al. (2017a) and Oh et al. (2017) investigate a hierarchical approach, where the instructions consist of a set of symbolically represented subtasks.
  • However, those instructions are not a function of states, which substantially limits their expressiveness.
Highlights
  • Humans are capable of leveraging instructions to accomplish complex tasks.
  • A comprehensive instruction usually comprises a set of descriptions detailing a variety of situations and the corresponding subtasks that are required to be fulfilled.
  • We propose a modular framework, program guided agent, which exploits the structural nature of programs to decompose and execute them, as well as to learn to ground program tokens in the environment.
  • We investigate whether learning a multitask policy with the learned modulation mechanism is more effective than simply concatenating subtask and state features.
  • We propose to utilize programs, structured in a formal language, as an expressive and precise way to specify tasks instead of commonly used natural language instructions (an illustrative sketch of such a program follows this list).
  • The experimental results on a 2D Minecraft environment demonstrate that the proposed framework learns to reliably fulfill program instructions and generalizes well to more complex instructions without additional training.
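To make the contrast with natural language concrete, here is a minimal, hypothetical sketch of the kind of state-dependent program instruction referred to above, written as plain Python for illustration; the environment properties and actions (has_river, mine, place) are placeholders, not the paper's actual domain-specific language.

    # Hypothetical example only: the names and predicates below are NOT the
    # DSL defined in the paper; they merely illustrate a state-dependent,
    # structured task specification.
    from dataclasses import dataclass, field

    @dataclass
    class Env:
        has_river: bool = True                  # a perceivable property of the world

    @dataclass
    class Agent:
        wood: int = 0
        log: list = field(default_factory=list)

        def mine(self, item: str) -> None:
            self.log.append(f"mine {item}")
            if item == "wood":
                self.wood += 1

        def place(self, item: str) -> None:
            self.log.append(f"place {item}")

    def bridge_program(env: Env, agent: Agent) -> None:
        # Unlike a flat list of subtasks, the program branches on the current
        # environment state and loops until a condition is satisfied.
        if env.has_river:
            while agent.wood < 2:               # gather enough wood first
                agent.mine("wood")
            agent.place("bridge")
        else:
            agent.mine("iron")

    env, agent = Env(), Agent()
    bridge_program(env, agent)
    print(agent.log)                            # ['mine wood', 'mine wood', 'place bridge']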
Methods
  • To obtain the natural language counterparts of those instructions, the authors asked annotators to construct natural language translations of all the programs.
  • The data collection details, as well as sample programs and their corresponding natural language translations, can be found in Section E.3 and Figure 10, respectively.
  • The authors include a brief discussion on how annotated natural language instructions can be ambiguously interpreted as several valid programs.
Results
  • The authors train the proposed framework and the end-to-end learning models on training programs and evaluate their performance using the percentage of completed instructions on the test and test-complex sets.
  • The authors' proposed framework achieves a satisfactory test performance and suffers only a negligible drop when evaluated on the test-complex set.
  • This can be attributed to the modular design, which explicitly utilizes the structure and grammar of programs, allowing the two learning modules to focus on their local jobs.
Conclusion
  • The authors propose to utilize programs, structured in a formal language, as an expressive and precise way to specify tasks instead of commonly used natural language instructions.
  • The authors introduce the problem of developing a framework that can comprehend a program as well as perceive and interact with the environment to accomplish the desired task.
  • To address this problem, the authors devise a modular framework, program guided agent, which executes programs with a program interpreter by alternating between querying a perception module when a branching condition is encountered and instructing a policy to fulfill subtasks (a minimal sketch of this loop follows the list).
  • The authors investigate the performance of various models that learn from programs and natural language descriptions in an end-to-end fashion.
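To make the control flow above concrete, the following is a minimal sketch of such an interpreter loop. The statement layout and the perceive()/act() interfaces are assumptions for illustration, not the paper's actual implementation.

    # Minimal sketch: walk the program, query the perception module whenever a
    # branching or loop condition is encountered, and hand symbolic subtasks to
    # the low-level policy. The data layout and interfaces are assumptions.
    from dataclasses import dataclass
    from typing import Callable, Sequence, Union

    @dataclass
    class Subtask:
        action: str                  # e.g. "mine"
        target: str                  # e.g. "wood"

    @dataclass
    class Branch:
        condition: str               # e.g. "is_there(river)"
        body: Sequence["Statement"]
        loop: bool = False           # True for while-loops, False for if-branches

    Statement = Union[Subtask, Branch]

    def run_program(program: Sequence[Statement],
                    perceive: Callable[[str], bool],
                    act: Callable[[Subtask], None]) -> None:
        for stmt in program:
            if isinstance(stmt, Subtask):
                act(stmt)                               # the policy fulfills the subtask
            elif stmt.loop:
                while perceive(stmt.condition):         # the perception module answers the query
                    run_program(stmt.body, perceive, act)
            elif perceive(stmt.condition):
                run_program(stmt.body, perceive, act)

Calling run_program with, for example, [Branch("is_there(river)", [Subtask("place", "bridge")])] would query the perception module once and, if the condition holds, instruct the policy to place a bridge.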
Tables
  • Table 1: Task completion rate. For each method, we iterate over all the programs in a testing set, randomly sampling ten initial environment states for each program and running three models trained with different random seeds. The averaged task completion rates and their standard deviations are reported. Note that all the end-to-end learning models, learning from either natural language descriptions or programs, suffer a significant performance drop when evaluated on the more complex testing set (a sketch of this evaluation protocol follows this list).
  • Table 2: Architectural details for the end-to-end learning models.
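As referenced in the Table 1 caption above, a minimal sketch of the evaluation protocol might look as follows; the helper names (sample_initial_state, run_episode) are assumptions, not the authors' code.

    # Sketch of the protocol behind Table 1: for every program in a testing set,
    # sample ten initial environment states, run three models trained with
    # different random seeds, and report the mean and standard deviation of the
    # task completion rate. Helper names are hypothetical.
    import statistics

    def evaluate(programs, models, sample_initial_state, run_episode, n_states=10):
        rates = []
        for model in models:                     # e.g. three models, one per seed
            successes, trials = 0, 0
            for program in programs:
                for _ in range(n_states):
                    state = sample_initial_state()
                    successes += int(run_episode(model, program, state))
                    trials += 1
            rates.append(successes / trials)
        return statistics.mean(rates), statistics.stdev(rates)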
Contributions
  • Proposes to utilize programs, written in a formal language, as a structured, expressive, and unambiguous representation to specify tasks.
  • Proposes a modular framework, program guided agent, which exploits the structural nature of programs to decompose and execute them as well as to learn to ground program tokens in the environment.
  • Introduces a learned modulation mechanism that leverages a subtask to modulate the encoded state features instead of concatenating them.
References
  • Martín Abadi, Paul Barham, Jianmin Chen, Zhifeng Chen, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Geoffrey Irving, Michael Isard, et al. Tensorflow: A system for large-scale machine learning. In USENIX Symposium on Operating Systems Design and Implementation, 2016.
  • Amjad Almahairi, Sai Rajeswar, Alessandro Sordoni, Philip Bachman, and Aaron Courville. Augmented cyclegan: Learning many-to-many mappings from unpaired data. In International Conference on Machine Learning, 2018.
  • Uri Alon, Omer Levy, and Eran Yahav. code2seq: Generating sequences from structured representations of code. International Conference on Learning Representations, 2019.
  • David Andre and Stuart J Russell. Programmable reinforcement learning agents. In Advances in Neural Information Processing Systems, 2001.
  • David Andre and Stuart J Russell. State abstraction for programmable reinforcement learning agents. In National Conference on Artificial Intelligence, 2002.
  • Jacob Andreas, Dan Klein, and Sergey Levine. Modular multitask reinforcement learning with policy sketches. In International Conference on Machine Learning, 2017a.
  • Jacob Andreas, Dan Klein, and Sergey Levine. Learning with latent language. In North American Chapter of the Association for Computational Linguistics, 2017b.
  • Yusuf Aytar, Tobias Pfaff, David Budden, Thomas Paine, Ziyu Wang, and Nando de Freitas. Playing hard exploration games by watching youtube. In Advances in Neural Information Processing Systems. 2018.
  • Pierre-Luc Bacon, Jean Harb, and Doina Precup. The option-critic architecture. In Association for the Advancement of Artificial Intelligence, 2017.
  • Dzmitry Bahdanau, Felix Hill, Jan Leike, Edward Hughes, Pushmeet Kohli, and Edward Grefenstette. Learning to understand goal specifications by modelling reward. In International Conference on Learning Representations, 2019.
  • Bram Bakker, Jürgen Schmidhuber, et al. Hierarchical reinforcement learning based on subgoal discovery and subpolicy specialization. In Intelligent Autonomous Systems, 2004.
  • Matko Bošnjak, Tim Rocktäschel, Jason Naradowsky, and Sebastian Riedel. Programming with a differentiable forth interpreter. In International Conference on Machine Learning, 2017.
  • Satchuthananthavale RK Branavan, Harr Chen, Luke S Zettlemoyer, and Regina Barzilay. Reinforcement learning for mapping instructions to actions. In Association for Computational Linguistics, 2009.
  • SRK Branavan, Nate Kushman, Tao Lei, and Regina Barzilay. Learning high-level planning from text. In Association for Computational Linguistics, 2012.
  • Rudy R Bunel, Matthew Hausknecht, Jacob Devlin, Rishabh Singh, and Pushmeet Kohli. Leveraging grammar and reinforcement learning for neural program synthesis. In International Conference on Learning Representations, 2018.
  • Jonathon Cai, Richard Shin, and Dawn Song. Making neural programming architectures generalize via recursion. In International Conference on Learning Representations, 2017.
  • Xinyun Chen, Chang Liu, and Dawn Song. Execution-guided neural program synthesis. In International Conference on Learning Representations, 2019.
  • John D Co-Reyes, Abhishek Gupta, Suvansh Sanjeev, Nick Altieri, John DeNero, Pieter Abbeel, and Sergey Levine. Guiding policies with language via meta-learning. International Conference on Learning Representations, 2019.
  • Misha Denil, Sergio Gómez Colmenarejo, Serkan Cabi, David Saxton, and Nando de Freitas. Programmable agents. arXiv preprint arXiv:1706.06383, 2017.
  • Aditya Desai, Sumit Gulwani, Vineet Hingorani, Nidhi Jain, Amey Karkare, Mark Marron, Subhajit Roy, et al. Program synthesis using natural language. In International Conference on Software Engineering, 2016.
  • Jacob Devlin, Rudy R Bunel, Rishabh Singh, Matthew Hausknecht, and Pushmeet Kohli. Neural program meta-induction. In Advances in Neural Information Processing Systems, 2017a.
  • Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, and Pushmeet Kohli. Robustfill: Neural program learning under noisy i/o. In International Conference on Machine Learning, 2017b.
  • Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. Bert: Pre-training of deep bidirectional transformers for language understanding. In North American Chapter of the Association for Computational Linguistics, 2018.
  • Prafulla Dhariwal, Christopher Hesse, Oleg Klimov, Alex Nichol, Matthias Plappert, Alec Radford, John Schulman, Szymon Sidor, Yuhuai Wu, and Peter Zhokhov. OpenAI Baselines, 2017.
  • Bhuwan Dhingra, Hanxiao Liu, Zhilin Yang, William W Cohen, and Ruslan Salakhutdinov. Gated-attention readers for text comprehension. In Association for Computational Linguistics, 2017.
  • Nat Dilokthanakul, Christos Kaplanis, Nick Pawlowski, and Murray Shanahan. Feature control as intrinsic motivation for hierarchical reinforcement learning. arXiv preprint arXiv:1705.06769, 2017.
  • Yan Duan, Marcin Andrychowicz, Bradly Stadie, OpenAI Jonathan Ho, Jonas Schneider, Ilya Sutskever, Pieter Abbeel, and Wojciech Zaremba. One-shot imitation learning. In Advances in Neural Information Processing Systems, 2017.
  • Vincent Dumoulin, Jonathon Shlens, and Manjunath Kudlur. A learned representation for artistic style. In International Conference on Learning Representations, 2017.
  • Chelsea Finn, Tianhe Yu, Tianhao Zhang, Pieter Abbeel, and Sergey Levine. One-shot visual imitation learning via meta-learning. In Conference on Robot Learning, 2017.
  • Kevin Frans, Jonathan Ho, Xi Chen, Pieter Abbeel, and John Schulman. Meta learning shared hierarchies. In International Conference on Learning Representations, 2018.
  • Daniel Fried, Jacob Andreas, and Dan Klein. Unified pragmatic models for generating and following instructions. In North American Chapter of the Association for Computational Linguistics, 2017.
  • Daniel Fried, Ronghang Hu, Volkan Cirik, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein, and Trevor Darrell. Speaker-follower models for vision-and-language navigation. In Neural Information Processing Systems, 2018.
  • Alex Graves, Greg Wayne, and Ivo Danihelka. Neural turing machines. arXiv preprint arXiv:1410.5401, 2014.
  • Chi Han, Jiayuan Mao, Chuang Gan, Josh Tenenbaum, and Jiajun Wu. Visual concept-metaconcept learning. In Neural Information Processing Systems. 2019.
  • Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 1997.
  • Jie Hu, Li Shen, and Gang Sun. Squeeze-and-excitation networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
  • Minyoung Huh, Shao-Hua Sun, and Ning Zhang. Feedback adversarial learning: Spatial feedback for improving generative adversarial networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
  • Michael Janner, Karthik Narasimhan, and Regina Barzilay. Representation learning for grounded spatial reasoning. Association for Computational Linguistics, 2018.
  • Jermsak Jermsurawong and Nizar Habash. Predicting the structure of cooking recipes. In Empirical Methods in Natural Language Processing, 2015.
  • Łukasz Kaiser and Ilya Sutskever. Neural GPUs learn algorithms. In International Conference on Learning Representations, 2016.
  • Russell Kaplan, Christopher Sauer, and Alexander Sosa. Beating atari with natural language guided reinforcement learning. arXiv preprint arXiv:1704.05539, 2017.
  • Tero Karras, Samuli Laine, and Timo Aila. A style-based generator architecture for generative adversarial networks. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
  • Chloé Kiddon, Ganesa Thandavam Ponnuraj, Luke Zettlemoyer, and Yejin Choi. Mise en place: Unsupervised interpretation of instructional recipes. In Empirical Methods in Natural Language Processing, 2015.
  • George Konidaris, Leslie Pack Kaelbling, and Tomas Lozano-Perez. From skills to symbols: Learning symbolic representations for abstract high-level planning. Journal of Artificial Intelligence Research, 2018.
  • Tejas D Kulkarni, Karthik Narasimhan, Ardavan Saeedi, and Josh Tenenbaum. Hierarchical deep reinforcement learning: Integrating temporal abstraction and intrinsic motivation. In Advances in Neural Information Processing Systems, 2016.
  • Miguel Lázaro-Gredilla, Dianhuan Lin, J Swaroop Guntupalli, and Dileep George. Beyond imitation: Zero-shot task transfer on robots by learning concepts as cognitive programs. arXiv preprint arXiv:1812.02788, 2018.
  • Yoonho Lee and Seungjin Choi. Gradient-based meta-learning with learned layerwise metric and subspace. In International Conference on Machine Learning, 2018.
  • Youngwoon Lee, Shao-Hua Sun, Sriram Somasundaram, Edward Hu, and Joseph J. Lim. Composing complex skills by learning transition policies. In International Conference on Learning Representations, 2019.
  • Yuan-Hong Liao, Xavier Puig, Marko Boben, Antonio Torralba, and Sanja Fidler. Synthesizing environment-aware activities via activity sketches. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
  • Xi Victoria Lin, Chenglong Wang, Luke Zettlemoyer, and Michael D Ernst. Nl2bash: A corpus and semantic parser for natural language interface to the linux operating system. In International Conference on Language Resources and Evaluation, 2018.
  • Yunchao Liu, Jiajun Wu, Zheng Wu, Daniel Ritchie, William T. Freeman, and Joshua B. Tenenbaum. Learning to describe scenes with programs. In International Conference on Learning Representations, 2019.
  • Jonathan Malmaud, Earl Wagner, Nancy Chang, and Kevin Murphy. Cooking with semantics. In Workshop on Semantic Parsing at Association for Computational Linguistics, 2014.
  • Jiayuan Mao, Honghua Dong, and Joseph J. Lim. Universal agent for disentangling environments and tasks. In International Conference on Learning Representations, 2018.
  • Jiayuan Mao, Chuang Gan, Pushmeet Kohli, Joshua B Tenenbaum, and Jiajun Wu. The neurosymbolic concept learner: Interpreting scenes, words, and sentences from natural supervision. In International Conference on Learning Representations, 2019.
  • Dipendra Misra, John Langford, and Yoav Artzi. Mapping instructions and visual observations to actions with reinforcement learning. In Empirical Methods in Natural Language Processing, 2017.
  • Dipendra Misra, Andrew Bennett, Valts Blukis, Eyvind Niklasson, Max Shatkhin, and Yoav Artzi. Mapping instructions to actions in 3d environments with visual goal prediction. In Empirical Methods in Natural Language Processing, 2018.
  • Volodymyr Mnih, Adria Puigdomenech Badia, Mehdi Mirza, Alex Graves, Timothy Lillicrap, Tim Harley, David Silver, and Koray Kavukcuoglu. Asynchronous methods for deep reinforcement learning. In International Conference on Machine Learning, 2016.
  • Katharina Mülling, Jens Kober, Oliver Kroemer, and Jan Peters. Learning to select and generalize striking movements in robot table tennis. International Journal of Robotics Research, 2013.
  • Ofir Nachum, Shixiang Shane Gu, Honglak Lee, and Sergey Levine. Data-efficient hierarchical reinforcement learning. In Neural Information Processing Systems, 2018.
  • Arvind Neelakantan, Quoc V Le, and Ilya Sutskever. Neural programmer: Inducing latent programs with gradient descent. In International Conference on Learning Representations, 2015.
  • Junhyuk Oh, Satinder Singh, Honglak Lee, and Pushmeet Kohli. Zero-shot task generalization with multi-task deep reinforcement learning. In International Conference on Machine Learning, 2017.
  • Boris N. Oreshkin, Pau Rodriguez, and Alexandre Lacoste. Tadam: Task dependent adaptive metric for improved few-shot learning. In Neural Information Processing Systems, 2018.
  • Emilio Parisotto, Abdel-rahman Mohamed, Rishabh Singh, Lihong Li, Dengyong Zhou, and Pushmeet Kohli. Neuro-symbolic program synthesis. In International Conference on Learning Representations, 2017.
  • Taesung Park, Ming-Yu Liu, Ting-Chun Wang, and Jun-Yan Zhu. Semantic image synthesis with spatially-adaptive normalization. In IEEE Conference on Computer Vision and Pattern Recognition, 2019.
  • Ronald Parr and Stuart J Russell. Reinforcement learning with hierarchies of machines. In Advances in Neural Information Processing Systems, 1998.
  • Deepak Pathak, Parsa Mahmoudieh, Guanghao Luo, Pulkit Agrawal, Dian Chen, Yide Shentu, Evan Shelhamer, Jitendra Malik, Alexei A Efros, and Trevor Darrell. Zero-shot visual imitation. In International Conference on Learning Representations, 2018.
  • Jeffrey Pennington, Richard Socher, and Christopher D. Manning. Glove: Global vectors for word representation. In Empirical Methods in Natural Language Processing, 2014.
  • Ethan Perez, Harm De Vries, Florian Strub, Vincent Dumoulin, and Aaron Courville. Learning visual reasoning without strong priors. 2017.
  • Ethan Perez, Florian Strub, Harm de Vries, Vincent Dumoulin, and Aaron Courville. FiLM: Visual Reasoning with a General Conditioning Layer. In Association for the Advancement of Artificial Intelligence, 2018.
  • Mohammad Raza, Sumit Gulwani, and Natasa Milic-Frayling. Compositional program synthesis from natural language and examples. In International Joint Conference on Artificial Intelligence, 2015.
  • Scott Reed and Nando De Freitas. Neural programmer-interpreters. In International Conference on Learning Representations, 2016.
  • Nobuyuki Shimizu and Andrew Haas. Learning to follow navigational route instructions. In International Joint Conference on Artificial Intelligence, 2009.
  • Richard Shin, Illia Polosukhin, and Dawn Song. Improving neural program synthesis with inferred execution traces. In Neural Information Processing Systems. 2018.
  • Sungryull Sohn, Junhyuk Oh, and Honglak Lee. Hierarchical reinforcement learning for zero-shot generalization with subtask dependencies. In Advances in Neural Information Processing Systems, 2018.
  • Bradly C. Stadie, Pieter Abbeel, and Ilya Sutskever. Third person imitation learning. In International Conference on Learning Representations, 2017.
  • Shao-Hua Sun, Hyeonwoo Noh, Sriram Somasundaram, and Joseph Lim. Neural program synthesis from diverse demonstration videos. In International Conference on Machine Learning, 2018.
  • Kai Sheng Tai, Richard Socher, and Christopher D Manning. Improved semantic representations from tree-structured long short-term memory networks. Association for Computational Linguistics, 2015.
  • Stefanie Tellex, Thomas Kollar, Steven Dickerson, Matthew R Walter, Ashis Gopal Banerjee, Seth J Teller, and Nicholas Roy. Understanding natural language commands for robotic navigation and mobile manipulation. In Association for the Advancement of Artificial Intelligence, 2011.
  • Tianmin Shu, Caiming Xiong, and Richard Socher. Hierarchical and interpretable skill acquisition in multi-task reinforcement learning. In International Conference on Learning Representations, 2018.
  • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. In Advances in Neural Information Processing Systems, 2017.
  • Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, and Koray Kavukcuoglu. Feudal networks for hierarchical reinforcement learning. arXiv preprint arXiv:1703.01161, 2017.
  • Adam Vogel and Daniel Jurafsky. Learning to follow navigational directions. In Association for Computational Linguistics, 2010.
  • Risto Vuorio, Shao-Hua Sun, Hexiang Hu, and Joseph J. Lim. Toward multimodal model-agnostic meta-learning. In Meta-Learning Workshop at Neural Information Processing Systems, 2018.
  • Risto Vuorio, Shao-Hua Sun, Hexiang Hu, and Joseph J Lim. Multimodal model-agnostic metalearning via task-aware modulation. In Neural Information Processing Systems, 2019.
  • Sida I Wang, Samuel Ginn, Percy Liang, and Christoper D Manning. Naturalizing a programming language via interactive learning. In Association for Computational Linguistics, 2017a.
  • Ziyu Wang, Josh S Merel, Scott E Reed, Nando de Freitas, Gregory Wayne, and Nicolas Heess. Robust imitation of diverse behaviors. In Advances in Neural Information Processing Systems, 2017b.
  • Da Xiao, Jo-Yu Liao, and Xingyuan Yuan. Improving the universality and learnability of neural programmer-interpreters with combinator abstraction. In International Conference on Learning Representations, 2018.
  • Saining Xie, Sainan Liu, Zeyu Chen, and Zhuowen Tu. Attentional shapecontextnet for point cloud recognition. In IEEE Conference on Computer Vision and Pattern Recognition, 2018.
  • Danfei Xu, Suraj Nair, Yuke Zhu, Julian Gao, Animesh Garg, Li Fei-Fei, and Silvio Savarese. Neural task programming: Learning to generalize across hierarchical tasks. In International Conference on Robotics and Automation, 2018.
  • Pengcheng Yin and Graham Neubig. Tranx: A transition-based neural abstract syntax parser for semantic parsing and code generation. In Empirical Methods in Natural Language Processing, 2018.
  • Tao Yu, Rui Zhang, Kai Yang, Michihiro Yasunaga, Dongxu Wang, Zifan Li, James Ma, Irene Li, Qingning Yao, Shanelle Roman, et al. Spider: A large-scale human-labeled dataset for complex and cross-domain semantic parsing and text-to-sql task. In Empirical Methods in Natural Language Processing, 2018a.
  • Tianhe Yu, Chelsea Finn, Annie Xie, Sudeep Dasari, Tianhao Zhang, Pieter Abbeel, and Sergey Levine. One-shot imitation from observing humans via domain-adaptive meta-learning. In Robotics: Science and Systems, 2018b.
Related work
  • Hierarchical reinforcement learning. Our work is also closely related to hierarchical reinforcement learning, where a meta-controller learns to predict which sub-policy to invoke at each time step (Kulkarni et al., 2016; Bacon et al., 2017; Dilokthanakul et al., 2017; Frans et al., 2018; Vezhnevets et al., 2017; Lee et al., 2019; Bakker et al., 2004; Nachum et al., 2018; Mao et al., 2018). Previous works have also investigated explicitly specifying sub-policies with symbolic representations for the meta-controller to utilize, or an explicit selection process over lower-level motor skills (Mülling et al., 2013; Shu et al., 2018).
  • Programmable agents. We would like to emphasize that our work differs from programmable agents (Denil et al., 2017) in motivation, problem formulation, proposed method, and contributions. First, Denil et al. (2017) concern declarative programs, which specify what is to be computed (e.g. a target object in a reaching task). In contrast, the programs considered in our work are imperative, which specify how the task is to be carried out (i.e. a procedure). Also, Denil et al. (2017) consider only one-liner programs that contain only AND, OR, and object attributes. On the other hand, we consider programs that are much longer and describe more complex procedures. While Denil et al. (2017) aim to generalize to novel combinations of object attributes, our work is mainly interested in generalizing to more complex tasks (i.e. programs) by leveraging the structure of programs.
  • Programs vs. natural language instructions. In this work, we advocate utilizing programs as a task representation and propose a modular framework that leverages the structure of programs to address this problem. Yet, natural language instructions enjoy better accessibility and are more intuitive to users who do not have experience with programming languages. While addressing the accessibility of programs or converting natural language instructions into a more structured form is beyond the scope of this work, we look forward to future research that leverages the strengths of both programs and natural language instructions by bridging the gap between these two representations, such as synthesizing programs from natural language (Lin et al., 2018; Desai et al., 2016; Raza et al., 2015), semantic parsing that bridges unstructured languages and structured formal languages (Yu et al., 2018a; Yin & Neubig, 2018), and naturalizing programs (Wang et al., 2017a).
  • Learned modulation. To fuse the information from an input domain (e.g. an image) with a conditioning domain (e.g. a language query, another image such as a segmentation map, noise, etc.), a wide range of works have demonstrated the effectiveness of predicting affine transforms from the condition to scale and bias the input, in visual question answering (Perez et al., 2018; 2017), image synthesis (Almahairi et al., 2018; Karras et al., 2019; Park et al., 2019; Huh et al., 2019), style transfer (Dumoulin et al., 2017), recognition (Hu et al., 2018; Xie et al., 2018), reading comprehension (Dhingra et al., 2017), few-shot learning (Oreshkin et al., 2018; Lee & Choi, 2018), etc. Many of those works present an extensive ablation study comparing the learned modulation against traditional ways to merge the information from the input and condition domains.
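As a concrete illustration of this conditioning scheme, and of the learned modulation mechanism listed under the contributions above, a minimal FiLM-style layer is sketched below as generic PyTorch; shapes and layer sizes are arbitrary assumptions, not the architecture used in the paper.

    # Minimal sketch of conditioning by learned modulation (FiLM-style): the
    # condition (e.g. an embedded subtask) predicts a per-channel scale and bias
    # applied to the encoded state features, instead of being concatenated with
    # them. Shapes and layer sizes are arbitrary; this is not the paper's model.
    import torch
    import torch.nn as nn

    class ModulatedBlock(nn.Module):
        def __init__(self, feat_channels: int, cond_dim: int):
            super().__init__()
            self.to_gamma = nn.Linear(cond_dim, feat_channels)   # predicts scale
            self.to_beta = nn.Linear(cond_dim, feat_channels)    # predicts bias

        def forward(self, feats: torch.Tensor, cond: torch.Tensor) -> torch.Tensor:
            # feats: (B, C, H, W) encoded state features; cond: (B, D) condition.
            gamma = self.to_gamma(cond)[:, :, None, None]
            beta = self.to_beta(cond)[:, :, None, None]
            return gamma * feats + beta

    # Modulate a 16-channel 10x10 feature map with a 32-dim subtask embedding.
    block = ModulatedBlock(feat_channels=16, cond_dim=32)
    out = block(torch.randn(4, 16, 10, 10), torch.randn(4, 32))
    print(out.shape)                                             # torch.Size([4, 16, 10, 10])

Compared with concatenation, the condition here directly rescales every feature channel rather than merely being appended as extra input dimensions; the ablation studies cited above compare these two ways of merging the information.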