Neural Learning of One-of-Many Solutions for Combinatorial Problems in Structured Output Spaces

Yatin Nandwani
Deepanshu Jindal

ICLR 2021.

TL;DR:
This work identifies and proposes a solution for handling solution multiplicity while learning neural methods for combinatorial problems in structured output spaces.

Abstract:

Recent research has proposed neural architectures for solving combinatorial problems in structured output spaces. In many such problems, there may exist multiple solutions for a given input, e.g., a partially filled Sudoku puzzle may have many completions satisfying all constraints. Further, we are often interested in finding any one of the possible solutions […]

Introduction
  • Neural networks have become the de facto standard for solving perceptual tasks over low-level representations, such as pixels in an image or audio signals.
  • ‘Partial Label Learning’ (PLL) (Jin & Ghahramani, 2002; Cour et al., 2011; Xu et al., 2019; Feng & An, 2019; Cabannes et al., 2020) involves learning from training data in which, for each input, a noisy set of candidate labels is given, among which only one label is correct.
  • Moving to structured output spaces exponentially increases the size of the output space, making it intractable to enumerate all possible solutions, as is typically done in existing approaches for PLL (Jin & Ghahramani, 2002); a minimal sketch of such an enumeration-based loss follows this list.
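To make the enumeration issue concrete, below is a minimal sketch (not the authors' code; `min_over_candidates_loss`, `logits`, and `candidates` are illustrative names) of a PLL-style objective that scores every candidate label and trains on the best one. For combinatorial problems, `candidates` would have to range over every valid solution of the input, which is exactly what becomes intractable in structured output spaces.

```python
# Sketch of an enumeration-based PLL loss: back-propagate through the
# minimum cross-entropy over the candidate label set. Illustrative only.
import torch
import torch.nn.functional as F

def min_over_candidates_loss(logits, candidates):
    """logits: (C,) unnormalized class scores for one input;
    candidates: ids of the candidate labels, one of which is correct."""
    losses = torch.stack([
        F.cross_entropy(logits.unsqueeze(0), torch.tensor([y]))
        for y in candidates
    ])
    return losses.min()  # gradient flows only to the current best candidate

# For a structured task such as Sudoku, `candidates` would be the set of
# *all* completions satisfying the constraints -- exponentially many in
# general, so this direct enumeration does not scale.
```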
Highlights
  • Neural networks have become the de facto standard for solving perceptual tasks over low-level representations, such as pixels in an image or audio signals
  • The main goal of our experiments is to evaluate MINLOSS, informed exploration (I-EXPLR), and RL-based exploration (SELECTR) against baseline approaches that completely disregard the problem of solution multiplicity
  • We report the mean over three random runs, as well as the accuracy of the best of these runs selected via the dev set
  • This suggests that 1oML models that explicitly handle solution multiplicity, even if simply by discarding multiple solutions, are much better than those that do not recognize it at all. Both MINLOSS and exploration-based techniques vastly improve upon the naïve baselines, with dramatic 13-52 pt accuracy gains between Unique and SELECTR on queries with multiple solutions
  • We have identified solution multiplicity as an important aspect of the problem which, if not handled properly, may result in sub-optimal models
  • Experiments on three different tasks using two different prediction networks demonstrate the effectiveness of our approach in training robust models under solution multiplicity
Methods
  • The main goal of the experiments is to evaluate MINLOSS, informed exploration (I-EXPLR), and RL-based exploration (SELECTR) against baseline approaches that completely disregard the problem of solution multiplicity.
  • The authors wish to assess the performance gap, if any, between queries with a unique solution and those with many possible solutions
  • To answer these questions, the authors conduct experiments on three different tasks (N-Queens, Futoshiki, and Sudoku), each trained with two different prediction networks, as described below.
  • To model N-Queens within NLM, the authors represent a query x and the target y as N²-dimensional Boolean vectors with a 1 at each location where a queen is placed (see the encoding sketch after this list).
  • The authors use another smaller NLM architecture as the latent model Gφ
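As a concrete illustration of the board encoding just described, here is a minimal sketch (illustrative names, not the authors' code) that flattens queen positions into the N²-dimensional Boolean vector used as input:

```python
# Sketch of the N^2 Boolean encoding for N-Queens boards.
import numpy as np

def encode_queens(positions, n):
    """positions: iterable of (row, col) squares holding a queen on an
    n x n board. Returns a flat length-n*n 0/1 vector."""
    board = np.zeros(n * n, dtype=np.int8)
    for r, c in positions:
        board[r * n + c] = 1
    return board

# A partially filled 8-Queens query x with two queens already placed:
x = encode_queens([(0, 4), (3, 1)], n=8)
# Any valid target y is a full 8-queen placement that keeps the queens
# of x, i.e. y[i] >= x[i] for every cell i.
```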
Results
  • The authors report the accuracies across all tasks and models in Table 2.
  • The authors first observe that Naïve and Random perform significantly worse than Unique in all the tasks, not only on MS but on OS as well
  • This suggests that 1oML models that explicitly handle solution multiplicity, even if simply by discarding multiple solutions, are much better than those that do not recognize it at all.
Conclusion
  • In this paper, the authors have defined 1oML: the task of learning one of many solutions for combinatorial problems in structured output spaces.
  • The authors have identified solution multiplicity as an important aspect of the problem which, if not handled properly, may result in sub-optimal models.
  • As a first-cut solution, the authors proposed a greedy approach: the MINLOSS formulation (compared schematically with SELECTR after this list).
  • The authors identified certain shortcomings of the greedy approach and proposed an RL-based formulation, SELECTR, which overcomes some of the issues in MINLOSS by exploring locally sub-optimal choices for better global optimization.
  • Experiments on three different tasks using two different prediction networks demonstrate the effectiveness of the approach in training robust models under solution multiplicity.
  • The authors will make all the code and the datasets publicly available
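The contrast between the two strategies can be summarized in a short schematic (illustrative names such as `pred_net`, `selector`, and `loss_fn`; a simplification of the training loop as described in this summary, with reward shaping and variance-reduction baselines omitted):

```python
# Schematic of one training step on a query x with solution set Ys,
# under MINLOSS (greedy) vs. SELECTR (RL-based exploration).
import torch

def minloss_step(pred_net, loss_fn, x, Ys, opt_f):
    """MINLOSS: train on whichever solution currently incurs the
    smallest loss under the prediction network (greedy choice)."""
    out = pred_net(x)
    losses = torch.stack([loss_fn(out, y) for y in Ys])
    loss = losses.min()          # gradient reaches only the argmin solution
    opt_f.zero_grad(); loss.backward(); opt_f.step()

def selectr_step(pred_net, selector, loss_fn, x, Ys, opt_f, opt_g):
    """SELECTR: a latent selection module G_phi defines a distribution
    over the solutions; sample one (possibly locally sub-optimal), fit
    the prediction network to it, and update G_phi via REINFORCE."""
    probs = selector(x, Ys)                       # shape (len(Ys),), sums to 1
    dist = torch.distributions.Categorical(probs)
    j = dist.sample()
    loss = loss_fn(pred_net(x), Ys[int(j)])
    opt_f.zero_grad(); loss.backward(); opt_f.step()
    reward = -loss.detach()                       # lower loss => higher reward
    rl_loss = -dist.log_prob(j) * reward          # REINFORCE gradient estimator
    opt_g.zero_grad(); rl_loss.backward(); opt_g.step()
```

Sampling from `selector` rather than always taking the argmin is what lets SELECTR escape the locally optimal but globally poor target choices that the greedy MINLOSS step can lock onto.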
Tables
  • Table 1: Statistics of the datasets. ‘Train’, ‘Test’, and task names are abbreviated. The dev set is similar to the test set.
  • Table 2: Mean (max) test accuracy over three runs for MINLOSS and SELECTR compared with baselines. OS: test queries with only one solution; MS: queries with more than one solution.
  • Table 3: Mean test accuracy and standard error over three runs for MINLOSS and SELECTR compared with baselines. OS: test queries with only one solution; MS: queries with more than one solution.
  • Table 4: Seed-wise gains of SELECTR over MINLOSS across different random seeds and experiments.
Findings
  • Experiments on three different domains, using two different prediction networks, demonstrate that our framework significantly improves accuracy in our setting, obtaining up to a 21 pt gain over the baselines
  • Our preliminary analysis of a state-of-the-art neural Sudoku solver (Palm et al., 2018), which trains and tests on instances with single solutions, showed that it achieves a high accuracy of 96% on instances with a single solution, but that accuracy drops below 25% when tested on inputs that have multiple solutions
References
  • Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. Neural machine translation by jointly learning to align and translate. In Yoshua Bengio and Yann LeCun (eds.), 3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings, 2015. URL http://arxiv.org/abs/1409.0473.
  • Rudy Bunel, Matthew J. Hausknecht, Jacob Devlin, Rishabh Singh, and Pushmeet Kohli. Leveraging grammar and reinforcement learning for neural program synthesis. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net, 2018. URL https://openreview.net/forum?id=H1Xw62kRZ.
  • Vivien Cabannes, Alessandro Rudi, and Francis Bach. Structured prediction with partial labelling through the infimum loss. CoRR, abs/2003.00920, 2020. URL https://arxiv.org/abs/2003.00920.
  • Timothée Cour, Benjamin Sapp, and Ben Taskar. Learning from partial labels. J. Mach. Learn. Res., 12:1501–1536, 2011. URL http://dl.acm.org/citation.cfm?id=2021049.
  • Emily Denton and Rob Fergus. Stochastic video generation with a learned prior. In Jennifer G. Dy and Andreas Krause (eds.), Proceedings of the 35th International Conference on Machine Learning, ICML 2018, Stockholmsmässan, Stockholm, Sweden, July 10-15, 2018, volume 80 of Proceedings of Machine Learning Research, pp. 1182–1191. PMLR, 2018. URL http://proceedings.mlr.press/v80/denton18a.html.
  • Jacob Devlin, Jonathan Uesato, Surya Bhupatiraju, Rishabh Singh, Abdel-rahman Mohamed, and Pushmeet Kohli. Robustfill: Neural program learning under noisy I/O. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, Sydney, NSW, Australia, 6-11 August 2017, volume 70 of Proceedings of Machine Learning Research, pp. 990–998. PMLR, 2017. URL http://proceedings.mlr.press/v70/devlin17a.html.
  • Honghua Dong, Jiayuan Mao, Tian Lin, Chong Wang, Lihong Li, and Denny Zhou. Neural logic machines. In 7th International Conference on Learning Representations, ICLR 2019, New Orleans, LA, USA, May 6-9, 2019. OpenReview.net, 2019. URL https://openreview.net/forum?id=B1xY-hRctX.
  • Richard Evans and Edward Grefenstette. Learning explanatory rules from noisy data. J. Artif. Intell. Res., 61:1–64, 2018. doi: 10.1613/jair.5714. URL https://doi.org/10.1613/jair.5714.
  • Jun Feng, Minlie Huang, Li Zhao, Yang Yang, and Xiaoyan Zhu. Reinforcement learning for relation classification from noisy data. In Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, (AAAI-18), the 30th innovative Applications of Artificial Intelligence (IAAI18), and the 8th AAAI Symposium on Educational Advances in Artificial Intelligence (EAAI-18), New Orleans, Louisiana, USA, February 2-7, 2018, pp. 5779–5786. AAAI Press, 2018. URL https://www.aaai.org/ocs/index.php/AAAI/AAAI18/paper/view/17151.
  • Lei Feng and Bo An. Partial label learning with self-guided retraining. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pp. 3542–3549. AAAI Press, 2019. doi: 10.1609/aaai.v33i01.33013542. URL https://doi.org/10.1609/aaai.v33i01.33013542.
  • Mikael Henaff, Junbo Jake Zhao, and Yann LeCun. Prediction under uncertainty with error-encoding networks. CoRR, abs/1711.04994, 2017. URL http://arxiv.org/abs/1711.04994.
  • Rong Jin and Zoubin Ghahramani. Learning with multiple labels. In Suzanna Becker, Sebastian Thrun, and Klaus Obermayer (eds.), Advances in Neural Information Processing Systems 15 [Neural Information Processing Systems, NIPS 2002, December 9-14, 2002, Vancouver, British Columbia, Canada], pp. 897–904. MIT Press, 2002. URL http://papers.nips.cc/paper/2234-learning-with-multiple-labels.
  • Yogesh S. Mahajan, Zhaohui Fu, and Sharad Malik. Zchaff2004: An efficient SAT solver. In Holger H. Hoos and David G. Mitchell (eds.), Theory and Applications of Satisfiability Testing, 7th International Conference, SAT 2004, Vancouver, BC, Canada, May 10-13, 2004, Revised Selected Papers, volume 3542 of Lecture Notes in Computer Science, pp. 360–375. Springer, 2004. doi: 10.1007/11527695_27. URL https://doi.org/10.1007/11527695_27.
  • Gary McGuire, Bastian Tugemann, and Gilles Civario. There is no 16-clue sudoku: Solving the sudoku minimum number of clues problem via hitting set enumeration. Experimental Mathematics, 23:190–217, 2012.
  • Pasquale Minervini, Matko Bošnjak, Tim Rocktäschel, Sebastian Riedel, and Edward Grefenstette. Differentiable reasoning on large knowledge bases and natural language. In Proceedings of the Thirty-Fourth AAAI Conference on Artificial Intelligence. AAAI Press, 2020.
  • Ramesh Nallapati, Feifei Zhai, and Bowen Zhou. Summarunner: A recurrent neural network based sequence model for extractive summarization of documents. In Satinder P. Singh and Shaul Markovitch (eds.), Proceedings of the Thirty-First AAAI Conference on Artificial Intelligence, February 4-9, 2017, San Francisco, California, USA, pp. 3075–3081. AAAI Press, 2017. URL http://aaai.org/ocs/index.php/AAAI/AAAI17/paper/view/14636.
  • Rasmus Berg Palm, Ulrich Paquet, and Ole Winther. Recurrent relational networks. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett (eds.), Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada, pp. 3372–3382, 2018. URL http://papers.nips.cc/paper/7597-recurrent-relational-networks.
  • Kyubyong Park. Can convolutional neural networks crack sudoku puzzles? https://github.com/Kyubyong/sudoku, 2018.
  • Romain Paulus, Caiming Xiong, and Richard Socher. A deep reinforced model for abstractive summarization. In 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, April 30 - May 3, 2018, Conference Track Proceedings. OpenReview.net, 2018. URL https://openreview.net/forum?id=HkAClQgA-.
  • Pengda Qin, Weiran Xu, and William Yang Wang. Robust distant supervision relation extraction via deep reinforcement learning. In Iryna Gurevych and Yusuke Miyao (eds.), Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, ACL 2018, Melbourne, Australia, July 15-20, 2018, Volume 1: Long Papers, pp. 2137–2147. Association for Computational Linguistics, 2018. doi: 10.18653/v1/P18-1199. URL https://www.aclweb.org/anthology/P18-1199/.
  • Tim Rocktäschel, Sameer Singh, and Sebastian Riedel. Injecting logical background knowledge into embeddings for relation extraction. In NAACL HLT 2015, The 2015 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Denver, Colorado, USA, May 31 - June 5, 2015, pp. 1119–1129, 2015. URL http://aclweb.org/anthology/N/N15/N15-1118.pdf.
  • Gordon Royle. Minimum sudoku. https://staffhome.ecm.uwa.edu.au/~00013890/sudokumin.php, 2014.
  • Adam Santoro, Ryan Faulkner, David Raposo, Jack W. Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, and Timothy P. Lillicrap. Relational recurrent neural networks. In Samy Bengio, Hanna M. Wallach, Hugo Larochelle, Kristen Grauman, Nicolò Cesa-Bianchi, and Roman Garnett (eds.), Advances in Neural Information Processing Systems 31: Annual Conference on Neural Information Processing Systems 2018, NeurIPS 2018, 3-8 December 2018, Montréal, Canada, pp. 7310–7321, 2018. URL http://papers.nips.cc/paper/7960-relational-recurrent-neural-networks.