Regularizing Towards Permutation Invariance In Recurrent Models

NeurIPS 2020

Abstract

In many machine learning problems the output should not depend on the order of the input. Such "permutation invariant" functions have been studied extensively recently. Here we argue that temporal architectures such as RNNs are highly relevant for such problems, despite the inherent dependence of RNNs on order. We show that RNNs can be ...

Introduction
  • One of the most successful models of the current deep-learning renaissance is the convolutional neural network (CNN) [Krizhevsky et al, 2012], which exploits domain-specific properties such as the invariance of images to specific spatial transformations.
  • In this work the authors consider the setting where the learned function's output should not depend on the order of its inputs.
Highlights
  • In recent years deep learning has shown remarkable performance in a vast range of applications from natural language processing to autonomous vehicles.

    One of the most successful models of the current deep-learning renaissance is the convolutional neural network (CNN) [Krizhevsky et al, 2012], which exploits domain-specific properties such as the invariance of images to specific spatial transformations.
  • We focus on standard recurrent neural networks (RNNs) in what follows, but our approach applies to any recurrent model
  • We have introduced a novel approach for modeling permutation invariant learning problems by using recurrent architectures
  • We further discuss the permutation invariant parity function, for which methods based on a fixed aggregation, such as DeepSets, need a large number of parameters, whereas simple recurrent models can implement parity with O(1) parameters (see the sketch after this list).
  • In addition to the above, we consider a setting where the data is partially permutation invariant. This property cannot be captured by architectures that are fully permutation invariant by design, and this non-invariant case is typically solved using RNNs
  • We show that adding our regularization term helps in such “semi” permutation invariant problems
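
To make the O(1) claim concrete, here is a minimal sketch of a recurrent parity computation. It is a hypothetical illustration, not the construction from the paper: a single-unit state update implements XOR, so the number of parameters is constant regardless of sequence length.

```python
# Minimal sketch (illustrative, not the paper's construction): a one-dimensional
# recurrent "cell" that computes the parity of a binary sequence.
# The update h_t = h_{t-1} XOR x_t can be written with fixed coefficients,
# h_t = h_{t-1} + x_t - 2 * h_{t-1} * x_t, which keeps the state in {0, 1}.

def parity_rnn(bits):
    """Recurrent parity: the state is a single number, parameters are constant."""
    h = 0.0
    for x in bits:
        # XOR written arithmetically; the coefficients (1, 1, -2) do not grow
        # with sequence length, i.e. O(1) parameters.
        h = h + x - 2.0 * h * x
    return int(h)

if __name__ == "__main__":
    for seq in ([0, 1, 1, 0, 1], [1, 1], [0, 0, 0]):
        assert parity_rnn(seq) == sum(seq) % 2
        print(seq, "->", parity_rnn(seq))
```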
Methods
  • In Table 1 the authors report point cloud classification results for n = 100, 1000, 5000 points; the recoverable excerpt of the table shows DeepSets at 0.825 and 0.872 accuracy.
  • One advantage of the method is that the level of invariance an RNN should capture can be tuned via the regularization coefficient λ (see the sketch after this list).
  • This may be useful in real-world datasets where the data is permutation invariant to some extent.
  • Table 2 excerpt (test accuracy, mean (std), on the half-range problem for several sequence lengths, including Len=20): λ = 0.0: 0.9346 (0.006), 0.9461 (0.001), 0.9678 (0.005); λ = 0.01: 0.9584 (0.008), 0.9658 (0.008), 0.9780 (0.004).
  • Note that setting λ = 0 amounts to a vanilla RNN without regularization.
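
The exact form of the regularizer is not reproduced on this page, so the following is a hedged sketch of the general recipe only: train an RNN on its task loss plus λ times a penalty on how much the final hidden state changes when the input sequence is permuted. Setting lam = 0 reduces to a vanilla RNN, matching the note above. The model, penalty, and data below are illustrative assumptions, not the authors' exact setup.

```python
# Hedged sketch (assumed form, not necessarily the paper's exact regularizer):
# penalize the sensitivity of the RNN's final hidden state to a random
# permutation of the inputs, weighted by a coefficient `lam` (the λ above).
import torch
import torch.nn as nn

class RegularizedRNN(nn.Module):
    def __init__(self, in_dim=1, hidden=64, out_dim=1):
        super().__init__()
        self.rnn = nn.RNN(in_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, out_dim)

    def forward(self, x):                      # x: (batch, length, in_dim)
        _, h = self.rnn(x)                     # h: (1, batch, hidden)
        return self.head(h[-1]), h[-1]

def training_step(model, x, y, lam=0.01):
    pred, h = model(x)
    task_loss = nn.functional.mse_loss(pred, y)
    # Invariance penalty: the final state should match the state reached on a
    # randomly permuted copy of the same inputs.
    perm = torch.randperm(x.size(1))
    _, h_perm = model(x[:, perm, :])
    inv_loss = (h - h_perm).pow(2).mean()
    return task_loss + lam * inv_loss

if __name__ == "__main__":
    model = RegularizedRNN()
    x = torch.randn(8, 20, 1)                  # toy sequences of length 20
    y = x.sum(dim=1)                           # a permutation invariant target
    loss = training_step(model, x, y, lam=0.01)
    loss.backward()
    print(float(loss))
```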
Conclusion
  • The authors have introduced a novel approach for modeling permutation invariant learning problems by using recurrent architectures.
  • In addition to the above, the authors consider a setting where the data is partially permutation invariant.
  • This property cannot be captured by architectures that are fully permutation invariant by design, and this non-invariant case is typically solved using RNNs. The authors show that adding the regularization term helps in such “semi” permutation invariant problems.
Tables
  • Table 1: Point cloud classification results.
  • Table 2: Learning Semi Permutation Invariant models on the half-range problem. Test accuracy for two regularization coefficients and different sequence lengths. Note that setting λ = 0 amounts to a vanilla RNN without regularization.
Related work
  • In recent years, the question of invariances and network architecture has attracted considerable attention, in particular for various forms of permutation invariance. Several works have focused on characterizing architectures that are permutation invariant by design [Zaheer et al, 2017, Vinyals et al, 2016, Qi et al, 2017, Hartford et al, 2018, Lee et al, 2019, Zellers et al, 2018].

    While the above works address invariance for sets, there has also been work on invariance of computations on graphs [Maron et al, 2019, Herzig et al, 2018]. In these, the focus is on problems that take a graph as input, and the goal is for the output to be invariant to all equivalent representations of the graph.

    The most relevant line of work relating to ours is Murphy et al [2018], which suggests viewing a permutation invariant function as an average, over all possible orderings of the input, of a permutation variant function. As this approach is intractable, the authors suggest a few efficient approximations; a minimal sketch of the averaging idea follows below.
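
As an illustration of the averaging idea described above (only the idea; the actual estimators of Murphy et al [2018] are more involved), the sketch below averages an order sensitive function over all orderings for tiny sets, and over a few sampled orderings as a tractable approximation.

```python
# Minimal sketch of the averaging-over-orderings idea (illustrative only):
# make a permutation variant function f approximately invariant by averaging it
# over orderings. Averaging over all n! orderings is intractable, so a simple
# approximation samples a few random permutations instead.
import itertools
import random

def average_exact(f, xs):
    """Exact average over all orderings -- only feasible for tiny sets."""
    perms = list(itertools.permutations(xs))
    return sum(f(list(p)) for p in perms) / len(perms)

def average_sampled(f, xs, num_samples=8, seed=0):
    """Monte Carlo approximation: average f over a few random orderings."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        p = xs[:]
        rng.shuffle(p)
        total += f(p)
    return total / num_samples

if __name__ == "__main__":
    # f is deliberately order sensitive: a decaying weighted sum.
    f = lambda seq: sum(x * (0.5 ** i) for i, x in enumerate(seq))
    xs = [3.0, 1.0, 4.0, 1.5]
    print("exact  :", average_exact(f, xs))
    print("sampled:", average_sampled(f, xs))
```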
Funding
  • This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant ERC HOLI 819080).
References
  • Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan R Salakhutdinov, and Alexander J Smola. Deep sets. In Advances in Neural Information Processing Systems, pages 3391–3401, 2017.
  • Juho Lee, Yoonho Lee, Jungtaek Kim, Adam Kosiorek, Seungjin Choi, and Yee Whye Teh. Set transformer: A framework for attention-based permutation-invariant neural networks. In International Conference on Machine Learning, pages 3744–3753, 2019.
  • Ryan L Murphy, Balasubramaniam Srinivasan, Vinayak Rao, and Bruno Ribeiro. Janossy pooling: Learning deep permutation-invariant functions for variable-size inputs. arXiv preprint arXiv:1811.01900, 2018.
  • Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. ImageNet classification with deep convolutional neural networks. In Advances in Neural Information Processing Systems, pages 1097–1105, 2012.
  • Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. PointNet: Deep learning on point sets for 3D classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 652–660, 2017.
  • Oriol Vinyals, Samy Bengio, and Manjunath Kudlur. Order matters: Sequence to sequence for sets. In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings, 2016.
  • Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997.
  • Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
  • Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences, 58(1):137–147, 1999.
  • Guido F Montufar, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. On the number of linear regions of deep neural networks. In Advances in Neural Information Processing Systems, pages 2924–2932, 2014.
  • Jason Hartford, Devon Graham, Kevin Leyton-Brown, and Siamak Ravanbakhsh. Deep models of interactions across sets. In International Conference on Machine Learning, pages 1914–1923, 2018.
  • Rowan Zellers, Mark Yatskar, Sam Thomson, and Yejin Choi. Neural motifs: Scene graph parsing with global context. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5831–5840, 2018.
  • Haggai Maron, Heli Ben-Hamu, Nadav Shamir, and Yaron Lipman. Invariant and equivariant graph networks. In 7th International Conference on Learning Representations, ICLR, 2019.
  • Roei Herzig, Moshiko Raboh, Gal Chechik, Jonathan Berant, and Amir Globerson. Mapping images to scene graphs with permutation-invariant structured prediction. In Advances in Neural Information Processing Systems, pages 7211–7221, 2018.
  • Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. ShapeNet: An information-rich 3D model repository. arXiv preprint arXiv:1512.03012, 2015.
  • Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 3D ShapeNets: A deep representation for volumetric shapes. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 1912–1920, 2015.
  • Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
  • Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th International Conference on Machine Learning, pages 473–480, 2007.
  • Norman Mu and Justin Gilmer. MNIST-C: A robustness benchmark for computer vision. arXiv preprint arXiv:1906.02337, 2019.
  • Ian J Goodfellow, Mehdi Mirza, Da Xiao, Aaron Courville, and Yoshua Bengio. An empirical investigation of catastrophic forgetting in gradient-based neural networks. arXiv preprint arXiv:1312.6211, 2013.
  • Rupesh K Srivastava, Jonathan Masci, Sohrob Kazerounian, Faustino Gomez, and Jürgen Schmidhuber. Compete to compute. In Advances in Neural Information Processing Systems, pages 2310–2318, 2013.
Authors
Edo Cohen-Karlik
Avichai Ben David