Regularizing Towards Permutation Invariance In Recurrent Models
NeurIPS 2020
In many machine learning problems the output should not depend on the order of the inputs. Such "permutation invariant" functions have been studied extensively in recent years. Here we argue that temporal architectures such as RNNs are highly relevant for such problems, despite the inherent dependence of RNNs on order. We show that RNNs can be …
- One of the most successful models of the current deep-learning renaissance is the convolutional neural network (CNN) [Krizhevsky et al., 2012], which exploits domain-specific properties such as the invariance of images to specific spatial transformations.
- In this work the authors consider the setting where learned functions are such that the order of the inputs does not affect the output value.
- In recent years deep learning has shown remarkable performance in a vast range of applications from natural language processing to autonomous vehicles.
- We focus on standard recurrent neural networks (RNNs) in what follows, but our approach applies to any recurrent model
- We have introduced a novel approach for modeling permutation invariant learning problems by using recurrent architectures
- We further discuss the permutation invariant parity function, for which fixed-aggregation methods such as DeepSets need a large number of parameters, whereas simple recurrent models can implement parity with O(1) parameters
- In addition to the above, we consider a setting where the data is partially permutation invariant. This property cannot be captured by architectures that are fully permutation invariant by design, and this non-invariant case is typically solved using RNNs
- We show that adding our regularization term helps in such “semi” permutation invariant problems
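On the parity point above: the construction can be made concrete with a hand-coded recurrent update. This is a minimal sketch, not the paper's trained model; a single XOR-style cell is one exact realization of parity with a parameter count independent of sequence length.

```python
def parity_rnn(bits):
    # Constant-size recurrent update: h_t = h_{t-1} XOR x_t.
    # The hidden state is a single bit, so the "parameter count"
    # does not grow with input length -- the O(1) construction
    # the key points allude to.
    h = 0
    for x in bits:
        h ^= x
    return h
```

For example, `parity_rnn([1, 0, 1, 1])` returns 1, since the sequence contains an odd number of ones. A fixed-aggregation model, by contrast, must pool the inputs before reading out, which is what makes parity hard for it.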
- Table 1 (excerpt):

              100 pts   1000 pts   5000 pts
    DeepSets  0.825     0.872

In Table 1 the authors report results for n = 100, 1000, 5000.
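DeepSets, the baseline reported in Table 1, is permutation invariant by construction. A minimal sketch of its sum-pooling recipe, with hypothetical `phi` and `rho` standing in for the learned networks of Zaheer et al. [2017]:

```python
def phi(x):
    # Per-element encoder (a hypothetical stand-in for a learned network).
    return (x, x * x)

def rho(z):
    # Post-pooling readout (also a hypothetical stand-in).
    return z[0] + z[1]

def deep_sets(xs):
    # Sum pooling makes the output order-independent:
    # f(X) = rho(sum_i phi(x_i)).
    pooled = [sum(features) for features in zip(*(phi(x) for x in xs))]
    return rho(pooled)
```

Because addition is commutative, `deep_sets([3.0, 1.0, 2.0])` and `deep_sets([2.0, 3.0, 1.0])` give the same value; no regularization is needed, but the fixed pooling is also what limits the architecture on functions like parity.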
- One advantage of the method is that it allows tuning the level of invariance an RNN should capture.
- This may be useful in real-world datasets where the data is permutation invariant to some extent.
- Table 2: Learning Semi Permutation Invariant models on the half-range problem (test accuracy, standard deviation in parentheses):

               Len=20
    λ = 0.0    0.9346 (0.006)   0.9461 (0.001)   0.9678 (0.005)
    λ = 0.01   0.9584 (0.008)   0.9658 (0.008)   0.9780 (0.004)
- Note that setting λ = 0 amounts to a vanilla RNN without regularization
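To make the role of λ concrete, here is a minimal sketch of an invariance-regularized objective for a toy one-unit RNN. The weights, function names, and the shuffled-copy penalty are illustrative stand-ins, not the paper's definition (the actual regularizer is stated over reorderings of the input); the point is only that λ trades off task loss against order sensitivity, and λ = 0 recovers the vanilla RNN.

```python
import math
import random

# Hypothetical fixed weights for a one-unit vanilla tanh RNN
# (stand-ins for learned parameters).
W_H, W_X = 0.5, 0.8

def final_state(xs):
    # Plain recurrent update: h_t = tanh(W_H * h_{t-1} + W_X * x_t).
    h = 0.0
    for x in xs:
        h = math.tanh(W_H * h + W_X * x)
    return h

def invariance_penalty(xs, rng):
    # Simplified surrogate of the paper's regularizer: penalize the gap
    # between the final hidden states of the sequence and a shuffled copy.
    shuffled = list(xs)
    rng.shuffle(shuffled)
    return (final_state(xs) - final_state(shuffled)) ** 2

def regularized_loss(task_loss, xs, lam, rng=random.Random(0)):
    # lam = 0 recovers the unregularized vanilla-RNN objective.
    return task_loss + lam * invariance_penalty(xs, rng)
```

Since the penalty is a squared difference, it is non-negative, and any λ > 0 pushes the recurrent dynamics toward producing the same final state regardless of input order, which is the behavior Table 2 measures.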
- Table 1: Point cloud classification results
- Table 2: Learning Semi Permutation Invariant models on the half-range problem. Test accuracy for two regularization coefficients and different sequence lengths. Note that setting λ = 0 amounts to a vanilla RNN without regularization
- In recent years, the question of invariances and network architecture has attracted considerable attention, in particular for various forms of permutation invariance. Several works have focused on characterizing architectures that are "by design" permutation invariant [Zaheer et al., 2017, Vinyals et al., 2016, Qi et al., 2017, Hartford et al., 2018, Lee et al., 2019, Zellers et al., 2018].
While the above works address invariance for sets, there has also been work on invariance of computations on graphs [Maron et al, 2019, Herzig et al, 2018]. In these, the focus is on problems that take a graph as input, and the goal is for the output to be invariant to all equivalent representations of the graph.
The most relevant line of work relating to ours is Murphy et al. [2018], which suggests viewing a permutation invariant function as the average of a permutation-sensitive function applied to all possible orderings of the input. As this approach is intractable, the authors suggest a few efficient approximations.
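The Janossy-pooling idea of Murphy et al. [2018] can be sketched as follows; the order-sensitive function here is a hypothetical stand-in for an RNN, and the sampled average is the tractable approximation the text mentions.

```python
import random
from itertools import permutations

def order_sensitive(xs):
    # Deliberately order-dependent (a stand-in for a sequence model):
    # weight each element by its position.
    return sum((i + 1) * x for i, x in enumerate(xs))

def exact_invariant(xs):
    # The permutation-invariant version: average over all |xs|! orderings.
    perms = list(permutations(xs))
    return sum(order_sensitive(p) for p in perms) / len(perms)

def sampled_invariant(xs, n_samples, seed=0):
    # Tractable approximation in the spirit of Murphy et al. [2018]:
    # average over a few random orderings instead of all of them.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        p = list(xs)
        rng.shuffle(p)
        total += order_sensitive(p)
    return total / n_samples
```

The exact average is invariant by symmetry but costs |xs|! evaluations; sampling a handful of orderings gives an unbiased estimate, at the price of only approximate invariance.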
- This project has received funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant ERC HOLI 819080)
- Manzil Zaheer, Satwik Kottur, Siamak Ravanbakhsh, Barnabas Poczos, Ruslan R Salakhutdinov, and Alexander J Smola. Deep Sets. In Advances in neural information processing systems, pages 3391–3401, 2017.
- Juho Lee, Yoonho Lee, Jungtaek Kim, Adam Kosiorek, Seungjin Choi, and Yee Whye Teh. Set transformer: A framework for attention-based permutation-invariant neural networks. In International Conference on Machine Learning, pages 3744–3753, 2019.
- Ryan L Murphy, Balasubramaniam Srinivasan, Vinayak Rao, and Bruno Ribeiro. Janossy pooling: Learning deep permutation-invariant functions for variable-size inputs. arXiv preprint arXiv:1811.01900, 2018.
- Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. Imagenet classification with deep convolutional neural networks. In Advances in neural information processing systems, pages 1097–1105, 2012.
- Charles R Qi, Hao Su, Kaichun Mo, and Leonidas J Guibas. Pointnet: Deep learning on point sets for 3d classification and segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 652–660, 2017.
- Oriol Vinyals, Samy Bengio, and Manjunath Kudlur. Order matters: Sequence to sequence for sets. In 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings, 2016.
- Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural computation, 9(8):1735–1780, 1997.
- Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
- Noga Alon, Yossi Matias, and Mario Szegedy. The space complexity of approximating the frequency moments. Journal of Computer and System Sciences, 58(1):137–147, 1999.
- Guido F Montufar, Razvan Pascanu, Kyunghyun Cho, and Yoshua Bengio. On the number of linear regions of deep neural networks. In Advances in neural information processing systems, pages 2924–2932, 2014.
- Jason Hartford, Devon Graham, Kevin Leyton-Brown, and Siamak Ravanbakhsh. Deep models of interactions across sets. In International Conference on Machine Learning, pages 1914–1923, 2018.
- Rowan Zellers, Mark Yatskar, Sam Thomson, and Yejin Choi. Neural motifs: Scene graph parsing with global context. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 5831–5840, 2018.
- Haggai Maron, Heli Ben Hamu, Nadav Shamir, and Yaron Lipman. Invariant and equivariant graph networks. In 7th International Conference on Learning Representations, ICLR, 2019.
- Roei Herzig, Moshiko Raboh, Gal Chechik, Jonathan Berant, and Amir Globerson. Mapping images to scene graphs with permutation-invariant structured prediction. In Advances in Neural Information Processing Systems, pages 7211–7221, 2018.
- Angel X Chang, Thomas Funkhouser, Leonidas Guibas, Pat Hanrahan, Qixing Huang, Zimo Li, Silvio Savarese, Manolis Savva, Shuran Song, Hao Su, et al. Shapenet: An information-rich 3d model repository. arXiv preprint arXiv:1512.03012, 2015.
- Zhirong Wu, Shuran Song, Aditya Khosla, Fisher Yu, Linguang Zhang, Xiaoou Tang, and Jianxiong Xiao. 3d shapenets: A deep representation for volumetric shapes. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 1912–1920, 2015.
- Yann LeCun, Léon Bottou, Yoshua Bengio, and Patrick Haffner. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11):2278–2324, 1998.
- Hugo Larochelle, Dumitru Erhan, Aaron Courville, James Bergstra, and Yoshua Bengio. An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th international conference on Machine learning, pages 473–480, 2007.
- Norman Mu and Justin Gilmer. Mnist-c: A robustness benchmark for computer vision. arXiv preprint arXiv:1906.02337, 2019.
- Ian J Goodfellow, Mehdi Mirza, Da Xiao, Aaron Courville, and Yoshua Bengio. An empirical investigation of catastrophic forgetting in gradient-based neural networks. arXiv preprint arXiv:1312.6211, 2013.
- Rupesh K Srivastava, Jonathan Masci, Sohrob Kazerounian, Faustino Gomez, and Jürgen Schmidhuber. Compete to compute. In Advances in neural information processing systems, pages 2310–2318, 2013.