# Baxter Permutation Process

NIPS 2020, (2020)

Abstract

In this paper, a Bayesian nonparametric (BNP) model for Baxter permutations (BPs), termed the BP process (BPP), is proposed and applied to relational data analysis. BPs are a well-studied class of permutations, and it has been shown that there is a one-to-one correspondence between BPs and several interesting objects, including floorpl...

Introduction

- Bayesian nonparametric (BNP) methods can overcome the model complexity problem of machine learning tasks, as they can be regarded as an analysis of finite subsets of potentially infinite data using infinite-dimensional probabilistic models, i.e., stochastic processes.
- The authors develop a BNP model of Baxter permutations (BPs)
- This model involves new stochastic processes and is applied to relational data analysis.
- For rectangular partitioning (RP) models, the infinite relational model (IRM) [33] and the Mondrian process (MP) [49, 48] have been widely studied and applied to real-world applications.
- However, these models cannot represent arbitrary RPs.
- That is, their supports are limited to some subsets of all possible RPs (Figure 1, second and third).
- The rectangular tiling process (RTP) can represent arbitrary RPs, but its model construction is overly complicated owing to its projectivity property, and it is not well suited to Bayesian inference.
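
For readers unfamiliar with the objects being modelled: a permutation is Baxter exactly when it avoids the two vincular patterns 2-41-3 and 3-14-2. The brute-force check below is a minimal sketch of that standard definition, not part of the authors' model; the names `is_baxter` and `count_baxter` are illustrative only.

```python
from itertools import permutations


def is_baxter(perm):
    """Return True if perm (a sequence of distinct numbers) is a Baxter permutation.

    A permutation is Baxter when there are no positions i < j < j+1 < k with
      perm[j+1] < perm[i] < perm[k] < perm[j]   (pattern 2-41-3) or
      perm[j] < perm[k] < perm[i] < perm[j+1]   (pattern 3-14-2).
    """
    n = len(perm)
    for j in range(n - 1):
        a, b = perm[j], perm[j + 1]  # the adjacent pair in the middle of the pattern
        for i in range(j):
            for k in range(j + 2, n):
                x, y = perm[i], perm[k]
                if b < x < y < a:  # occurrence of 2-41-3
                    return False
                if a < y < x < b:  # occurrence of 3-14-2
                    return False
    return True


def count_baxter(n):
    """Count Baxter permutations of {1, ..., n} by exhaustive enumeration."""
    return sum(is_baxter(p) for p in permutations(range(1, n + 1)))
```

For n = 4, only 2413 and 3142 are excluded, so `count_baxter(4)` gives 22 of the 24 permutations. The enumeration is factorial in n and is meant for small sanity checks only.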

Highlights

- Bayesian nonparametric (BNP) methods can overcome the model complexity problem of machine learning tasks, as they can be regarded as an analysis of finite subsets of potentially infinite data using infinite-dimensional probabilistic models, i.e., stochastic processes
- To describe the evolution of the Baxter permutation (BP) process (BPP), we introduce auxiliary variables consisting of a sequence of independent and identically distributed (i.i.d.) uniform random variables U1, U2, ... on [0, 1].
- We introduce a sequence of i.i.d. beta random variables into the BPP to control the sizes of the rooms of the floorplan partitioning (FP) drawn from the BPP.
- Our main contributions are as follows: (1) We have presented the BNP model of the BP as a Markov process consisting of a sequence of i.i.d. uniform random variables on [0, 1]. Owing to the one-to-one correspondence between BP and FP, the model can be used as a probabilistic model on the set of all possible FPs.
- (2) We combined the BPP with the block-breaking process (BBP) to obtain a stochastic process for arbitrary rectangular partitioning (RP).
- The BBP can be regarded as a multi-dimensional extension of clustering, and it has the potential to give a new perspective on relational data analysis, because it can reveal latent structures in relational data much more flexibly than existing clustering methods, without tuning the model complexity.

Results

- The authors held out 20% of the cells of the input data for testing, and each model was trained by MCMC using the remaining 80% of the cells.
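
The evaluation protocol above can be sketched as follows. This is a minimal illustration assuming binary relational data and assuming perplexity is computed as the exponential of the negative mean held-out log-likelihood; the summary does not state the exact formula, and the helper names `split_cells` and `perplexity` are hypothetical.

```python
import math
import random


def split_cells(n_rows, n_cols, test_fraction=0.2, seed=0):
    """Randomly split the cells of an n_rows x n_cols matrix into (train, test) lists."""
    cells = [(i, j) for i in range(n_rows) for j in range(n_cols)]
    rng = random.Random(seed)
    rng.shuffle(cells)
    n_test = round(test_fraction * len(cells))
    return cells[n_test:], cells[:n_test]


def perplexity(data, prob, cells, eps=1e-12):
    """exp(-mean Bernoulli log-likelihood) over the given cells (assumed definition).

    data[i][j] is 0 or 1; prob[i][j] is the predicted probability of a 1.
    """
    total = 0.0
    for i, j in cells:
        p = min(max(prob[i][j], eps), 1.0 - eps)  # guard against log(0)
        total += math.log(p) if data[i][j] == 1 else math.log(1.0 - p)
    return math.exp(-total / len(cells))
```

Under this definition, a model that predicts 0.5 for every cell has perplexity 2, which gives a simple baseline for reading a perplexity comparison.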

Conclusion

- This paper has proposed new stochastic processes. The authors' main contributions are as follows: (1) The authors presented the BNP model of the BP as a Markov process consisting of a sequence of i.i.d. uniform random variables on [0, 1].
- (2) The authors combined the BPP with the BBP to obtain a stochastic process for arbitrary RPs. As in conventional methods, the authors applied this process to the Aldous-Hoover-Kallenberg (AHK) representation to construct a BNP stochastic block model for relational data, and compared its predictive performance with that of the IRM, MP, and RTP.
- The block-breaking process (BBP) can be regarded as a multi-dimensional extension of clustering, and it has the potential to give a new perspective on relational data analysis, because it can reveal latent structures in relational data much more flexibly than existing clustering methods, without tuning the model complexity.
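
To make the stochastic block model construction concrete, the sketch below generates a binary relational matrix from a fixed rectangular partition of the unit square: each row and column receives a latent uniform coordinate, and each entry is drawn from the link probability of the rectangle containing its coordinate pair. The partition is a hand-written pinwheel layout (neither a regular grid nor obtainable by recursive end-to-end cuts), the link probabilities are made up, and nothing here samples the partition from the BPP or BBP; all names are hypothetical.

```python
import random

# Hypothetical pinwheel partition of [0, 1) x [0, 1).
# Each block is (x0, x1, y0, y1, link_probability); intervals are half-open.
BLOCKS = [
    (0.0, 0.6, 0.0, 0.4, 0.9),
    (0.6, 1.0, 0.0, 0.6, 0.1),
    (0.4, 1.0, 0.6, 1.0, 0.8),
    (0.0, 0.4, 0.4, 1.0, 0.2),
    (0.4, 0.6, 0.4, 0.6, 0.5),
]


def block_index(u, v, blocks=BLOCKS):
    """Return the index of the rectangle that contains the point (u, v)."""
    for idx, (x0, x1, y0, y1, _) in enumerate(blocks):
        if x0 <= u < x1 and y0 <= v < y1:
            return idx
    raise ValueError("point is not covered by the partition")


def sample_relational_matrix(n_rows, n_cols, blocks=BLOCKS, seed=0):
    """Draw latent uniform coordinates, then Bernoulli entries block by block."""
    rng = random.Random(seed)
    u = [rng.random() for _ in range(n_rows)]  # latent row coordinates
    v = [rng.random() for _ in range(n_cols)]  # latent column coordinates
    x = [
        [int(rng.random() < blocks[block_index(ui, vj, blocks)][4]) for vj in v]
        for ui in u
    ]
    return x, u, v
```

Sorting rows by `u` and columns by `v` would reveal the pinwheel block structure in the sampled matrix; inference runs in the opposite direction, recovering the partition and coordinates from observed data.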

Summary

## Introduction:

- Bayesian nonparametric (BNP) methods can overcome the model complexity problem of machine learning tasks, as they can be regarded as an analysis of finite subsets of potentially infinite data using infinite-dimensional probabilistic models, i.e., stochastic processes.
- The authors develop a BNP model of Baxter permutations (BPs).
- This model involves new stochastic processes and is applied to relational data analysis.
- For rectangular partitioning (RP) models, the infinite relational model (IRM) [33] and the Mondrian process (MP) [49, 48] have been widely studied and applied to real-world applications.
- However, these models cannot represent arbitrary RPs.
- That is, their supports are limited to some subsets of all possible RPs (Figure 1, second and third).
- The rectangular tiling process (RTP) can represent arbitrary RPs, but its model construction is overly complicated owing to its projectivity property, and it is not well suited to Bayesian inference.

## Objectives:

The aim of this paper is to construct a new BNP model for arbitrary RPs that has a simple description and high affinity with Bayesian inference.

## Results:

The authors held out 20% of the cells of the input data for testing, and each model was trained by MCMC using the remaining 80% of the cells.

## Conclusion:

- This paper has proposed new stochastic processes. The authors' main contributions are as follows: (1) The authors presented the BNP model of the BP as a Markov process consisting of a sequence of i.i.d. uniform random variables on [0, 1].
- (2) The authors combined the BPP with the BBP to obtain a stochastic process for arbitrary RPs. As in conventional methods, the authors applied this process to the Aldous-Hoover-Kallenberg (AHK) representation to construct a BNP stochastic block model for relational data, and compared its predictive performance with that of the IRM, MP, and RTP.
- The block-breaking process (BBP) can be regarded as a multi-dimensional extension of clustering, and it has the potential to give a new perspective on relational data analysis, because it can reveal latent structures in relational data much more flexibly than existing clustering methods, without tuning the model complexity.

- Table 1: Perplexity comparison for real-world relational data analysis (mean ± std)

Funding

- Funding in direct support of this work is from NTT Corporation, without any third-party funding.

Reference

- http://snap.stanford.edu/data/wiki-Vote.html
- http://snap.stanford.edu/data/ego-Facebook.html
- http://snap.stanford.edu/data/ego-Twitter.html
- http://snap.stanford.edu/data/soc-Epinions1.html
- Airoldi, E.M., Costa, T.B., Chan, S.H.: Stochastic blockmodel approximation of a graphon: Theory and consistent estimation. In: Advances in Neural Information Processing Systems (2013)
- Aldous, D.J.: Representations for partially exchangeable arrays of random variables. Journal of Multivariate Analysis 11, 581–598 (1981)
- Aldous, D.J.: Exchangeability and related topics. École d’Été St Flour, Lecture Notes in Mathematics 1117, 1–198 (1985)
- Andrieu, C., Doucet, A., Holenstein, R.: Particle Markov chain Monte Carlo methods. Journal of the Royal Statistical Society: Series B (Statistical Methodology) 72(3), 269–342 (2010)
- Baxter, G.: On fixed points of the composite of commuting functions. Proceedings of the American Mathematical Society 15, 851–855 (1964)
- Bochner, S.: Harmonic analysis and the theory of probability. University of California Press (1955)
- Bouchard-Côté, A., Jordan, M.: Variational inference over combinatorial spaces. In: Advances in Neural Information Processing Systems (2010)
- Burridge, J., Cowan, R., Ma, I.: Full and half Gilbert tessellations with rectangular cells. Advances in Applied Probability 1, 1–19 (2013)
- Caldas, J., Kaski, S.: Bayesian biclustering with the plaid model. In: 2008 IEEE Workshop on Machine Learning for Signal Processing. pp. 291–296 (2008)
- Choi, D.S., Wolfe, P.J.: Co-clustering separately exchangeable network data. Annals of Statistics 42, 29–63 (2014)
- Chung, F., Graham, R., Hoggatt, V., Kleiman, M.: The number of Baxter permutations. Journal of Combinatorial Theory, Series A 24, 382–394 (1978)
- Crane, H.: Infinitely exchangeable partition, tree and graph-valued stochastic process. Ph.D. thesis, Department of Statistics, The University of Chicago (2012)
- Dilks, K.: Quarter-turn Baxter permutations. arXiv:1710.07007 (2017)
- Dulucq, S., Guibert, O.: Baxter permutations. Discrete Mathematics 180, 143–156 (1998)
- Fan, X., Li, B., Sisson, S.A.: The binary space partitioning-tree process. In: International Conference on Artificial Intelligence and Statistics. pp. 1859–1867 (2018)
- Fan, X., Li, B., Luo, L., Sisson, S.A.: Bayesian nonparametric space partitions: A survey. arXiv:2002.11394 (2020)
- Fan, X., Li, B., Sisson, S.: Rectangular bounding process. In: Advances in Neural Information Processing Systems. pp. 7631–7641 (2018)
- Fan, X., Li, B., Sisson, S.A.: Online binary space partitioning forests. arXiv:2003.00269 (2020)
- Fan, X., Li, B., Sisson, S.A.: Binary space partitioning forests. arXiv:1903.09348 (2019)
- Fan, X., Li, B., Wang, Y., Wang, Y., Chen, F.: The Ostomachion Process. In: AAAI Conference on Artificial Intelligence. pp. 1547–1553 (2016)
- Felsner, S., Fusy, E., Noy, M., Orden, D.: Bijections for Baxter families and related objects. Journal of Combinatorial Theory, Series A 118(3), 993–1020 (2011)
- Ge, S., Wang, S., Teh, Y.W., Wang, L., Elliott, L.: Random tessellation forests. In: Advances in Neural Information Processing Systems 32, pp. 9575–9585 (2019)
- Gilbert, E.N.: Surface films of needle-shaped crystals. Applications of Undergraduate Mathematics in Engineering pp. 329–346 (1967)
- Hong, X., Huang, G., Cai, Y., Gu, J., Dong, S., Cheng, C., Gu, J.: Corner block list: an effective and efficient topological representation of non-slicing floorplan. In: Proceedings of the IEEE/ACM International Conference on Computer-Aided Design (2000)
- Hoover, D.N.: Relations on probability spaces and arrays of random variables. Tech. rep., Institute of Advanced Study, Princeton (1979)
- Ishiguro, K., Sato, I., Nakano, M., Kimura, A., Ueda, N.: Infinite plaid models for infinite bi-clustering. In: AAAI Conference on Artificial Intelligence
- Kallenberg, O.: On the representation theorem for exchangeable arrays. Journal of Multivariate Analysis 30(1), 137–154 (1989)
- Kallenberg, O.: Symmetries on random arrays and set-indexed processes. Journal of Theoretical Probability 5(4), 727–765 (1992)
- Kemp, C., Tenenbaum, J.B., Griffiths, T.L., Yamada, T., Ueda, N.: Learning systems of concepts with an infinite relational model. In: AAAI Conference on Artificial Intelligence. pp. 381–388 (2006)
- Lakshminarayanan, B., Roy, D., Teh, Y.W.: Mondrian forests: Efficient online random forests. In: Advances in Neural Information Processing Systems (2014)
- Leskovec, J., Huttenlocher, D., Kleinberg, J.: Predicting positive and negative links in online social networks. In: Proceedings of the 19th International Conference on World Wide Web. pp. 641–650 (2010)
- Lin, D., Fisher, J.: Efficient sampling from combinatorial space via bridging. In: International Conference on Artificial Intelligence and Statistics (2012)
- Lloyd, J., Orbanz, P., Ghahramani, Z., Roy, D.M.: Random function priors for exchangeable arrays with applications to graphs and relational data. In: Advances in Neural Information Processing Systems (2012)
- Lovász, L.: Very large graphs. Current Developments in Mathematics 11, 67–128 (2009)
- Mackisack, M.S., Miles, R.E.: Homogeneous rectangular tessellation. Advances in Applied Probability 28, 993 (1996)
- Miller, K., Jordan, M.I., Griffiths, T.L.: Nonparametric latent feature models for link prediction. In: Advances in Neural Information Processing Systems, pp. 1276–1284 (2009)
- Muthukrishnan, S., Poosala, V., Suel, T.: On rectangular partitionings in two dimensions: algorithms, complexity, and applications. In: The International Conference on Database Theory (1999)
- Nakano, M., Ishiguro, K., Kimura, A., Yamada, T., Ueda, N.: Rectangular tiling process. In: Proceedings of the 31st International Conference on Machine Learning. Proceedings of Machine Learning Research, vol. 32, pp. 361–369 (2014)
- Orbanz, P.: Infinite-dimensional exponential families in the cluster analysis of structured data. Ph.D. thesis, Eidgenössische Technische Hochschule Zürich (2008)
- Orbanz, P.: Construction of nonparametric Bayesian models from parametric Bayes equations. In: Advances in Neural Information Processing Systems (2009)
- Orbanz, P.: Conjugate projective limits. arXiv:1012.0363 (2011)
- Orbanz, P., Roy, D.M.: Bayesian models of graphs, arrays and other exchangeable random structures. IEEE Transactions on Pattern Analysis and Machine Intelligence 37, 437–461 (2013)
- Rodriguez, A., Ghosh, K.: Nested partition models. Tech. rep., Jack Baskin School of Engineering (2009)
- Roy, D.M.: Computability, inference and modeling in probabilistic programming. Ph.D. thesis, Massachusetts Institute of Technology (2011)
- Roy, D.M., Teh, Y.W.: The Mondrian process. In: Advances in Neural Information Processing Systems
- Sakanushi, K., Kajitani, Y., Mehta, D.P.: The quarter-state-sequence floorplan representation. IEEE Trans. on Circuits and Systems I: Fundamental Theory and Applications 50, 376–386 (2003)
- Sethuraman, J.: A constructive definition of Dirichlet priors. Statistica Sinica 4, 639–650 (1994)
- Shan, H., Banerjee, A.: Bayesian co-clustering. In: IEEE International Conference on Data Mining. pp.
- Wang, P., Laskey, K.B., Domeniconi, C., Jordan, M.I.: Nonparametric Bayesian co-clustering ensembles. In: SIAM International Conference on Data Mining. pp. 331–342 (2011)
- Zafarani, R., Liu, H.: Social computing data repository at ASU (2009)
- Zhang, X., Kajitani, Y.: Space-planning: placement of modules with controlled empty area by single-sequence. In: Proceedings of Asia and South Pacific Design Automation Conference (2004)
