# Online MAP Inference of Determinantal Point Processes

NeurIPS 2020

Abstract
In this paper, we provide an efficient approximation algorithm for finding the most likely configuration (MAP) of size k for Determinantal Point Processes (DPPs) in the online setting, where the data points arrive in an arbitrary order and the algorithm cannot discard the selected elements from its local memory. Given a tolerance additi…

Introduction

- Probabilistic modeling of data, along with complex inference techniques, has become an important ingredient of the modern machine learning toolbox.
- Determinantal point processes (DPPs) are elegant probabilistic models of repulsion that admit efficient sampling, marginalization, and conditioning [Kulesza and Taskar, 2012a,b].
- They were first introduced by Macchi [1975] in quantum physics to model negative interactions among particles.

Highlights

- Probabilistic modeling of data, along with complex inference techniques, has become an important ingredient of the modern machine learning toolbox
- Determinantal point processes (DPPs) have found numerous applications in machine learning that rely on diverse subset selection, such as different forms of data summarization [Mirzasoleiman et al., 2013, Kulesza and Taskar, 2012b, Feldman et al., 2018, Gong et al., 2014], multi-label classification [Xie et al., 2017], and recommender systems [Lee et al., 2017, Qin and Zhu, 2013], to name a few
- The essential characteristic of a DPP is that the inclusion of an item makes the inclusion of similar items less likely
- A DPP assigns the probability of sampling a set of vectors indexed by $S \subseteq V$ as follows: $\Pr(S) \propto \det(V_S V_S^\top) = \mathrm{vol}^2(S)$ (1)
- We present our algorithmic results for online DPP
- We developed online algorithms for finding the most likely configuration of size k for DPPs
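The DPP probability in (1) is, up to normalization, the squared volume of the parallelepiped spanned by the selected rows. As a minimal numerical sketch (assuming numpy; `sq_volume` is a hypothetical helper name, not from the paper):

```python
import numpy as np

def sq_volume(V, S):
    """Squared volume of the rows of V indexed by S.

    Up to normalization this is the DPP probability in (1):
    Pr(S) ∝ det(V_S V_S^T) = vol^2(S).
    """
    VS = V[list(S)]                  # |S| x d sub-matrix of selected rows
    return np.linalg.det(VS @ VS.T)  # Gram determinant = squared volume

# Toy example: orthogonal rows maximize the volume; nearly parallel
# (i.e. similar) rows shrink it, which is the repulsion effect.
V = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.9, 0.1]])
print(sq_volume(V, {0, 1}))  # orthogonal pair: large squared volume
print(sq_volume(V, {0, 2}))  # nearly parallel pair: small squared volume
```

This makes concrete why including an item lowers the probability of including similar items: similar rows span little extra volume.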

Methods

- The authors compare the experimental performance of Algorithm 1 (ONLINE-LS) with:
  - ONLINEGREEDY, an online greedy algorithm that, upon processing a new row, adds it to the solution if swapping it with some row already in the solution increases the volume;
  - GREEDY, the classic offline greedy algorithm that makes k passes over the entire dataset and in each pass adds to the current solution the row that increases the volume the most.

- All experiments were carried out on a standard desktop computer and are fully deterministic.
- For ONLINE-LS and ONLINEGREEDY, the authors report the number of volume computations as a system-independent proxy for running time, and the number of swaps performed during execution as a measure of the algorithm's consistency.
- Both ONLINE-LS and ONLINEGREEDY recover solutions of quality comparable to that of the offline algorithm
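The swap-based online baseline described above can be sketched as follows (a minimal illustration assuming numpy; the function names and the exact tie-breaking are illustrative, not the paper's implementation):

```python
import numpy as np

def sq_volume(rows):
    """Squared volume (Gram determinant) of a list of row vectors."""
    M = np.array(rows)
    return np.linalg.det(M @ M.T)

def online_greedy(stream, k):
    """Sketch of the ONLINEGREEDY baseline: keep at most k rows; while
    fewer than k are held, accept every row; afterwards, for each arriving
    row perform the single swap that most increases the squared volume,
    if any swap is an improvement."""
    sol = []
    for row in stream:
        if len(sol) < k:
            sol.append(row)
            continue
        best_idx, best_vol = None, sq_volume(sol)
        for i in range(k):
            candidate = sol[:i] + sol[i + 1:] + [row]
            v = sq_volume(candidate)
            if v > best_vol:
                best_idx, best_vol = i, v
        if best_idx is not None:
            sol[best_idx] = row  # commit the best improving swap
    return sol

# Usage: two nearly parallel rows arrive first; the later orthogonal row
# should swap one of them out.
stream = [[1.0, 0.0], [0.99, 0.01], [0.0, 1.0]]
print(online_greedy(stream, k=2))
```

Counting the calls to `sq_volume` here corresponds to the "number of volume computations" metric the authors report, and counting committed swaps corresponds to their consistency metric.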

Conclusion

- The authors developed online algorithms for finding the most likely configuration of size k for DPPs.
- The authors' main contribution, ONLINE-DPP, achieves a $k^{O(k)}$ multiplicative approximation guarantee with additive error $\eta$, using memory independent of the size of the data stream

Related work

- The problem of finding the MAP configuration of a DPP has been studied extensively, and has also been referred to by other names such as sub-determinant maximization and D-optimal experiment design. Much of the work can be divided into a few strands, which we discuss briefly.

Submodular Maximization. It is known that the set function $f(S) = \log \det(V_S V_S^\top)$ is a submodular function. Therefore, a series of previous works applied submodular maximization algorithms such as greedy addition [Chen et al., 2018, Badanidiyuru et al., 2014, Kulesza and Taskar, 2012b], soft-max relaxation [Gillenwater et al., 2012], and multi-linear extension [Hassani et al., 2019] in order to find the MAP configuration. Interesting results are also known for maximizing determinantal functions with approximate submodularity [Bian et al., 2017] and maximizing submodular functions whose value can be negative [Harshaw et al., 2019]. However, there are two drawbacks to such methods. First, even though $f(S)$ is submodular, it might be negative, i.e., when $\det(V_S V_S^\top) < 1$. Almost all submodular maximization algorithms assume non-negativity of the objective function in order to provide a constant-factor approximation guarantee [Buchbinder and Feldman, 2017]. Second, any $\alpha$-approximation guarantee for a non-negative and non-monotone submodular function $f$ may only provide an $\mathrm{OPT}^{1-\alpha}$ approximation guarantee for problem (2), where OPT is the volume of the optimum solution. Such approximation guarantees may be much worse than multiplicative guarantees.
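The two properties of $f(S) = \log \det(V_S V_S^\top)$ invoked above, possible negativity and diminishing returns, can be checked numerically on a toy instance (a sketch assuming numpy; the helper name `f` and the example matrix are illustrative):

```python
import numpy as np

def f(V, S):
    """f(S) = log det(V_S V_S^T): submodular, but not necessarily non-negative."""
    if not S:
        return 0.0  # det of the empty Gram matrix is 1, so log det = 0
    VS = V[sorted(S)]
    sign, logdet = np.linalg.slogdet(VS @ VS.T)
    return logdet if sign > 0 else -np.inf

# Short vectors give det(V_S V_S^T) < 1, so f(S) < 0.
V = np.array([[0.5, 0.0, 0.0],
              [0.0, 0.5, 0.0],
              [0.4, 0.3, 0.1]])
print(f(V, {0, 1}))  # negative, since det = 0.0625 < 1

# Diminishing returns (submodularity): adding row 2 helps the smaller
# set {0} at least as much as it helps the larger set {0, 1}.
gain_small = f(V, {0, 2}) - f(V, {0})
gain_big = f(V, {0, 1, 2}) - f(V, {0, 1})
assert gain_small >= gain_big
```

The negative value is exactly the failure mode noted above: constant-factor guarantees for submodular maximization assume $f \ge 0$, which breaks as soon as the selected rows span a volume below 1.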

Funding

- Acknowledgments and Disclosure of Funding: Aditya Bhaskara is partially supported by NSF (CCF-2008688) and by a Google Faculty Research Award. Amin Karbasi is partially supported by NSF (IIS-1845032), ONR (N00014-19-1-2406), AFOSR (FA9550-18-1-0160), and TATA Sons Private Limited.

Study subjects and analysis

Standard datasets: 3

All experiments were carried out on a standard desktop computer, and all experiments presented in this section are fully deterministic. Three standard datasets are considered: the Spambase dataset [Dua and Graff, 2017], the Statlog (Shuttle) dataset [Dua and Graff, 2017], and the Pen-Based Recognition dataset [Dua and Graff, 2017]. All datasets contain only integer and real values: Spambase has 4601 instances of 57 dimensions, Statlog has 58000 instances of 9 dimensions, and Pen-Based Recognition has 10992 instances of 16 dimensions.

References

- Pankaj K. Agarwal, Sariel Har-Peled, and Kasturi R. Varadarajan. Geometric approximation via coresets. In Combinatorial and Computational Geometry, MSRI, pages 1–30. University Press, 2005.
- Ahmed Alaoui and Michael W Mahoney. Fast randomized kernel ridge regression with statistical guarantees. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 775–783. Curran Associates, Inc., 2015.
- Nima Anari, Shayan Oveis Gharan, and Alireza Rezaei. Monte carlo markov chain algorithms for sampling strongly rayleigh distributions and determinantal point processes. In Conference on Learning Theory, pages 103–115, 2016.
- Ashwinkumar Badanidiyuru, Baharan Mirzasoleiman, Amin Karbasi, and Andreas Krause. Streaming submodular maximization: Massive data summarization on the fly. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 671–680, 2014.
- Aditya Bhaskara, Silvio Lattanzi, Sergei Vassilvitskii, and Morteza Zadimoghaddam. Residual based sampling for online low rank approximation. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 1596–1614. IEEE, 2019.
- Andrew An Bian, Joachim M Buhmann, Andreas Krause, and Sebastian Tschiatschek. Guarantees for greedy maximization of non-submodular functions with applications. arXiv preprint arXiv:1703.02100, 2017.
- Niv Buchbinder and Moran Feldman. Submodular functions maximization problems. Handbook of Approximation Algorithms and Metaheuristics, 1:753–788, 2017.
- L Elisa Celis, Amit Deshpande, Tarun Kathuria, Damian Straszak, and Nisheeth K Vishnoi. On the complexity of constrained determinantal point processes. arXiv preprint arXiv:1608.00554, 2016.
- Laming Chen, Guoxin Zhang, and Eric Zhou. Fast greedy map inference for determinantal point process to improve recommendation diversity. In Advances in Neural Information Processing Systems, 2018.
- Michael B. Cohen, Cameron Musco, and Christopher Musco. Input sparsity time low-rank approximation via ridge leverage score sampling. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’17, page 1758–1777, USA, 2017. Society for Industrial and Applied Mathematics.
- Vincent Cohen-Addad, Niklas Oskar D Hjuler, Nikos Parotsidis, David Saulpic, and Chris Schwiegelshohn. Fully dynamic consistent facility location. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 3255–3265. Curran Associates, Inc., 2019.
- Michal Derezinski, Daniele Calandriello, and Michal Valko. Exact sampling of determinantal point processes with sublinear time preprocessing. In Advances in Neural Information Processing Systems, pages 11542–11554, 2019.
- Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml.
- J. B. Ebrahimi, D. Straszak, and N. K. Vishnoi. Subdeterminant maximization via nonconvex relaxations and anti-concentration. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 1020–1031, Oct 2017. doi: 10.1109/FOCS.2017.98.
- Moran Feldman, Amin Karbasi, and Ehsan Kazemi. Do less, get more: streaming submodular maximization with subsampling. In Advances in Neural Information Processing Systems, pages 732–742, 2018.
- Jennifer Gillenwater, Alex Kulesza, and Ben Taskar. Near-optimal map inference for determinantal point processes. In Advances in Neural Information Processing Systems, pages 2735–2743, 2012.
- Boqing Gong, Wei-Lun Chao, Kristen Grauman, and Fei Sha. Diverse sequential subset selection for supervised video summarization. In Advances in neural information processing systems, pages 2069–2077, 2014.
- Alkis Gotovos, Hamed Hassani, and Andreas Krause. Sampling from probabilistic submodular models. In Advances in Neural Information Processing Systems, pages 1945–1953, 2015.
- Christopher Harshaw, Moran Feldman, Justin Ward, and Amin Karbasi. Submodular maximization beyond non-negativity: Guarantees, fast algorithms, and applications. arXiv preprint arXiv:1904.09354, 2019.
- Hamed Hassani, Amin Karbasi, Aryan Mokhtari, and Zebang Shen. Stochastic conditional gradient++. arXiv preprint arXiv:1902.06992, 2019.
- Piotr Indyk, Sepideh Mahabadi, Mohammad Mahdian, and Vahab S Mirrokni. Composable core-sets for diversity and coverage maximization. In Proceedings of the 33rd ACM SIGMOD-SIGACTSIGART symposium on Principles of database systems, pages 100–108, 2014.
- Piotr Indyk, Sepideh Mahabadi, Shayan Oveis Gharan, and Alireza Rezaei. Composable core-sets for determinant maximization problems via spectral spanners. arXiv preprint arXiv:1807.11648, 2018.
- Mohammad Reza Karimi Jaghargh, Andreas Krause, Silvio Lattanzi, and Sergei Vassilvtiskii. Consistent online optimization: Convex and submodular. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2241–2250, 2019.
- Alex Kulesza and Ben Taskar. Determinantal point processes for machine learning. arXiv preprint arXiv:1207.6083, 2012a.
- Alex Kulesza and Ben Taskar. Learning determinantal point processes. arXiv preprint arXiv:1202.3738, 2012b.
- Silvio Lattanzi and Sergei Vassilvitskii. Consistent k-clustering. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 1975–1984. JMLR. org, 2017.
- Sang-Chul Lee, Sang-Wook Kim, Sunju Park, and Dong-Kyu Chae. A single-step approach to recommendation diversification. In Proceedings of the 26th International Conference on World Wide Web Companion, pages 809–810, 2017.
- Chengtao Li, Stefanie Jegelka, and Suvrit Sra. Efficient sampling for k-determinantal point processes. arXiv preprint arXiv:1509.01618, 2015.
- Odile Macchi. The coincidence approach to stochastic point processes. Advances in Applied Probability, 1975.
- Sepideh Mahabadi, Piotr Indyk, Shayan Oveis Gharan, and Alireza Rezaei. Composable core-sets for determinant maximization: A simple near-optimal algorithm. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 4254–4263, Long Beach, California, USA, 09–15 Jun 2019. PMLR. URL http://proceedings.mlr.press/v97/mahabadi19a.html.
- Sepideh Mahabadi, Ilya Razenshteyn, David P Woodruff, and Samson Zhou. Non-adaptive adaptive sampling on turnstile streams. arXiv preprint arXiv:2004.10969, 2020.
- Zelda E Mariet, Suvrit Sra, and Stefanie Jegelka. Exponentiated strongly rayleigh distributions. In Advances in Neural Information Processing Systems, pages 4459–4469, 2018.
- Vahab Mirrokni and Morteza Zadimoghaddam. Randomized composable core-sets for distributed submodular maximization. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pages 153–162, 2015.
- Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, and Andreas Krause. Distributed submodular maximization: Identifying representative elements in massive data. In Advances in Neural Information Processing Systems, pages 2049–2057, 2013.
- Aleksandar Nikolov. Randomized rounding for the largest simplex problem. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, page 861–870, New York, NY, USA, 2015. Association for Computing Machinery. ISBN 9781450335362. doi: 10.1145/2746539.2746628. URL https://doi.org/10.1145/2746539.2746628.
- Aleksandar Nikolov and Mohit Singh. Maximizing determinants under partition constraints. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’16, page 192–201, New York, NY, USA, 2016. Association for Computing Machinery. ISBN 9781450341325. doi: 10.1145/2897518.2897649. URL https://doi.org/10.1145/2897518.2897649.
- Lijing Qin and Xiaoyan Zhu. Promoting diversity in recommendation by entropy regularizer. In Twenty-Third International Joint Conference on Artificial Intelligence, 2013.
- Patrick Rebeschini and Amin Karbasi. Fast mixing for discrete point processes. In Conference on Learning Theory, pages 1480–1500, 2015.
- Marco Di Summa, Friedrich Eisenbrand, Yuri Faenza, and Carsten Moldenhauer. On largest volume simplices and sub-determinants. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’15, page 315–323, USA, 2015. Society for Industrial and Applied Mathematics.
- Pengtao Xie, Ruslan Salakhutdinov, Luntian Mou, and Eric P Xing. Deep determinantal point process for large-scale multi-label classification. In Proceedings of the IEEE International Conference on Computer Vision, pages 473–482, 2017.
