Online MAP Inference of Determinantal Point Processes

NIPS 2020 (2020)


Abstract

In this paper, we provide an efficient approximation algorithm for finding the most likely configuration (MAP) of size k for Determinantal Point Processes (DPPs) in the online setting, where the data points arrive in an arbitrary order and the algorithm cannot discard the selected elements from its local memory. Given a tolerance additi...

Introduction
  • Probabilistic modeling of data, along with complex inference techniques, has become an important ingredient of the modern machine learning toolbox.
  • Determinantal point processes (DPPs) are elegant probabilistic models of repulsion that admit efficient sampling, marginalization, and conditioning [Kulesza and Taskar, 2012a,b].
  • They were first introduced by Macchi [1975] in quantum physics to model negative interactions among particles.
Highlights
  • Probabilistic modeling of data, along with complex inference techniques, has become an important ingredient of the modern machine learning toolbox
  • Determinantal point processes (DPPs) have found numerous applications in machine learning that rely on diverse subset selection, such as different forms of data summarization [Mirzasoleiman et al, 2013, Kulesza and Taskar, 2012b, Feldman et al, 2018, Gong et al, 2014], multi-label classification [Xie et al, 2017], and recommender systems [Lee et al, 2017, Qin and Zhu, 2013], to name a few
  • The essential characteristic of a DPP is that the inclusion of an item makes the inclusion of similar items less likely
  • A DPP assigns the probability of sampling a set of vectors indexed by S ⊆ V as follows: Pr(S) ∝ det(V_S V_S^T) = vol^2(S)  (1)
  • We present our algorithmic results for online DPP
  • We developed online algorithms for finding the most likely configuration of size k for DPPs
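Equation (1) above says the probability weight of a set S is the squared volume of the parallelepiped spanned by the rows of V indexed by S, i.e. det(V_S V_S^T). A minimal numerical sketch (the helper name `squared_volume` is mine):

```python
import numpy as np

def squared_volume(V, S):
    """det(V_S V_S^T): squared volume of the parallelepiped
    spanned by the rows of V indexed by S, as in Eq. (1)."""
    VS = V[list(S)]                      # select the rows in S
    return float(np.linalg.det(VS @ VS.T))

V = np.array([[2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [1.0, 1.0, 0.0]])

# Orthogonal rows: squared volume is the product of squared lengths, (2*3)^2 = 36.
print(squared_volume(V, [0, 1]))
# A linearly dependent selection collapses the volume to ~0,
# so a DPP essentially never samples such a set.
print(squared_volume(V, [0, 1, 2]))
```

This makes the repulsion property concrete: adding a row nearly parallel to one already selected spans almost no extra volume, so such sets get vanishing probability.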
Methods
  • The authors compare the experimental performance of Algorithm 1 (ONLINE-LS) with: – ONLINEGREEDY, an online greedy algorithm that, upon processing a new row, adds it to the solution if swapping it with some row already in the solution increases the volume; – GREEDY, the classic offline greedy algorithm that makes k passes over the entire dataset and, in every pass, adds to the current solution the row that increases the volume the most.

    All the experiments were carried out on a standard desktop computer, and all the experiments presented are fully deterministic.
  • For ONLINE-LS and ONLINEGREEDY, the authors report the number of volume computations as a system-independent proxy for the running time, and the number of swaps in the solution during the execution of the algorithm to capture the consistency of the algorithm.
  • Both ONLINE-LS and ONLINEGREEDY recover a solution whose quality is comparable to that of the offline algorithm
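The ONLINEGREEDY baseline described above can be sketched as follows. This is an illustrative reconstruction from the description (function names are mine), not the authors' implementation:

```python
import numpy as np

def vol2(rows):
    """Squared volume spanned by a list of row vectors."""
    M = np.array(rows)
    return float(np.linalg.det(M @ M.T))

def online_greedy(stream, k):
    """Online greedy with swaps: keep at most k rows; when a new row
    arrives and the solution is full, perform the single swap that
    most increases the squared volume (if any swap increases it)."""
    S = []
    for v in stream:
        if len(S) < k:
            S.append(v)
            continue
        best_i, best_gain = None, vol2(S)
        for i in range(k):
            gain = vol2(S[:i] + S[i + 1:] + [v])
            if gain > best_gain:
                best_i, best_gain = i, gain
        if best_i is not None:
            S[best_i] = v
    return S

# The near-duplicate second row is swapped out for the orthogonal one.
print(online_greedy([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]], k=2))
```

Counting the calls to vol2 and the number of swaps performed would yield the two system-independent proxies the authors report in their experiments.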
Conclusion
  • The authors developed online algorithms for finding the most likely configuration of size k for DPPs.
  • The authors' main contribution —ONLINE-DPP— achieves a k^{O(k)} multiplicative approximation guarantee with an additive error η, using memory independent of the size of the data stream
Related work
  • The problem of finding the MAP configuration of a DPP has been studied extensively, and has also been referred to by other names such as sub-determinant maximization and D-optimal experiment design. Much of the work can be divided into a few strands, which we discuss briefly.

    Submodular Maximization. It is known that the set function f(S) = log det(V_S V_S^T) is submodular. Therefore, a series of previous works applied submodular maximization algorithms such as greedy addition [Chen et al, 2018, Badanidiyuru et al, 2014, Kulesza and Taskar, 2012b], soft-max relaxation [Gillenwater et al, 2012], and the multi-linear extension [Hassani et al, 2019] in order to find the MAP configuration. Interesting results are also known for maximizing determinantal functions with approximate submodularity [Bian et al, 2017] and for maximizing submodular functions whose value can be negative [Harshaw et al, 2019]. However, there are two drawbacks to such methods. First, even though f(S) is submodular, it might be negative, i.e., when det(V_S V_S^T) < 1. Almost all submodular maximization algorithms assume non-negativity of the objective function in order to provide a constant-factor approximation guarantee [Buchbinder and Feldman, 2017]. Second, any α-approximation guarantee for a non-negative and non-monotone submodular function f may only provide an OPT^(1−α) approximation guarantee for problem (2), where OPT is the volume of the optimum solution. Such approximation guarantees may be much worse than multiplicative guarantees.
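The first drawback noted above, that f(S) = log det(V_S V_S^T) can be negative whenever the selected rows span a volume smaller than one, is easy to verify numerically. A small sketch (the helper name `log_det` is mine):

```python
import numpy as np

def log_det(V, S):
    """f(S) = log det(V_S V_S^T): submodular, but not necessarily
    non-negative; it is negative whenever det(V_S V_S^T) < 1."""
    VS = V[list(S)]
    sign, logdet = np.linalg.slogdet(VS @ VS.T)
    return logdet   # sign is +1 when the selected rows are independent

# Two orthogonal rows of length 0.1: det = (0.1^2)^2 = 1e-4, so f < 0.
V = 0.1 * np.eye(2)
print(log_det(V, [0, 1]))   # log(1e-4), about -9.21
```

This is the reason constant-factor guarantees for non-negative submodular maximization do not transfer directly to the volume-maximization problem.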
Funding
  • Aditya Bhaskara is partially supported by NSF (CCF-2008688) and by a Google Faculty Research Award. Amin Karbasi is partially supported by NSF (IIS-1845032), ONR (N00014-19-1-2406), AFOSR (FA9550-18-1-0160), and TATA Sons Private Limited.
Study subjects and analysis
standard datasets: 3
In this section we compare the experimental performance of our Algorithm 1 (ONLINE-LS) with: – ONLINEGREEDY, an online greedy algorithm that, upon processing a new row, adds it to the solution if swapping it with some row already in the solution increases the volume; – GREEDY, the classic offline greedy algorithm that makes k passes over the entire dataset and, in every pass, adds to the current solution the row that increases the volume the most.

All our experiments have been carried out on a standard desktop computer, and all the experiments presented in this section are fully deterministic. In our experiments we consider three standard datasets: the Spambase dataset [Dua and Graff, 2017], the Statlog (or Shuttle) dataset [Dua and Graff, 2017], and the Pen-Based Recognition dataset [Dua and Graff, 2017]. All the datasets contain only integer and real values: the Spambase dataset contains 4601 instances of 57 dimensions, the Statlog dataset contains 58000 instances of 9 dimensions, and the Pen-Based Recognition dataset contains 10992 instances of 16 dimensions.
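The number-of-volume-computations proxy used in these experiments can be obtained by routing all volume evaluations through a counting wrapper. A sketch (the class name `VolumeOracle` is mine, not the authors' harness):

```python
import numpy as np

class VolumeOracle:
    """Counts volume evaluations: a system-independent proxy for
    running time when comparing streaming algorithms."""
    def __init__(self):
        self.calls = 0

    def vol2(self, rows):
        self.calls += 1
        M = np.array(rows)
        return float(np.linalg.det(M @ M.T))

oracle = VolumeOracle()
oracle.vol2([[1.0, 0.0], [0.0, 2.0]])   # squared volume 4.0
oracle.vol2([[1.0, 1.0]])               # squared volume 2.0
print(oracle.calls)                     # 2
```

Being deterministic and hardware-independent, such a counter makes runs on different machines directly comparable, consistent with the fully deterministic setup described above.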


Reference
  • Pankaj K. Agarwal, Sariel Har-Peled, and Kasturi R. Varadarajan. Geometric approximation via coresets. In Combinatorial and Computational Geometry, MSRI, pages 1–30. University Press, 2005.
  • Ahmed Alaoui and Michael W Mahoney. Fast randomized kernel ridge regression with statistical guarantees. In C. Cortes, N. D. Lawrence, D. D. Lee, M. Sugiyama, and R. Garnett, editors, Advances in Neural Information Processing Systems 28, pages 775–783. Curran Associates, Inc., 2015.
  • Nima Anari, Shayan Oveis Gharan, and Alireza Rezaei. Monte carlo markov chain algorithms for sampling strongly rayleigh distributions and determinantal point processes. In Conference on Learning Theory, pages 103–115, 2016.
  • Ashwinkumar Badanidiyuru, Baharan Mirzasoleiman, Amin Karbasi, and Andreas Krause. Streaming submodular maximization: Massive data summarization on the fly. In Proceedings of the 20th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 671–680, 2014.
  • Aditya Bhaskara, Silvio Lattanzi, Sergei Vassilvitskii, and Morteza Zadimoghaddam. Residual based sampling for online low rank approximation. In 2019 IEEE 60th Annual Symposium on Foundations of Computer Science (FOCS), pages 1596–1614. IEEE, 2019.
  • Andrew An Bian, Joachim M Buhmann, Andreas Krause, and Sebastian Tschiatschek. Guarantees for greedy maximization of non-submodular functions with applications. arXiv preprint arXiv:1703.02100, 2017.
  • Niv Buchbinder and Moran Feldman. Submodular functions maximization problems. Handbook of Approximation Algorithms and Metaheuristics, 1:753–788, 2017.
  • L Elisa Celis, Amit Deshpande, Tarun Kathuria, Damian Straszak, and Nisheeth K Vishnoi. On the complexity of constrained determinantal point processes. arXiv preprint arXiv:1608.00554, 2016.
  • Laming Chen, Guoxin Zhang, and Eric Zhou. Fast greedy map inference for determinantal point process to improve recommendation diversity. In Advances in Neural Information Processing Systems, 2018.
  • Michael B. Cohen, Cameron Musco, and Christopher Musco. Input sparsity time low-rank approximation via ridge leverage score sampling. In Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’17, page 1758–1777, USA, 2017. Society for Industrial and Applied Mathematics.
  • Vincent Cohen-Addad, Niklas Oskar D Hjuler, Nikos Parotsidis, David Saulpic, and Chris Schwiegelshohn. Fully dynamic consistent facility location. In H. Wallach, H. Larochelle, A. Beygelzimer, F. d Alché-Buc, E. Fox, and R. Garnett, editors, Advances in Neural Information Processing Systems 32, pages 3255–3265. Curran Associates, Inc., 2019.
  • Michal Derezinski, Daniele Calandriello, and Michal Valko. Exact sampling of determinantal point processes with sublinear time preprocessing. In Advances in Neural Information Processing Systems, pages 11542–11554, 2019.
  • Dheeru Dua and Casey Graff. UCI machine learning repository, 2017. URL http://archive.ics.uci.edu/ml.
  • J. B. Ebrahimi, D. Straszak, and N. K. Vishnoi. Subdeterminant maximization via nonconvex relaxations and anti-concentration. In 2017 IEEE 58th Annual Symposium on Foundations of Computer Science (FOCS), pages 1020–1031, Oct 2017. doi: 10.1109/FOCS.2017.98.
  • Moran Feldman, Amin Karbasi, and Ehsan Kazemi. Do less, get more: streaming submodular maximization with subsampling. In Advances in Neural Information Processing Systems, pages 732–742, 2018.
  • Jennifer Gillenwater, Alex Kulesza, and Ben Taskar. Near-optimal map inference for determinantal point processes. In Advances in Neural Information Processing Systems, pages 2735–2743, 2012.
  • Boqing Gong, Wei-Lun Chao, Kristen Grauman, and Fei Sha. Diverse sequential subset selection for supervised video summarization. In Advances in neural information processing systems, pages 2069–2077, 2014.
  • Alkis Gotovos, Hamed Hassani, and Andreas Krause. Sampling from probabilistic submodular models. In Advances in Neural Information Processing Systems, pages 1945–1953, 2015.
  • Christopher Harshaw, Moran Feldman, Justin Ward, and Amin Karbasi. Submodular maximization beyond non-negativity: Guarantees, fast algorithms, and applications. arXiv preprint arXiv:1904.09354, 2019.
  • Hamed Hassani, Amin Karbasi, Aryan Mokhtari, and Zebang Shen. Stochastic conditional gradient++. arXiv preprint arXiv:1902.06992, 2019.
  • Piotr Indyk, Sepideh Mahabadi, Mohammad Mahdian, and Vahab S Mirrokni. Composable core-sets for diversity and coverage maximization. In Proceedings of the 33rd ACM SIGMOD-SIGACTSIGART symposium on Principles of database systems, pages 100–108, 2014.
  • Piotr Indyk, Sepideh Mahabadi, Shayan Oveis Gharan, and Alireza Rezaei. Composable core-sets for determinant maximization problems via spectral spanners. arXiv preprint arXiv:1807.11648, 2018.
  • Mohammad Reza Karimi Jaghargh, Andreas Krause, Silvio Lattanzi, and Sergei Vassilvtiskii. Consistent online optimization: Convex and submodular. In The 22nd International Conference on Artificial Intelligence and Statistics, pages 2241–2250, 2019.
  • Alex Kulesza and Ben Taskar. Determinantal point processes for machine learning. arXiv preprint arXiv:1207.6083, 2012a.
  • Alex Kulesza and Ben Taskar. Learning determinantal point processes. arXiv preprint arXiv:1202.3738, 2012b.
  • Silvio Lattanzi and Sergei Vassilvitskii. Consistent k-clustering. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 1975–1984. JMLR. org, 2017.
  • Sang-Chul Lee, Sang-Wook Kim, Sunju Park, and Dong-Kyu Chae. A single-step approach to recommendation diversification. In Proceedings of the 26th International Conference on World Wide Web Companion, pages 809–810, 2017.
  • Chengtao Li, Stefanie Jegelka, and Suvrit Sra. Efficient sampling for k-determinantal point processes. arXiv preprint arXiv:1509.01618, 2015.
  • Odile Macchi. The coincidence approach to stochastic point processes. Advances in Applied Probability, 1975.
  • Sepideh Mahabadi, Piotr Indyk, Shayan Oveis Gharan, and Alireza Rezaei. Composable core-sets for determinant maximization: A simple near-optimal algorithm. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 4254–4263, Long Beach, California, USA, 09–15 Jun 2019. PMLR. URL http://proceedings.mlr.press/v97/mahabadi19a.html.
  • Sepideh Mahabadi, Ilya Razenshteyn, David P Woodruff, and Samson Zhou. Non-adaptive adaptive sampling on turnstile streams. arXiv preprint arXiv:2004.10969, 2020.
  • Zelda E Mariet, Suvrit Sra, and Stefanie Jegelka. Exponentiated strongly rayleigh distributions. In Advances in Neural Information Processing Systems, pages 4459–4469, 2018.
  • Vahab Mirrokni and Morteza Zadimoghaddam. Randomized composable core-sets for distributed submodular maximization. In Proceedings of the forty-seventh annual ACM symposium on Theory of computing, pages 153–162, 2015.
  • Baharan Mirzasoleiman, Amin Karbasi, Rik Sarkar, and Andreas Krause. Distributed submodular maximization: Identifying representative elements in massive data. In Advances in Neural Information Processing Systems, pages 2049–2057, 2013.
  • Aleksandar Nikolov. Randomized rounding for the largest simplex problem. In Proceedings of the Forty-Seventh Annual ACM Symposium on Theory of Computing, STOC ’15, page 861–870, New York, NY, USA, 2015. Association for Computing Machinery. ISBN 9781450335362. doi: 10.1145/2746539.2746628. URL https://doi.org/10.1145/2746539.2746628.
  • Aleksandar Nikolov and Mohit Singh. Maximizing determinants under partition constraints. In Proceedings of the Forty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’16, page 192–201, New York, NY, USA, 2016. Association for Computing Machinery. ISBN 9781450341325. doi: 10.1145/2897518.2897649. URL https://doi.org/10.1145/2897518.2897649.
  • Lijing Qin and Xiaoyan Zhu. Promoting diversity in recommendation by entropy regularizer. In Twenty-Third International Joint Conference on Artificial Intelligence, 2013.
  • Patrick Rebeschini and Amin Karbasi. Fast mixing for discrete point processes. In Conference on Learning Theory, pages 1480–1500, 2015.
  • Marco Di Summa, Friedrich Eisenbrand, Yuri Faenza, and Carsten Moldenhauer. On largest volume simplices and sub-determinants. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’15, page 315–323, USA, 2015. Society for Industrial and Applied Mathematics.
  • Pengtao Xie, Ruslan Salakhutdinov, Luntian Mou, and Eric P Xing. Deep determinantal point process for large-scale multi-label classification. In Proceedings of the IEEE International Conference on Computer Vision, pages 473–482, 2017.