A Single Recipe for Online Submodular Maximization with Adversarial or Stochastic Constraints

NeurIPS 2020 (2020)


Abstract

In this paper, we consider an online optimization problem in which the reward functions are DR-submodular and, in addition to maximizing the total reward, the sequence of decisions must satisfy some convex constraints on average. Specifically, at each round t ∈ {1, …, T}, upon committing to an action x_t, a DR-submodular utility function …
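For concreteness, the two performance measures used in this line of work can be written as follows; this is a sketch in our own notation (X is the feasible set, [·]_+ the entrywise positive part), with the (1 − 1/e) factor being the best polynomial-time approximation ratio for monotone DR-submodular maximization:

```latex
% (1 - 1/e)-regret against the best fixed action in hindsight:
\mathcal{R}_T = \Bigl(1 - \tfrac{1}{e}\Bigr) \max_{x \in \mathcal{X}} \sum_{t=1}^{T} f_t(x) - \sum_{t=1}^{T} f_t(x_t)

% Cumulative violation of the long-term constraints (sought to be sub-linear in T):
\mathcal{V}_T = \Bigl\| \Bigl[ \sum_{t=1}^{T} g_t(x_t) \Bigr]_{+} \Bigr\|
```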

Introduction
  • Online optimization covers a large number of problems in which information is revealed incrementally and irrevocable decisions must be made at each step in the face of uncertainty about future information [1,2,3,4,5].
  • In the regret analysis framework, at each round, the learner has to commit to an action before observing the corresponding reward function, and the goal is to design algorithms whose total accumulated reward differs sub-linearly from the reward of the best fixed action in hindsight.
Highlights
  • Online optimization covers a large number of problems in which information is revealed incrementally and irrevocable decisions must be made at each step in the face of uncertainty about future information [1,2,3,4,5]
  • We focus on a general class of online optimization problems where the reward functions {f_t}_{t=1}^T are monotone DR-submodular and are chosen adversarially
  • We propose our first algorithm, which can be applied to online DR-submodular maximization problems with either adversarial or stochastic constraints, without prior information about the regime
  • We studied an online optimization problem in which the reward functions are monotone DR-submodular and, in addition, the sequence of decisions of the learner should satisfy some adversarially or stochastically varying monotone convex constraints on average
  • We propose a single algorithm for both adversarial and stochastic constraints, without prior knowledge of the regime
  • In the special case of linear constraint functions, our proposed algorithm obtains improved regret and constraint violation bounds in both adversarial and stochastic settings compared to prior work
Methods
  • In order to verify the theoretical findings, the authors run the algorithms for the three experiments described in Section 3.1 and plot the performance in Figure 1. 1) Online joke recommendation.
  • The authors choose n = 100 jokes, T = 10000, and B_T = 1.5T.
  • The authors vary the window length W and choose V, α, and K according to Section 4.
  • The authors consider utility functions of the form f_t(x) = r_t^T x + ∑_{i,j: i…
  • [p_t]_i is chosen uniformly from the range [0.3, 6]; a minimal sketch of this setup follows below.
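The following is a hypothetical sketch of the joke-recommendation setup above, using only the facts stated here (n = 100, T = 10000, B_T = 1.5T, [p_t]_i uniform on [0.3, 6]); the reward model for r_t and the per-round budget constraint g_t are assumptions, and only the linear part of f_t is shown:

```python
import numpy as np

n, T = 100, 10_000                 # n jokes, T rounds, as stated above
B_T = 1.5 * T                      # total budget B_T = 1.5T

rng = np.random.default_rng(0)

def sample_round():
    """One round's data: reward vector r_t and per-joke prices p_t."""
    r_t = rng.random(n)                    # assumed reward model (not specified above)
    p_t = rng.uniform(0.3, 6.0, size=n)    # [p_t]_i ~ Uniform[0.3, 6], as in the text
    return r_t, p_t

r_t, p_t = sample_round()
x_t = rng.random(n)                        # a fractional recommendation in [0, 1]^n
reward = r_t @ x_t                         # linear part of f_t(x) = r_t^T x + ...
g_t = p_t @ x_t - B_T / T                  # assumed constraint: spend B_T/T per round on average
print(reward, g_t)
```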
Conclusion
  • The authors studied an online optimization problem in which the reward functions are monotone DR-submodular and, in addition, the sequence of decisions of the learner should satisfy some adversarially or stochastically varying monotone convex constraints on average.
  • In the special case of linear constraint functions, the proposed algorithm obtains improved regret and constraint violation bounds in both adversarial and stochastic settings compared to prior work.
  • Broader Impact
  • This theoretical paper studies online, sequential decision making with rewards and limited resources/budgets, with broad applications.
  • There are many online resource allocation problems that can be cast in this framework, and the authors believe that this work does not raise any potential ethical concerns
Tables
  • Table 1: Prior results for online problems with adversarial cumulative constraints in various settings. Note that in (a), V ∈ (W, T) is a tunable parameter
Related work
  • Online submodular maximization. Consider an online unconstrained optimization problem in which the reward functions are monotone DR-submodular. [18] proposed the Meta-Frank-Wolfe algorithm for this problem and obtained an O(√T) regret bound against the (1 − 1/e) approximation to the best fixed decision in hindsight, where (1 − 1/e) is the best polynomial-time approximation ratio in the offline setting. The Meta-Frank-Wolfe algorithm requires access to the full gradient of the reward functions and performs O(√T) gradient evaluations per step. More recently, [19] generalized this algorithm to the setting where only stochastic gradient estimates are available. Moreover, [20] proposed the Mono-Frank-Wolfe algorithm, which performs only one gradient evaluation per round and requires only unbiased estimates of the gradient; a generic sketch of the underlying Frank-Wolfe step is given below.
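For reference, the following is a generic sketch of the offline Frank-Wolfe (continuous-greedy) scheme underlying these online algorithms; `grad` and `lmo` are assumed callbacks and the toy objective is illustrative (the online variants of [18, 20] instead obtain the inner directions from no-regret online linear optimizers):

```python
import numpy as np

def frank_wolfe_dr_submodular(grad, lmo, n, K=100):
    """Continuous-greedy sketch for monotone DR-submodular maximization.

    grad(x) -> (estimate of) the gradient of F at x
    lmo(d)  -> argmax over the feasible set of <d, v>  (linear maximization oracle)

    Starting from 0 and averaging K oracle directions yields the classic
    (1 - 1/e)-approximation guarantee for monotone DR-submodular F.
    """
    x = np.zeros(n)
    for _ in range(K):
        v = lmo(grad(x))   # best feasible ascent direction at the current point
        x += v / K         # small step; x stays inside a down-closed feasible set
    return x

# Toy usage: F(x) = sum_i log(1 + x_i) is monotone DR-submodular on [0, 1]^n.
n = 5
grad = lambda x: 1.0 / (1.0 + x)
lmo = lambda d: (d > 0).astype(float)   # feasible set here: the box [0, 1]^n
print(frank_wolfe_dr_submodular(grad, lmo, n))
```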

    Online optimization with adversarial constraints. Online convex optimization with constraints, where both the convex objective functions {f_t}_{t=1}^T and the convex constraint functions {g_t}_{t=1}^T can vary arbitrarily, was first studied by [21]. They provided a surprisingly simple counterexample showing that it is not always possible to achieve sub-linear regret against the best fixed benchmark action in hindsight while keeping the total constraint violation sub-linear. Therefore, subsequent works added more assumptions to the problem setting in order to obtain meaningful results. In particular, they required the fixed benchmark action x* to satisfy the long-term constraint (i.e., ∑_{t=1}^T g_t(x*) ≤ 0); a toy sketch of the virtual-queue device common to this line of work is given below.
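To make the long-term-constraint machinery concrete, here is a toy, self-contained sketch of the virtual-queue (drift-plus-penalty) device standard in this literature (e.g., [23, 26]); all names and constants are illustrative, the reward and constraint are simple linear stand-ins, and this is not the paper's exact algorithm, which combines such a queue with Frank-Wolfe-style submodular steps:

```python
import numpy as np

rng = np.random.default_rng(0)
n, T = 5, 1000          # illustrative dimensions
V, eta = 10.0, 0.05     # V trades off regret vs. violation; eta is the step size

x = np.full(n, 0.5)     # decision in the box [0, 1]^n
Q = 0.0                 # virtual queue tracking accumulated constraint violation

for t in range(T):
    r_t = rng.random(n)               # toy linear reward: f_t(x) = r_t @ x
    a_t, b = rng.random(n), 1.0       # toy constraint: g_t(x) = a_t @ x - b <= 0
    # Drift-plus-penalty ascent on V*f_t(x) - Q*g_t(x): the reward pushes x up,
    # the queue pressure pushes it back toward feasibility.
    x = np.clip(x + eta * (V * r_t - Q * a_t), 0.0, 1.0)
    Q = max(Q + (a_t @ x - b), 0.0)   # Q_{t+1} = max(Q_t + g_t(x_t), 0)

print(Q / T)   # residual queue per round; sub-linear growth means small average violation
```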
Funding
  • Acknowledgments and Disclosure of Funding: This work was supported in part by the following grants: NSF TRIPODS grant 1740551, DARPA Lagrange grant FA8650-18-2-7836, and ONR MURI grant N0014-16-1-2710.
References
  • Niv Buchbinder, Kamal Jain, and Joseph Seffi Naor. Online primal-dual algorithms for maximizing ad-auctions revenue. In Proceedings of the 15th Annual European Conference on Algorithms, ESA’07, page 253–264, Berlin, Heidelberg, 2007. Springer-Verlag.
  • Aranyak Mehta, Amin Saberi, Umesh Vazirani, and Vijay Vazirani. Adwords and generalized online matching. J. ACM, 54(5):22–es, October 2007.
  • Niv Buchbinder and Joseph (Seffi) Naor. Online primal-dual algorithms for covering and packing. Mathematics of Operations Research, 34(2):270–286, 2009.
  • Martin Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the Twentieth International Conference on International Conference on Machine Learning, ICML’03, page 928–935. AAAI Press, 2003.
  • Peter Auer, Nicolò Cesa-Bianchi, Yoav Freund, and Robert E. Schapire. The nonstochastic multiarmed bandit problem. SIAM J. Comput., 32(1):48–77, January 2003.
  • Sébastien Bubeck. Introduction to online optimization. Lecture Notes, 2, 2011.
  • Shai Shalev-Shwartz. Online learning and online convex optimization. Foundations and Trends in Machine Learning, 4(2):107–194, 2012.
  • Elad Hazan. Introduction to online convex optimization. Foundations and Trends in Optimization, 2(3-4):157–325, 2016.
  • Niv Buchbinder and Joseph (Seffi) Naor. The design of competitive online algorithms via a primal-dual approach. Found. Trends Theor. Comput. Sci., 3(2–3):93–263, February 2009.
  • Shipra Agrawal and Nikhil R. Devanur. Bandits with concave rewards and convex knapsacks. In Proceedings of the Fifteenth ACM Conference on Economics and Computation, EC ’14, page 989–1006, New York, NY, USA, 2014. Association for Computing Machinery.
  • Shipra Agrawal and Nikhil R. Devanur. Fast algorithms for online stochastic convex programming. In Proceedings of the Twenty-Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA ’15, page 1405–1424, USA, 2015. Society for Industrial and Applied Mathematics.
  • Ashwinkumar Badanidiyuru, Robert Kleinberg, and Aleksandrs Slivkins. Bandits with knapsacks. J. ACM, 65(3), March 2018.
  • Santiago R. Balseiro and Yonatan Gur. Learning in repeated auctions with budgets: Regret minimization and equilibrium. Management Science, 65(9):3952–3968, 2019.
  • Adish Singla and Andreas Krause. Truthful incentives in crowdsourcing tasks using regret minimization mechanisms. In Proceedings of the 22nd International Conference on World Wide Web, WWW ’13, page 1167–1178, New York, NY, USA, 2013. Association for Computing Machinery.
  • Nikolaos Liakopoulos, Apostolos Destounis, Georgios Paschos, Thrasyvoulos Spyropoulos, and Panayotis Mertikopoulos. Cautious regret minimization: Online optimization with long-term budget constraints. In Kamalika Chaudhuri and Ruslan Salakhutdinov, editors, Proceedings of the 36th International Conference on Machine Learning, volume 97 of Proceedings of Machine Learning Research, pages 3944–3952, Long Beach, California, USA, 09–15 Jun 2019. PMLR.
  • Omid Sadeghi and Maryam Fazel. Online continuous dr-submodular maximization with longterm budget constraints. In International Conference on Artificial Intelligence and Statistics, pages 4410–4419. PMLR, 2020.
  • Prasanna Sanjay Raut, Omid Sadeghi, and Maryam Fazel. Online dr-submodular maximization with stochastic cumulative constraints. arXiv preprint arXiv:2005.14708, 2020.
  • Lin Chen, Hamed Hassani, and Amin Karbasi. Online continuous submodular maximization. In Amos Storkey and Fernando Perez-Cruz, editors, Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, volume 84 of Proceedings of Machine Learning Research, pages 1896–1905, Playa Blanca, Lanzarote, Canary Islands, 09–11 Apr 2018. PMLR.
  • Lin Chen, Christopher Harshaw, Hamed Hassani, and Amin Karbasi. Projection-free online optimization with stochastic gradient: From convexity to submodularity. In Jennifer Dy and Andreas Krause, editors, Proceedings of the 35th International Conference on Machine Learning, volume 80 of Proceedings of Machine Learning Research, pages 814–823, Stockholmsmässan, Stockholm Sweden, 10–15 Jul 2018. PMLR.
  • Mingrui Zhang, Lin Chen, Hamed Hassani, and Amin Karbasi. Online continuous submodular maximization: From full-information to bandit feedback. In Advances in Neural Information Processing Systems, pages 9210–9221, 2019.
  • Shie Mannor, John N. Tsitsiklis, and Jia Yuan Yu. Online learning with sample path constraints. Journal of Machine Learning Research, 10(20):569–590, 2009.
  • Wen Sun, Debadeepta Dey, and Ashish Kapoor. Safety-aware algorithms for adversarial contextual bandit. In Doina Precup and Yee Whye Teh, editors, Proceedings of the 34th International Conference on Machine Learning, volume 70 of Proceedings of Machine Learning Research, pages 3280–3288, International Convention Centre, Sydney, Australia, 06–11 Aug 2017. PMLR.
  • Michael J. Neely and Hao Yu. Online Convex Optimization with Time-Varying Constraints. arXiv:1702.04783 [math], February 2017.
  • X. Cao and K. J. R. Liu. Online convex optimization with time-varying constraints and bandit feedback. IEEE Transactions on Automatic Control, 64(7):2665–2680, 2019.
  • T. Chen, Q. Ling, and G. B. Giannakis. An online convex optimization approach to proactive network resource allocation. IEEE Transactions on Signal Processing, 65(24):6350–6364, 2017.
  • Hao Yu, Michael J. Neely, and Xiaohan Wei. Online convex optimization with stochastic constraints. In Proceedings of the 31st International Conference on Neural Information Processing Systems, NIPS’17, page 1427–1437, Red Hook, NY, USA, 2017. Curran Associates Inc.
  • Xiaohan Wei, Hao Yu, and Michael J Neely. Online primal-dual mirror descent under stochastic constraints. arXiv preprint arXiv:1908.00305, 2019.
  • Andrew An Bian, Baharan Mirzasoleiman, Joachim Buhmann, and Andreas Krause. Guaranteed Non-convex Optimization: Submodular Maximization over Continuous Domains. In Aarti Singh and Jerry Zhu, editors, Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, volume 54 of Proceedings of Machine Learning Research, pages 111–120, Fort Lauderdale, FL, USA, 20–22 Apr 2017. PMLR.
  • Gruia Calinescu, Chandra Chekuri, Martin Pál, and Jan Vondrák. Maximizing a monotone submodular function subject to a matroid constraint. SIAM Journal on Computing, 40(6):1740–1766, 2011.
  • Hui Lin and Jeff Bilmes. A class of submodular functions for document summarization. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies - Volume 1, HLT ’11, page 510–520, USA, 2011. Association for Computational Linguistics.
  • Y. Azar, N. Buchbinder, T. H. Chan, S. Chen, I. R. Cohen, A. Gupta, Z. Huang, N. Kang, V. Nagarajan, J. Naor, and D. Panigrahi. Online algorithms for covering and packing problems with convex objectives. In 2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS), pages 148–157, 2016.
  • TH Chan, Zhiyi Huang, and Ning Kang. Online convex covering and packing problems. arXiv preprint arXiv:1502.01802, 2015.
  • Shai Shalev-Shwartz, Yoram Singer, and Nathan Srebro. Pegasos: Primal estimated sub-gradient solver for svm. In Proceedings of the 24th International Conference on Machine Learning, ICML ’07, page 807–814, New York, NY, USA, 2007. Association for Computing Machinery.
  • Nathan Srebro, Karthik Sridharan, and Ambuj Tewari. Smoothness, low-noise and fast rates. In Proceedings of the 23rd International Conference on Neural Information Processing Systems - Volume 2, NIPS’10, page 2199–2207, Red Hook, NY, USA, 2010. Curran Associates Inc.
  • John Duchi, Elad Hazan, and Yoram Singer. Adaptive subgradient methods for online learning and stochastic optimization. J. Mach. Learn. Res., 12:2121–2159, July 2011.
  • Lijun Zhang, Jinfeng Yi, Rong Jin, Ming Lin, and Xiaofei He. Online kernel learning with a near optimal sparsity bound. In International Conference on Machine Learning, pages 621–629, 2013.
  • Eric Hall and Rebecca Willett. Dynamical models and tracking regret in online convex programming. In International Conference on Machine Learning, pages 579–587, 2013.
  • Lijun Zhang, Shiyin Lu, and Zhi-Hua Zhou. Adaptive online learning in dynamic environments. In S. Bengio, H. Wallach, H. Larochelle, K. Grauman, N. Cesa-Bianchi, and R. Garnett, editors, Advances in Neural Information Processing Systems 31, pages 1323–1333. Curran Associates, Inc., 2018.
  • A. Mokhtari, S. Shahrampour, A. Jadbabaie, and A. Ribeiro. Online optimization in dynamic environments: Improved regret rates for strongly convex problems. In 2016 IEEE 55th Conference on Decision and Control (CDC), pages 7195–7201, 2016.
  • Ali Jadbabaie, Alexander Rakhlin, Shahin Shahrampour, and Karthik Sridharan. Online Optimization: Competing with Dynamic Comparators. In Guy Lebanon and S. V. N. Vishwanathan, editors, Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, volume 38 of Proceedings of Machine Learning Research, pages 398–406, San Diego, California, USA, 09–12 May 2015. PMLR.
  • Tianbao Yang, Lijun Zhang, Rong Jin, and Jinfeng Yi. Tracking slowly moving clairvoyant: Optimal dynamic regret of online learning with true and noisy gradient. In Maria Florina Balcan and Kilian Q. Weinberger, editors, Proceedings of The 33rd International Conference on Machine Learning, volume 48 of Proceedings of Machine Learning Research, pages 449–457, New York, New York, USA, 20–22 Jun 2016. PMLR.
  • M. J. Neely. Universal scheduling for networks with arbitrary traffic, channels, and mobility. In 49th IEEE Conference on Decision and Control (CDC), pages 1822–1829, 2010.
Authors
Omid Sadeghi
Prasanna Raut
Maryam Fazel