# Dinkelbach-Type Algorithm for Computing Quantal Stackelberg Equilibrium

IJCAI, pp. 246–253, 2020.

Abstract:

Stackelberg security games (SSGs) have been deployed in many real-world situations to optimally allocate scarce resources to protect targets against attackers. However, actual human attackers are not perfectly rational, and there are several behavior models that attempt to predict subrational behavior. Quantal response is among the most com…


Introduction

- Game-theoretic algorithms have been used for improving physical security, protecting wildlife in natural parks [Fang et al., 2017], or beating human professionals in poker [Moravčík et al., 2017; Brown and Sandholm, 2018].
- Quantal response (QR) is used as the response function for human players, and the desired solution concept is termed Quantal Stackelberg Equilibrium (QSE).
- Existing SSG algorithms require the problem to be formulated in terms of allocating limited resources to a set of targets.
- This formulation is often impossible, e.g., for classical games from economics.
- In order to solve real-world problems beyond SSGs, the authors study optimal behavior against a quantal response opponent in more general models of normal-form games.
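The quantal response model underlying QSE can be sketched as a logit (softmax) choice rule: the follower plays each action with probability proportional to the exponential of its expected utility. A minimal sketch, where the rationality parameter `lam` and the payoff values are illustrative and not taken from the paper:

```python
import numpy as np

def quantal_response(x, F, lam=1.0):
    """Logit quantal response of the follower.

    x   -- leader's mixed strategy over rows (sums to 1)
    F   -- follower's payoff matrix, F[i, j] = follower's payoff when
           the leader plays row i and the follower plays column j
    lam -- rationality: lam -> 0 gives uniform play, lam -> inf best response
    """
    u = x @ F                        # follower's expected utility per action
    z = np.exp(lam * (u - u.max()))  # shift by the max for numerical stability
    return z / z.sum()

# Example: under a uniform leader strategy in this 2x2 game,
# both follower actions are equally attractive.
x = np.array([0.5, 0.5])
F = np.eye(2)
print(quantal_response(x, F, lam=2.0))  # -> [0.5, 0.5]
```

As `lam` grows, the distribution concentrates on the best response, recovering the perfectly rational follower as a limiting case.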

Highlights

- Game-theoretic algorithms have been used for improving physical security, protecting wildlife in natural parks [Fang et al., 2017], or beating human professionals in poker [Moravčík et al., 2017; Brown and Sandholm, 2018]
- Quantal response is used as the response function for human players, and the desired solution concept is termed Quantal Stackelberg Equilibrium (QSE)
- An optimal strategy of the rational player against such a subrational opponent is described by a leader-follower solution concept: the Quantal Stackelberg Equilibrium
- The results show that with the increasing size of the game, the speedup of the Dinkelbach-type algorithm increases, being up to 25.5-times faster than one restart of gradient ascent for games with 7500 leader actions. This suggests that for even larger games, the Dinkelbach-type algorithm should perform significantly better than gradient ascent
- We introduced a Dinkelbach-type formulation of computing a boundedly-rational quantal Stackelberg equilibrium in normal-form games
- In contrast to the direct formulation, the Dinkelbach formulation has both theoretical advantages and positive computational consequences – it offers up to a 25.5-times speedup over the original formulation

Methods

- The authors demonstrate practical aspects of proposed algorithms for computing QSE in NFGs. As a benchmark, the authors use the original formulation solved by gradient ascent (GA).
- The authors compare it to the Dinkelbach-type algorithm (DTA) – Algorithm 1 with subproblems solved via a substitutional piecewise-linear approximation (PWLA).
- All implementations were done in C++17.
- The authors used the SLSQP implementation from the NLOPT 2.6.1 non-linear optimization library for the GA baseline.
- Because the algorithm is domain-independent, the authors used Randomly Generated Games (RGGs) for evaluation.
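The Dinkelbach transformation behind DTA handles a fractional objective max_x N(x)/D(x) – under logit quantal response, the leader's expected utility is exactly such a ratio of exponentially weighted sums – by solving a sequence of parametrized subproblems F(p) = max_x [N(x) − p·D(x)] and driving F(p) to zero. The sketch below is illustrative only: it replaces the paper's PWLA/MILP subproblem solver with a brute-force search over a finite set of candidate leader strategies, and the payoff matrices in the test are made up:

```python
import numpy as np

def qse_value_parts(x, L, F, lam=1.0):
    """Numerator N(x) and denominator D(x) of the leader's expected
    utility against a logit quantal-response follower.
    L, F: leader's and follower's payoff matrices (rows = leader actions)."""
    u = x @ F                     # follower's expected utility per action
    w = np.exp(lam * u)           # unnormalized quantal-response weights
    return (x @ L) @ w, w.sum()   # N(x) = sum_j EU_j * w_j,  D(x) = sum_j w_j

def dinkelbach(L, F, candidates, lam=1.0, tol=1e-9, max_iter=100):
    """Dinkelbach-type iteration: find p with max_x N(x) - p*D(x) = 0.
    `candidates` is a finite set of leader strategies, a stand-in for the
    paper's piecewise-linear-approximation subproblem solver."""
    p = 0.0
    for _ in range(max_iter):
        vals = [qse_value_parts(x, L, F, lam) for x in candidates]
        k = int(np.argmax([n - p * d for n, d in vals]))
        n, d = vals[k]
        if abs(n - p * d) < tol:      # F(p) ~ 0: p is the optimal ratio
            return p, candidates[k]
        p = n / d                     # classical Dinkelbach update
    return p, candidates[k]
```

Since D(x) is a sum of exponentials and hence strictly positive, each update p = N(x)/D(x) increases p monotonically, and the iteration converges to the maximal ratio; this is the property that replaces the direct optimization of the fractional objective.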

Results

- The results show that the speedup of the DTA increases with game size, reaching up to 25.5-times faster than one restart of GA for games with 7500 leader actions.
- The number of restarts of the GA required to reach a deviation from the DTA’s solution less than 1% is shown in Table 1.
- The quality of these solutions was often worse than that of GA's solutions.

Conclusion

- The authors introduced a Dinkelbach-type formulation of computing a boundedly-rational quantal Stackelberg equilibrium in normal-form games.
- In contrast to the direct formulation, the Dinkelbach formulation has both theoretical advantages and positive computational consequences – it offers up to a 25.5-times speedup over the original formulation.


- Table 1: The expected number of GA restarts needed to reach 1% deviation from the DTA solution. All exp response functions required > 20 restarts.

Funding

- This research is supported by the SIMTech-NTU Joint Laboratory on Complex Systems, the Czech Science Foundation (grant nos. 18-27483Y and 19-24384Y), and by the OP VVV MEYS funded project CZ.02.1.01/0.0/0.0/16_019/0000765 "Research Center for Informatics".
- Bo An is partially supported by the Singtel Cognitive and Artificial Intelligence Lab for Enterprises (SCALE@NTU), a collaboration between Singapore Telecommunications Limited (Singtel) and Nanyang Technological University (NTU) funded by the Singapore Government through the Industry Alignment Fund – Industry Collaboration Projects Grant.

References

- [Boyd and Vandenberghe, 2004] Stephen Boyd and Lieven Vandenberghe. Convex Optimization. Cambridge University Press, 2004.
- [Brown and Sandholm, 2018] Noam Brown and Tuomas Sandholm. Superhuman AI for heads-up no-limit poker: Libratus beats top professionals. Science, 359(6374):418–424, 2018.
- [Camerer, 2011] Colin F Camerer. Behavioral Game Theory: Experiments in Strategic Interaction. Princeton University Press, 2011.
- [Delle Fave et al., 2014] Francesco Maria Delle Fave, Albert Xin Jiang, Zhengyu Yin, Chao Zhang, Milind Tambe, Sarit Kraus, and John P Sullivan. Game-theoretic patrolling with dynamic execution uncertainty and a case study on a real transit system. Journal of Artificial Intelligence Research, 50:321–367, 2014.
- [Dinkelbach, 1967] Werner Dinkelbach. On nonlinear fractional programming. Management Science, 13(7):492–498, 1967.
- [Fang et al., 2017] Fei Fang, Thanh H Nguyen, Rob Pickles, Wai Y Lam, Gopalasamy R Clements, Bo An, Amandeep Singh, Brian C Schwedock, Milind Tambe, and Andrew Lemieux. PAWS - a deployed game-theoretic application to combat poaching. AI Magazine, 2017.
- [Hastie, 2017] Trevor J Hastie. Generalized additive models. In Statistical Models in S, pages 249–307. Routledge, 2017.
- [Ibaraki, 1976] Toshihide Ibaraki. Integer programming formulation of combinatorial optimization problems. Discrete Mathematics, 16(1):39–52, 1976.
- [Kahneman and Tversky, 2013] Daniel Kahneman and Amos Tversky. Prospect theory: An analysis of decision under risk. In Handbook of the Fundamentals of Financial Decision Making: Part I, pages 99–127. World Scientific, 2013.
- [Kolodziej et al., 2013] Scott Kolodziej, Pedro M Castro, and Ignacio E Grossmann. Global optimization of bilinear programs with a multiparametric disaggregation technique. Journal of Global Optimization, 57(4):1039–1063, 2013.
- [McCormick, 1976] Garth P McCormick. Computability of global solutions to factorable nonconvex programs: Part I - Convex underestimating problems. Mathematical Programming, 10(1):147–175, 1976.
- [McFadden, 1976] Daniel L. McFadden. Quantal choice analysis: A survey. In Annals of Economic and Social Measurement, Volume 5, number 4, pages 363–390. NBER, 1976.
- [McKelvey and Palfrey, 1995] Richard D. McKelvey and Thomas R. Palfrey. Quantal response equilibria for normal form games. Games and Economic Behavior, 10(1):6–38, 1995.
- [Moravčík et al., 2017] Matej Moravčík, Martin Schmid, Neil Burch, Viliam Lisý, Dustin Morrill, Nolan Bard, Trevor Davis, Kevin Waugh, Michael Johanson, and Michael Bowling. DeepStack: Expert-level artificial intelligence in no-limit poker. Science, 2017.
- [Nguyen et al., 2016] Thanh H Nguyen, Arunesh Sinha, Shahrzad Gholami, Andrew Plumptre, Lucas Joppa, Milind Tambe, Margaret Driciru, Fred Wanyama, Aggrey Rwetsiba, Rob Critchlow, et al. Capture: A new predictive anti-poaching tool for wildlife protection. In Proceedings of the 15th International Conference on Autonomous Agents & Multiagent Systems, pages 767–775, 2016.
- [Nudelman et al., 2004] Eugene Nudelman, Jennifer Wortman, Yoav Shoham, and Kevin Leyton-Brown. Run the GAMUT: A comprehensive approach to evaluating game-theoretic algorithms. In Proceedings of the 3rd International Joint Conference on Autonomous Agents and Multiagent Systems, volume 4, pages 880–887, 2004.
- [Pita et al., 2008] James Pita, Manish Jain, Janusz Marecki, Fernando Ordonez, Christopher Portway, Milind Tambe, Craig Western, Praveen Paruchuri, and Sarit Kraus. Deployed ARMOR protection: The application of a game theoretic model for security at the Los Angeles International Airport. In Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems, pages 125–132, 2008.
- [Potts, 1999] William JE Potts. Generalized additive neural networks. In Proceedings of the 5th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 194–200. ACM, 1999.
- [Tambe, 2011] Milind Tambe. Security and Game Theory: Algorithms, Deployed Systems, Lessons Learned. Cambridge University Press, New York, NY, USA, 2011.
- [Vielma et al., 2010] Juan Pablo Vielma, Shabbir Ahmed, and George Nemhauser. Mixed-integer models for nonseparable piecewise-linear optimization: Unifying framework and extensions. Operations Research, 58(2):303–315, 2010.
- [Wahba, 1990] Grace Wahba. Spline Models for Observational Data, volume 59. SIAM, 1990.
- [Yang et al., 2012] Rong Yang, Fernando Ordonez, and Milind Tambe. Computing optimal strategy against quantal response in security games. In Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems-Volume 2, pages 847–854, 2012.
- [Yano et al., 2013] Masayuki Yano, James Douglass Penn, George Konidaris, and Anthony T Patera. Math, Numerics & Programming (for Mechanical Engineers). MIT, 2013.
- [Yechiam and Hochman, 2013] Eldad Yechiam and Guy Hochman. Losses as modulators of attention: Review and analysis of the unique effects of losses over gains. Psychological Bulletin, 139(2):497, 2013.
