Chun Kai Ling

Scalable Learning and Solving of Extensive-Form Games
Degree Type: Ph.D. in Computer Science
Advisor(s): J. Zico Kolter
Graduated: August 2023

Abstract:

Strategic interactions between agents can be formalized by game theory. In recent years, modern learning techniques, coupled with the ubiquity of data and the proliferation of compute, have paved the way towards computationally solving large games, with broad applications ranging from poker to security. However, several challenges arise when considering the current state of the art in large-scale game solving: (i) game parameters in many realistic settings are often unspecified; and (ii) there is a notable lack of scalable solvers for large, general-sum extensive-form games (most existing approaches being limited to zero-sum games).

In this thesis, we tackle these two challenges in three ways. First, we study the inverse problem in game theory, where game parameters are to be learnt given only samples drawn from the game's quantal response equilibrium. An end-to-end, differentiable game-solving module fully compatible with modern deep learning architectures is presented. This approach allows game parameters of two-player zero-sum normal-form and extensive-form games to be learnt scalably via gradient descent. Second, we extend state-of-the-art online Nash solvers currently used in zero-sum games to Stackelberg and correlated equilibria in general-sum extensive-form games. In both cases, we present efficient online search algorithms which avoid computation in branches of the game tree not encountered in actual play. These algorithms allow approximate solving of large general-sum games that offline solvers struggle with, while still providing guarantees on performance. Third, we show how to solve Stackelberg equilibrium in large perfect-information general-sum games by learning Enforceable Payoff Functions, which capture tradeoffs in utility between players, rather than the usual scalar values. We demonstrate how to learn these functions using a variant of fitted value iteration, and quantify the suboptimality incurred by errors in function approximation.
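To illustrate the first contribution, the following is a minimal sketch (not the thesis implementation, which differentiates through the equilibrium implicitly rather than by unrolling a solver): it fits the payoff matrix of a two-player zero-sum normal-form game to sampled quantal-response play by gradient descent. All names here (solve_qre, nll, true_P) and the damped fixed-point solver are illustrative assumptions.

import jax
import jax.numpy as jnp

def solve_qre(P, iters=300, damping=0.5):
    # Approximate logit quantal response equilibrium (temperature 1) of the
    # zero-sum game with payoff matrix P; the row player minimizes u^T P v.
    m, n = P.shape
    u, v = jnp.ones(m) / m, jnp.ones(n) / n
    for _ in range(iters):
        u_new = jax.nn.softmax(-P @ v)       # entropy-regularized best responses
        v_new = jax.nn.softmax(P.T @ u)
        u = damping * u + (1.0 - damping) * u_new
        v = damping * v + (1.0 - damping) * v_new
    return u, v

def nll(P_hat, rows, cols):
    # Negative log-likelihood of observed equilibrium actions under P_hat.
    u, v = solve_qre(P_hat)
    return -(jnp.log(u)[rows].sum() + jnp.log(v)[cols].sum())

k1, k2, k3 = jax.random.split(jax.random.PRNGKey(0), 3)
true_P = jax.random.normal(k1, (3, 3))            # hypothetical ground-truth game
u_star, v_star = solve_qre(true_P)
rows = jax.random.choice(k2, 3, (500,), p=u_star) # sampled equilibrium play
cols = jax.random.choice(k3, 3, (500,), p=v_star)

P_hat = jnp.zeros((3, 3))
grad_fn = jax.jit(jax.grad(nll))
for _ in range(200):                              # gradient descent on the payoffs
    P_hat = P_hat - 0.05 * grad_fn(P_hat, rows, cols)

Because the solver is written in a differentiable framework, gradients of the fit to observed play flow back to the unknown payoffs, which is the sense in which the game-solving module composes with standard deep learning pipelines.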

Thesis Committee:
J. Zico Kolter (Co-chair)
Fei Fang (Co-chair)
Vincent Conitzer
Tuomas Sandholm
Michael Bowling (University of Alberta)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

Keywords:
Game Theory, Machine Learning, Multiagent Decision Making

CMU-CS-23-121.pdf (1.65 MB) (150 pages)