Ellen Vitercik

Automated algorithm and mechanism configuration Degree Type: Ph.D. in Computer Science
Advisor(s): Maria-Florina Balcan, Tuomas Sandholm
Graduated: August 2021

Abstract:

Algorithms from diverse domains often have tunable parameters that have a significant impact on performance metrics such as solution quality, runtime, and memory usage. Typically, the optimal setting depends intimately on the application domain at hand. Hand-tuning parameters is often tedious, time-consuming, and error-prone, so a burgeoning line of research has studied automated algorithm configuration via machine learning, where a training set of typical problem instances from the application at hand is used to find high-performing parameter settings.

In this thesis, we help develop the theory and practice of automated algorithm configuration. We investigate both statistical and algorithmic questions. For ex ample, how large should the training set be to ensure that an algorithm's average performance over the training set is indicative of its future performance on unseen instances? How can we algorithmically find provably high-performing configurations? As we answer these questions, we analyze parameterized algorithms from a variety of domains, including:

  1. Integer programming. We study branch-and-bound algorithms for integer linear programming, the most widely-used tools for solving combinatorial and non- convex problems. Beyond answering the algorithmic and statistical questions above, we provide experiments demonstrating that no one parameter setting is optimal across all application domains, and that tuning parameters using our approach can have a significant impact on algorithmic performance. We also analyze integer quadratic programming approximation algorithms, which can be used to find nearly-optimal solutions to a variety of combinatorial problems.
  2. Mechanism design. A mechanism is a special type of algorithm that plays a crucial role in economics and political science. A mechanism's purpose is to help a set of agents come to a collective decision. In economics, for example, a mechanism might dictate how a set of items should be split among the agents, given their values for those items. Mechanisms often have tunable parameters that impact, for example, their revenue. No one setting is optimal across all mechanism design scenarios. In this thesis, we analyze sales mechanisms, where the goal is to maximize revenue, and voting mechanisms, where the goal is to maximize the agents' total value for the outcome the mechanism selects.
  3. Computational biology. Algorithms from computational biology are often highly parameterized, and understanding which parameter settings are optimal in which scenarios is an active area of research. We analyze parameterized algorithms for several fundamental problems from computational biology, including sequence alignment, RNA folding, and predicting topologically associating domains in DNA sequences.

The key challenge from both an algorithmic and statistical perspective is that across these three diverse domains, an algorithm's performance is an erratic function of its parameters. This is because a small tweak to the parameters can cause a cascade of changes in the algorithm's performance. We develop tools for analyzing and optimizing these volatile performance functions, which we use to provide parameter tuning procedures and strong statistical guarantees.

Thesis Committee:
Maria-Florina Balcan (Co-Chair)
Tuomas Sandholm (Co-Chair)
Ameet Talwalkar
Eric Horvitz (Microsoft)
Kevin Leyton-Brown (University of British Columbia)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

Keywords:
Automated algorithm design, data-driven algorithm design, automated algorithm configuration, sample complexity, machine learning theory, mechanism design

CMU-CS-21-125.pdf (11.31 MB) ( 271 pages)
Copyright Notice