Leslie Rice

Methods for robust training and evaluation of deep neural networks
Degree Type: Ph.D. in Computer Science
Advisor(s): J. Zico Kolter
Graduated: May 2023

Abstract:

As machine learning systems are deployed in real-world, safety-critical applications, it becomes increasingly important to ensure these systems are robust and trustworthy. The study of machine learning robustness gained significant interest with the discovery of the brittle nature of deep neural networks. Intrigue and concern about this behavior have resulted in a significant body of work on adversarial robustness, which studies a model's performance on worst-case perturbed inputs, known as adversarial examples. In the first chapter of this thesis, we present improvements to adversarial training methods for developing empirically robust deep networks. First, we show that with certain modifications, adversarial training using the fast gradient sign method can produce models that are significantly more robust than previously thought possible, while retaining a much lower training cost than alternative adversarial training methods. We then discuss our findings on the harmful effects of overfitting during adversarial training, and show that validation-based early stopping can drastically improve the robust test performance of an adversarially trained model.
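The following is a minimal PyTorch sketch of the two ideas described above: fast-gradient-sign-method adversarial training with a random initial perturbation, and validation-based early stopping on robust accuracy. It is an illustration only; the model, data loaders, epsilon, step size, and the use of FGSM for validation are illustrative assumptions, not the exact settings or evaluation protocol used in the thesis.

# Sketch: FGSM ("fast") adversarial training with random initialization,
# plus validation-based early stopping on robust accuracy.
import torch
import torch.nn.functional as F


def fgsm_perturb(model, x, y, eps, alpha):
    # Start from a random point in the eps-ball, take one signed gradient
    # step of size alpha, then project back onto the eps-ball.
    delta = torch.empty_like(x).uniform_(-eps, eps).requires_grad_(True)
    loss = F.cross_entropy(model(x + delta), y)
    grad = torch.autograd.grad(loss, delta)[0]
    delta = (delta + alpha * grad.sign()).clamp(-eps, eps).detach()
    return (x + delta).clamp(0, 1)


def train_with_early_stopping(model, train_loader, val_loader,
                              epochs=30, eps=8 / 255, alpha=10 / 255, lr=0.1):
    opt = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    best_robust_acc, best_state = 0.0, None
    for _ in range(epochs):
        model.train()
        for x, y in train_loader:
            x_adv = fgsm_perturb(model, x, y, eps, alpha)
            loss = F.cross_entropy(model(x_adv), y)
            opt.zero_grad()
            loss.backward()
            opt.step()
        # Early stopping: track robust accuracy on a held-out validation set
        # (evaluated here with the same FGSM attack for brevity; a stronger
        # attack such as PGD would normally be used).
        model.eval()
        correct, total = 0, 0
        for x, y in val_loader:
            x_adv = fgsm_perturb(model, x, y, eps, alpha)
            with torch.no_grad():
                correct += (model(x_adv).argmax(1) == y).sum().item()
            total += y.numel()
        robust_acc = correct / total
        if robust_acc > best_robust_acc:
            best_robust_acc = robust_acc
            best_state = {k: v.clone() for k, v in model.state_dict().items()}
    # Return the checkpoint with the best robust validation accuracy,
    # rather than the final (possibly overfit) model.
    model.load_state_dict(best_state)
    return model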

An increasing interest in more natural, non-adversarial settings of robustness has led researchers to instead measure robustness in terms of a model's average performance on randomly sampled input corruptions, a notion that also underlies standard data augmentation strategies. In the second chapter of this thesis, we generalize the seemingly separate notions of average- and worst-case robustness under a unifying framework that allows us to evaluate models across a wide spectrum of robustness levels. For practical use, we introduce a path sampling-based method for accurately approximating this intermediate robustness objective. We use this metric to analyze and compare deep networks in zero-shot and fine-tuned settings to better understand the effects of large-scale pre-training and fine-tuning on robustness. We show that we can also train models to intermediate levels of robustness using this objective, and further explore alternative, more efficient training methods that bridge the gap between average- and worst-case robustness.
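To give a concrete flavor of an objective that interpolates between average- and worst-case robustness, the sketch below estimates a q-norm of the loss over randomly sampled corruptions: q = 1 recovers the average-case loss and large q approaches the worst case over the drawn samples. This is a simple Monte Carlo illustration under an assumed Gaussian corruption model; it does not reproduce the specific functional or the path sampling estimator developed in the thesis.

# Sketch: estimating an intermediate robustness metric by aggregating
# per-sample losses over random corruptions with a q-norm.
import torch
import torch.nn.functional as F


def intermediate_robustness(model, x, y, q=4.0, num_samples=64, sigma=0.1):
    losses = []
    with torch.no_grad():
        for _ in range(num_samples):
            noise = sigma * torch.randn_like(x)        # assumed Gaussian corruption
            logits = model((x + noise).clamp(0, 1))
            losses.append(F.cross_entropy(logits, y, reduction="none"))
    losses = torch.stack(losses, dim=0)                # [num_samples, batch]
    # q = 1 gives the average loss; q -> infinity approaches the sample maximum.
    return losses.pow(q).mean(dim=0).pow(1.0 / q)

Sweeping q for a fixed model then traces out a spectrum of evaluations between average-case and (approximately) worst-case robustness.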

Thesis Committee:
J. Zico Kolter (Chair)
Matt Fredrikson
Aditi Raghunathan
Nicholas Carlini (Google DeepMind)

Srinivasan Seshan, Head, Computer Science Department
Martial Hebert, Dean, School of Computer Science

Keywords:
Adversarial robustness, deep learning

CMU-CS-23-108.pdf (2.68 MB, 101 pages)