Computer Science Thesis Oral

Friday, May 5, 2017 - 2:00pm to 3:30pm

Location:

Traffic21 Classroom 6501, Gates Hillman Centers

Speaker:

KRISTEN GARDNER, Ph.D. Student
http://www.cs.cmu.edu/~ksgardne/

Reducing latency is a primary concern in computer systems. As cloud computing and resource sharing become more prevalent, the problem of how to reduce latency becomes more challenging because there is a high degree of variability in server speeds. Recent computer systems research has shown that the same job can take 12 or even 27 times longer to run on one machine than on another, due to varying background load, garbage collection, network contention, and other factors. This server variability is transient and unpredictable, making it hard to know how long a job will take to run on any given server, and therefore how best to dispatch and schedule jobs.

An increasingly popular strategy for combating server variability is redundancy. The idea is to create multiple copies of the same job, dispatch these copies to different servers, and wait for the first copy to complete service. A great deal of empirical computer systems research has demonstrated the benefits of redundancy: using redundancy can yield up to a 50% reduction in mean response time. Unfortunately, there is very little theoretical work analyzing performance in systems with redundancy. This thesis presents the first exact analysis of response time in systems with redundancy.

We begin in the Independent Runtimes (IR) model, in which a job's service times (runtimes) are assumed to be independent across servers. Here we derive exact expressions for the distribution of response time in a certain set of class-based redundancy systems. We also propose two new scheduling policies, Least Redundant First (LRF) and Primaries First (PF), and prove that LRF minimizes overall system response time, while PF is fair across classes of jobs with different redundancy degrees. While the IR model is appropriate in certain settings, in others it does not make sense because the independence assumption eliminates any notion of an "inherent job size."
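The IR model's first-copy-wins dynamic can be illustrated with a minimal simulation sketch. Everything here is an assumption for illustration: the exponential runtime distribution, the rate parameter, and the function names are not from the thesis; the sketch only captures the IR assumption that each copy of a job draws an independent runtime.

```python
import random

def ir_response_time(d, mu=1.0, rng=random):
    """Under the IR model, each of the d copies of a job draws an
    independent Exp(mu) runtime on its server; the job completes
    when its fastest copy finishes."""
    return min(rng.expovariate(mu) for _ in range(d))

def mean_response(d, trials=100_000, seed=0):
    """Estimate mean response time with d redundant copies per job."""
    rng = random.Random(seed)
    return sum(ir_response_time(d, rng=rng) for _ in range(trials)) / trials
```

Because the minimum of d independent Exp(mu) runtimes is Exp(d*mu), the estimated mean shrinks from 1/mu with one copy toward 1/(d*mu) with d copies, so in this model every extra copy appears to help.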
The IR model leads to the conclusion that more redundancy is always better, which often is not true in practice. We propose the S&X model, which is the first model to decouple a job's inherent size (X) from the server slowdown (S). This model is important because, unlike prior models, it allows a job's runtimes to be correlated across servers. The S&X model makes it evident that redundancy does not always help: in fact, too much redundancy can lead to instability. To overcome this, we propose a new dispatching policy, Redundant-to-Idle-Queue, which is provably stable in the S&X model, while offering substantial response time improvements.

Thesis Committee:
Mor Harchol-Balter (Chair)
Guy Blelloch
Anupam Gupta
Alan Scheller-Wolf
Rhonda Righter (University of California, Berkeley)
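The decoupling of inherent size from server slowdown can be sketched as follows. The specific distributions (an Exp(1) inherent size, slowdowns of the form 1 + Exp(2)) and the function names are illustrative assumptions, not the thesis's parameterization; the sketch only shows that when every copy shares the same X, a copy's runtime is X times that server's slowdown, so runtimes are correlated across servers.

```python
import random

def sx_runtime(d, rng):
    """S&X model sketch: one inherent size X per job, shared by all
    of its copies; an independent slowdown S_i >= 1 per server.
    Copy i runs for X * S_i, and the job completes when the fastest
    copy finishes, i.e., after X * min(S_1, ..., S_d)."""
    x = rng.expovariate(1.0)                                    # inherent size X
    slowdowns = [1.0 + rng.expovariate(2.0) for _ in range(d)]  # S_i >= 1
    return x * min(slowdowns)

def mean_sx(d, trials=100_000, seed=1):
    """Estimate mean runtime of a job with d redundant copies."""
    rng = random.Random(seed)
    return sum(sx_runtime(d, rng) for _ in range(trials)) / trials
```

Unlike the IR sketch, extra copies here can only shave off the slowdown factor, never the inherent size: since every slowdown is at least 1, the job always runs for at least X, so an inherently large job stays large no matter how many copies it gets, while each copy still occupies a server.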

For More Information, Contact:

Keywords:

Thesis Oral