Computer Science 5th Years Master's Thesis Presentation - POSTPONED

Location:
In Person - Traffic21 Classroom, Gates Hillman 6501

Speaker:
BRIAN ZHANG - To Be Rescheduled, Master's Student, Computer Science Department, Carnegie Mellon University

Towards an OS for GPUs: Threadblock Scheduling for Deep Learning Workloads

As the year-over-year performance gains of CPUs have stagnated with the death of Moore's Law, GPUs and other data-parallel chips have seen a surge in demand, particularly for use in datacenter deep learning workloads. In spite of the growing demand, many companies are unable to fully utilize the hardware that is already in their datacenters. In fact, Alibaba reported a median GPU utilization of less than 10% in 2020. This number implies vast over-provisioning and shows the benefits to be gained via GPU multi-tenancy. Just as multi-tenancy with traditional CPU architectures is facilitated by an OS, we believe that an OS can similarly solve this problem for GPUs.

In this thesis, we describe the design and implementation of the compute scheduler of AxOS, an OS for data-parallel accelerators. AxOS allows for transparency, high GPU utilization, performance isolation, and spatial stacking between multiple processes using the GPU. To achieve this, AxOS takes a novel threadblock-centric approach to GPU compute scheduling via the virtual streams abstraction, kernel chunking, and rightsizing. We evaluate AxOS on a number of deep learning workloads to show these benefits.

Thesis Committee:

Dimitrios Skarlatos (Chair)
Todd Mowry
