Doctoral Thesis Proposal - Kaiyang Zhao

Time:
11:30am

Location:
In Person - Traffic21 Classroom, Gates Hillman 6501

Speaker:
KAIYANG ZHAO, Ph.D. Student
Computer Science Department
Carnegie Mellon University

https://www.cs.cmu.edu/~kaiyang2/

Architecting Memory Efficiency in Modern Datacenters

The proliferation of memory-intensive applications, the rapid expansion of memory capacity to terabyte scales, and the slowing of DRAM cost scaling have established memory as the critical bottleneck in modern datacenter computing. This bottleneck manifests in two dimensions: the cycle efficiency lost to the virtual memory abstraction and the escalating financial cost of memory.

First, the virtual memory abstraction is under increasing strain. As memory capacity grows while Translation Lookaside Buffer (TLB) sizes remain stagnant, address translation overhead becomes severe; internal profiling at hyperscalers reveals that approximately 20% of CPU cycles are stalled on TLB misses. This overhead is bound to worsen due to inherent TLB scaling limits, the introduction of additional page table levels, vast heterogeneous memory capacity, and the page-level security checks required by confidential computing. Second, the financial cost of memory has skyrocketed. Memory now accounts for nearly a quarter of rack power consumption and half of the Total Cost of Ownership of a typical datacenter server. In this proposal, I address these challenges through a set of operating system and architectural designs.

To improve cycle efficiency, I present two completed works. Contiguitas creates abundant physical memory contiguity by grouping unmovable allocations in the OS and introducing hardware extensions to migrate pages previously locked for device I/O. This contiguity is leveraged to allocate huge pages, reducing translation overhead and yielding up to an 18% performance improvement for production workloads. Learned Virtual Memory (LVM) replaces rigid radix page tables with learned indexes tailored to the application's virtual address space. By leveraging address space regularity, LVM reduces page walk overhead by an average of 44% and achieves a 2–27% speedup in application execution.

To improve cost efficiency, I present two ongoing works. Multi-Tier dynamically manages pages across DRAM, CXL memory, and SSDs, maximizing financial savings by utilizing cheaper tiers within a defined performance-loss budget. Equilibria addresses the challenges of multi-tenant tiering, ensuring fair memory sharing and mitigating noisy-neighbor interference through flexible placement policies and thrashing mitigation.

Together, these works provide a comprehensive approach to improving memory efficiency in datacenters from the perspectives of both cycle overhead and financial cost.

Thesis Committee
Dimitrios Skarlatos (Chair)
Phil Gibbons
Todd Mowry 
Kim Keeton (Google) 

For More Information:
matthewstewart@cmu.edu

