Tuesday, August 8, 2017 - 2:00pm to 3:30pm
Location:McWilliams Classroom 4303 Gates Hillman Centers
Speaker:KAI REN, Ph.D. Student http://www.cs.cmu.edu/~kair/
In an era of big data, the rapid growth of data that many companies and organizations produce and manage continues to drive efforts to improve the scalability of storage systems. The number of objects presented in storage systems continue to grow, making metadata management critical to the overall performance of file systems. Many modern parallel applications are shifting toward shorter durations and larger degree of parallelism. Such trends continue to make storage systems to experience more diverse metadata intensive workloads. The goal of this dissertation is to improve metadata management in both local and distributed file systems.
The dissertation focuses on two aspects. One is to improve the out-of-core representation of file system metadata, by exploring the use of log-structured multi-level approaches to provide a unified and efficient representation for different types of secondary storage devices (e.g., traditional hard disk and solid state disk). We have designed and implemented TableFS and its improved version SlimFS, which shows 50% to 10x faster than traditional Linux file systems. The other aspect is to demonstrate that such representation also can be flexibly integrated with many namespace distribution mechanisms to scale metadata performance of distribution file systems, and provide better support for a variety of big data applications in data center environment. Our distributed metadata middleware IndexFS can help improve metadata performance for PVFS, Lustre and HDFS by scaling to as many as 128 metadata servers.
Garth Gibson (Chair)
Gregory R. Ganger
David G. Andersen
Brent B. Welch (Google, Inc.)