
CS Seminar: Dr. Sarp Oral, August 4, 2010, 13:40, FENS G035

Lessons Learned in Deploying the World’s Largest Scale Lustre File System

Dr. Sarp Oral
The Oak Ridge National Laboratory


Abstract: The Spider system at the Oak Ridge National Laboratory’s Leadership Computing Facility (OLCF) is the world’s largest-scale Lustre parallel file system. Envisioned as a shared parallel file system capable of meeting both the bandwidth and capacity requirements of the OLCF’s diverse computational environment, the project had a number of ambitious goals. To support the workloads of the OLCF’s computational platforms, Spider delivers 240 GB/s of aggregate performance and 10 petabytes of storage capacity, exceeding the OLCF’s previously deployed systems by factors of 6 and 17, respectively. Furthermore, Spider supports over 26,000 clients concurrently accessing the file system, nearly 4 times as many as the OLCF’s previously deployed systems. In addition to these scalability challenges, moving to a center-wide shared file system required dramatically improved resiliency and fault-tolerance mechanisms. Through a phased approach of research and development, prototyping, deployment, and transition to operations, this work has resulted in a number of insights into large-scale parallel file system architectures. This presentation details our efforts in designing, deploying, and operating Spider, with a particular focus on reducing parallel file system journaling overheads.
Journaling is a widely used technique to increase file system robustness against metadata and data corruption. While the overhead of journaling can be masked by the page cache for small-scale, local file systems, we found that Lustre's use of journaling for the object store significantly impacted the overall performance of our large-scale, center-wide parallel file system. By requiring that each write request wait for a journal transaction to commit, Lustre introduced serialization into the client request stream and imposed additional latency due to disk head movement (seeks) for each request.
This work provides a head-to-head comparison of two significantly different approaches to increasing the overall efficiency of the Lustre file system. First, a hardware solution is presented that uses external journaling devices to eliminate the latencies incurred by the extra disk head seeks due to journaling. Second, a software-based optimization is introduced that removes the synchronous commit for each write request, side-stepping the additional latency and amortizing the journal seeks across a much larger number of requests. Both solutions have been implemented and experimentally tested on the Spider storage system. Tests show that both methods considerably improve write performance, by up to 93% in some cases. Testing with a real-world scientific application showed a 37% decrease in the number of journal updates, each with an associated seek, which translated into an average I/O bandwidth improvement of 56.3%.
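The amortization idea behind the software approach can be illustrated with a toy cost model. This is only a sketch under stated assumptions, not Lustre's implementation and not a reproduction of the measurements above: the function names, the one-seek-per-commit cost, and all timing values are hypothetical.

```python
def journal_commits(num_writes: int, batch_size: int) -> int:
    """Journal commits needed when one commit covers up to batch_size writes.

    batch_size == 1 models a synchronous commit for every write request.
    """
    if num_writes < 0 or batch_size < 1:
        raise ValueError("num_writes must be >= 0 and batch_size >= 1")
    return -(-num_writes // batch_size)  # ceiling division


def total_time_ms(num_writes: int, batch_size: int,
                  write_ms: float, seek_ms: float) -> float:
    """Toy model: every write costs write_ms, every commit costs one seek."""
    commits = journal_commits(num_writes, batch_size)
    return num_writes * write_ms + commits * seek_ms


# Hypothetical numbers: 1000 writes, 1 ms per write, 8 ms per journal seek.
sync_ms = total_time_ms(1000, batch_size=1, write_ms=1.0, seek_ms=8.0)
batched_ms = total_time_ms(1000, batch_size=50, write_ms=1.0, seek_ms=8.0)
print(sync_ms, batched_ms)  # 9000.0 1160.0
```

In this model the seek cost dominates when every write waits for its own commit, and shrinks toward the pure write cost as more requests share a commit.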
This presentation also covers our solutions to issues such as network congestion, performance baselining and evaluation, and high availability in a system with tens of thousands of components. Areas of continued challenge, such as stressed metadata performance and the need for file system quality of service, will also be discussed along with efforts to address them. Finally, operational aspects of managing a system of this scale are discussed, and real-world data and observations are presented.
Short Bio: Dr. Sarp Oral is a Research Staff member at the Oak Ridge National Laboratory (ORNL). He is a member of the Technology Integration Group at the National Center for Computational Sciences (NCCS) at ORNL. His research and professional interests focus on computer benchmarking, parallel I/O and file systems, high-performance computing and networking, system and storage area networks, computer architecture, fault tolerance, storage technologies, and networked storage.
Prior to joining NCCS, Dr. Oral served as a Systems Analyst at NCI Inc. and as the Associate Laboratory Director and a Research Associate at the High-performance Computing and Simulation (HCS) Laboratory at the University of Florida (UF). Dr. Oral received his Ph.D. in Computer Engineering from the University of Florida in 2003.

