Talk Title: Exploration of Alternative Strategies for Burst Buffer Systems
Weikuan Yu is a Professor in the Department of Computer Science at Florida State University. He earned both his PhD in Computer Science and his master's degree in Neurobiology from the Ohio State University. He also holds a bachelor's degree in Genetics from Wuhan University, China. Yu’s main research interests include big data management and analytics frameworks, parallel I/O and storage, GPU memory architecture, and high-performance networking. He has published more than 90 papers, many of which appeared in top conferences and journals such as IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on Computers, Supercomputing, SIGMETRICS, PACT, ICS, and IPDPS. Yu’s research has been funded by more than $7M in grants (more than $5M as PI) from the U.S. National Science Foundation, NASA, the Department of Energy, NVIDIA, Mellanox, and SolarFlare. Upon graduation, many of Dr. Yu’s graduate students have joined prestigious organizations such as Boeing, Amazon, IBM, Intel, Yahoo, and the Berkeley and Argonne National Labs. Yu currently serves as an Associate Editor for the IEEE Transactions on Parallel and Distributed Systems. He is a senior member of IEEE and a member of ACM, USENIX, and AAAS.
The growth of computing power on large-scale systems requires commensurate high-bandwidth I/O systems. Many parallel file systems are designed to provide scalable I/O in response to applications’ soaring requirements. In addition, the sheer number of computing components on leadership systems leads to ever-escalating failure rates. Checkpointing, a common defensive mechanism, also demands an increasing share of I/O bandwidth on HPC systems. Novel I/O systems are therefore imperative to meet the various needs of data-intensive applications on HPC systems. This talk will present our recent exploration of two distinct strategies for designing burst buffer systems. First, I will describe a burst buffer framework called BurstMEM that extends the cutting-edge Memcached system to aggregate I/O bandwidth from remote shared SSDs and supports data-intensive applications through several techniques, including novel tree-based metadata indexing and coordinated shuffling for data management. Next, I will present a new file system called BurstFS that leverages locally attached SSDs as burst buffers and features techniques for scalable I/O, including scalable metadata indexing and co-located I/O delegation. Finally, I will conclude with a discussion of future directions for optimizing and hardening these burst buffer systems.