Yue Cheng
Associate Professor at the University of Virginia
mrz7dp _AT_ virginia.edu
SDS, CS @ UVA
Data Systems Researcher
I am an Associate Professor of Data Science and Computer Science at the University of Virginia. My research covers a range of topics including distributed systems, serverless and cloud computing, storage systems, operating systems, and high-performance computing. My current research focuses on designing scalable, high-performance, and easy-to-use computer systems that manage and process huge volumes of data.
Currently I am working on: (1) Serverless and FaaS: improving serverless computing using an end-to-end approach that cuts across the entire ecosystem stack: (stateful) applications, middleware, platforms, and low-level OS/HW; (2) Sys4ML: building better (computing and storage) systems for large-scale ML applications; (3) ML4Sys: improving systems software and infrastructure management using learned/learning/data-driven approaches; and (4) Data Reduction: rethinking data reduction techniques for large data-intensive applications.
I am the recipient of an NSF CAREER Award (2021), an Amazon Research Award (2021), a Meta Research Award (2022), the IEEE CS TCHPC Early Career Researchers Award for Excellence in HPC (2022), and a Samsung GRO 2023 Award (2023). Prior to joining UVA, I was an Assistant Professor of Computer Science at George Mason University from 2017 to 2022. I received my Ph.D. in Computer Science from Virginia Tech, working with Dr. Ali R. Butt. During my Ph.D., I spent two summers at IBM Research Almaden (2013 and 2014) and six months at the Dell EMC Princeton office (2015).
selected projects
Most of my projects are open-source and available on our group’s GitHub page.
- Serverless Cloud Storage: Storing large and small objects on a dynamic fleet of serverless functions at only about 3% of ElastiCache’s cost, without sacrificing performance or availability. A minimal usage sketch follows the links below.
  [ASPLOS’23]: [GitHub] – [VLDB’23]: [GitHub] – [FAST’20]: [GitHub]
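  A minimal sketch of the idea, assuming boto3 and a hypothetical deployed Lambda function `cache-node-0` that keeps a key-value dict in memory across warm invocations; the function name and JSON protocol are illustrative, not the projects' real API (the real systems also erasure-code objects across many functions for availability):

  ```python
  # Toy cache client: objects live in the memory of "warm" AWS Lambda
  # functions. Hypothetical function name and protocol, for illustration.
  import base64
  import json

  import boto3

  lam = boto3.client("lambda")

  def cache_put(key: str, value: bytes, fn: str = "cache-node-0") -> None:
      payload = {"op": "put", "key": key,
                 "value": base64.b64encode(value).decode()}
      lam.invoke(FunctionName=fn, Payload=json.dumps(payload).encode())

  def cache_get(key: str, fn: str = "cache-node-0") -> bytes | None:
      resp = lam.invoke(FunctionName=fn,
                        Payload=json.dumps({"op": "get", "key": key}).encode())
      body = json.loads(resp["Payload"].read())
      return base64.b64decode(body["value"]) if body.get("hit") else None
  ```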
- Serverless Parallel Computing: Scaling out Python parallel programs (e.g., Dask applications) on FaaS without tedious cluster management. Wukong decentralizes resource orchestration to each individual serverless function, enabling high elasticity and high scalability; see the sketch after the links below.
  [SoCC’20] [PDSW’19]: [GitHub]
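  To give a flavor of the programming model: user code stays plain Dask, and a Wukong-style executor runs each task in a Lambda, letting a finishing task trigger its downstream tasks directly (local `compute()` shown here for simplicity; this sketch does not spell out Wukong's actual entry point):

  ```python
  # Plain Dask code of the kind Wukong scales out on AWS Lambda. Under a
  # Wukong-style executor, each completed task invokes its downstream tasks
  # from within Lambda, keeping any central scheduler off the critical path.
  import dask.array as da

  x = da.random.random((4_000, 4_000), chunks=(1_000, 1_000))
  y = (x @ x.T).mean(axis=0)   # builds a task graph lazily; nothing runs yet

  result = y.compute()         # with Wukong, tasks fan out as Lambda invocations
  print(result.shape)          # (4000,)
  ```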
- FaaS Platform Management: A highly scalable container provisioning framework that can provision thousands of 10+ GB serverless function containers in just a few seconds. FaaSNet is currently deployed at Alibaba Function Compute; a sketch of its tree-based idea follows the links below.
  [ATC’21]: [GitHub] [Alibaba Cloud Blog]
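  The core idea, per the ATC’21 paper, is to organize the VMs that need the same function image into balanced trees, so each node streams image data from its parent rather than from a central registry. A simplified sketch with hypothetical names:

  ```python
  # Simplified FaaSNet-style "function tree": workers needing the same
  # container image form a balanced d-ary tree and pull image chunks from
  # their parent, keeping registry load flat as the fleet grows.
  from dataclasses import dataclass, field

  @dataclass
  class Worker:
      addr: str
      children: list["Worker"] = field(default_factory=list)

  def build_tree(workers: list[Worker], degree: int = 2) -> Worker:
      """Link workers into a balanced d-ary tree; the root pulls from the registry."""
      for i in range(1, len(workers)):
          workers[(i - 1) // degree].children.append(workers[i])
      return workers[0]

  root = build_tree([Worker(f"vm-{i}") for i in range(7)])
  print([c.addr for c in root.children])  # ['vm-1', 'vm-2']
  ```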
- Serverless Function OS Scheduling: Linux CFS is not ideal for short-lived serverless function workloads. This project rethinks OS scheduling to minimize function turnaround time; a toy illustration of the intuition follows the links below.
  [SC’22]: [GitHub] – [ATC’24]: [GitHub]
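  A toy back-of-the-envelope illustration (my own, not the papers' code) of why ordering matters when most functions are short-lived:

  ```python
  # With 99 short functions and one long one, running short jobs first cuts
  # mean turnaround time by roughly 10x versus letting the long job run
  # first -- the intuition behind turnaround-oriented function scheduling.
  def mean_turnaround(durations):
      """Mean completion time when jobs run back-to-back in the given order."""
      now, total = 0.0, 0.0
      for d in durations:
          now += d
          total += now
      return total / len(durations)

  jobs = [0.01] * 99 + [5.0]                    # 99 short tasks, 1 long task
  print(mean_turnaround(jobs))                  # ~0.55s (short jobs first)
  print(mean_turnaround(list(reversed(jobs))))  # ~5.50s (long job first)
  ```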
- Storage for Deep Learning: A common practice in deep learning training is to randomly shuffle all training samples epoch by epoch. SHADE caches the most important training samples without hurting training quality; a sketch of the idea follows the links below.
  [FAST’23]: [GitHub] – [SoCC’24]: [code]
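  A minimal sketch of the caching policy's spirit, using a sample's latest training loss as its importance score (simplified; not the released implementation):

  ```python
  # Importance-aware sample cache: samples with higher recent loss are
  # treated as more important and are kept in cache preferentially.
  class ImportanceCache:
      def __init__(self, capacity: int):
          self.capacity = capacity
          self.data = {}    # sample_id -> sample payload
          self.score = {}   # sample_id -> latest per-sample loss

      def update(self, sample_id, payload, loss: float):
          """Record a sample's new loss and (re)admit it under the policy."""
          self.score[sample_id] = loss
          if sample_id in self.data or len(self.data) < self.capacity:
              self.data[sample_id] = payload
              return
          # Cache full: evict the least important resident sample only if
          # the incoming sample matters more.
          victim = min(self.data, key=self.score.__getitem__)
          if loss > self.score[victim]:
              del self.data[victim]
              self.data[sample_id] = payload

      def get(self, sample_id):
          return self.data.get(sample_id)  # miss -> read from backing store
  ```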
news
Sep 2024 | Congrats to Redwan on FedCaSe, our work on federated learning I/O caching and scheduling, accepted to SoCC 2024! |
Sep 2024 | 👋 A warm welcome to our newest members: Zirui Wang and Tingfeng Lan! |
Sep 2024 | Thrilled to receive an NSF CSSI Elements grant on developing a sustainable and GPU-efficient cyberinfrastructure for Notebooks (w/ Co-PI Geoffrey Fox). Thanks, NSF! |
Jul 2024 | Excited to receive an NSF REU Site grant (lead PI: Claudia Scholz). Thanks, NSF! |
Jun 2024 | Congrats to Yuqi and Ruizhe on ALPS accepted to USENIX ATC 2024! ALPS learns workload intelligence from the user space to inform serverless function scheduling in the kernel space. |
May 2024 | This summer Yuqi will be doing a student researcher internship at Google and Zhaoyuan will be doing a research internship at Samsung. Congrats! |
Apr 2024 | Excited to receive an NSF OAC Core grant on building a distributed graph learning cyberinfrastructure for large spatiotemporal prediction (w/ Liang Zhao from Emory). Thanks, NSF! |
Mar 2024 | Congrats to Ruizhe on the IPFS analysis work accepted to SIGMETRICS 2024! We answered questions about accessibility, content, and performance of IPFS in this research. |
Mar 2024 | Congrats to Zhaoyuan and Zirui on their work accepted to VLDB 2024! In this work, Zhaoyuan analyzed a large dataset of real-world pre-trained ML models collected from Hugging Face. Based on the analysis study, he designed a new storage compression method for reducing the storage requirement of pre-trained models at scale. |
Feb 2024 | Congrats to Rui on his work accepted to VLDB 2024! In this work, Rui systematically studied the algorithmic complexity vulnerabilities of dynamic learned indexes. |
Jan 2024 | Check our latest survey on resource-efficient LLMs. |
Oct 2023 | Excited to receive a Samsung GRO 2023 Award on New Storage for Large ML Training (w/ Ali Anwar from UMN). Thanks, Samsung Advanced Institute of Technology and Samsung Memory Solutions Lab, for the generous support of our research! |
Oct 2023 | Serving as the general co-chair of ACM HotStorage’24. Consider submitting your exciting early ideas! |
Jun 2023 | 🎓 My first Ph.D. student Jingyuan Zhang successfully defended his Ph.D. dissertation. Congratulations, Dr. Zhang! Jingyuan will be joining the cloud-native infrastructure team @ ByteDance (San Jose, CA). |
Apr 2023 | Congrats to Ben, Runzhou, and Jingyuan on the acceptance of λFS to ASPLOS 2023! λFS marks yet another significant milestone in our serverless storage project series. Don’t forget to check out our projects: Episode I - InfiniCache, Episode II - InfiniStore, and our latest work, Episode III - λFS. |
Feb 2023 | Congrats to Jingyuan, Ben, and the team on the acceptance of InfiniStore to VLDB 2023! |
Dec 2022 | Congrats to Redwan, Ahmad, and Yuqi on their paper on deep learning I/O caching accepted to FAST 2023! |
Sep 2022 | I am honored to be selected for the 2022 IEEE CS TCHPC Early Career Researchers Award for Excellence in High Performance Computing. |
Sep 2022 | Congrats to Zhaoyuan on his paper accepted to DRBSD-8 co-located with SC 2022! |
Sep 2022 | Excited to receive a Meta Research Award for AI System Hardware/Software Codesign. Thanks, Meta Research! |
Aug 2022 | In Fall ‘22, I am joining the School of Data Science and the Department of Computer Science at the University of Virginia. |
Jul 2022 | SFS was nominated as a Best Student Paper Award finalist at SC 2022! Congrats to Yuqi! |
Jun 2022 | Congrats to Yuqi on his paper on serverless function scheduling accepted to SC 2022! |
May 2022 | This summer my students will intern at MSR (Ben Carver), ByteDance (Yuqi Fu, Jingyuan Zhang), and Argonne National Lab (Zhaoyuan Su)! Congrats! |
May 2022 | 🏆 Thrilled to receive an Outstanding Teaching Award from CS @ Mason! |
Aug 2021 | Congrats to Li and Haoliang on rKube accepted to SoCC 2021! |
Aug 2021 | A collaborative FMSG grant funded by NSF (with Jia Liu @ Auburn). Thanks, NSF! |
Jun 2021 | Congrats to Zheng on FedAT accepted to SC 2021! |
Apr 2021 | Congrats to Ao on FaaSNet accepted to USENIX ATC 2021! |
Mar 2021 | Honored to receive a gift from Adobe Research for our work on serverless computing! Thanks, Adobe! |
Feb 2021 | Thrilled to receive an NSF CAREER Award for my work on building serverless cloud storage infrastructure. Thanks, NSF! |
Oct 2020 | Excited to receive an Amazon Research Award with Liang Zhao from Emory! |
Aug 2020 | Congrats to Junxiang and Zheng on their paper getting accepted to IEEE ICDM 2020! |
Aug 2020 | Congrats to Ben, Jingyuan, and Ao on Wukong getting accepted to ACM SoCC 2020! Wukong is a super-fast serverless parallel computing framework built atop AWS Lambda. Wukong achieves up to 68X speedup over state-of-the-art serverless parallel processing frameworks. The Wukong project is online. We are happy to accept contributions! |
Jul 2020 | Two projects got funded by NSF. With the new MRI grant, we will build a new HPC infrastructure to support the growing computing needs of Mason users. With an OAC grant, we will build a new model-parallel deep learning training infrastructure. Thanks, NSF! |
Mar 2020 | Congrats to Zheng, Ahsan, and Syed on TiFL getting accepted to ACM HPDC 2020! |
Dec 2019 | Congrats to Ao, Jingyuan, and Xiaolong on InfiniCache getting accepted to USENIX FAST 2020! InfiniCache is a first-of-its-kind, cost-effective object cache built atop ephemeral cloud functions. InfiniCache is 31-96x cheaper than existing cloud cache services (e.g., AWS ElastiCache) while offering the same or better performance. Fork InfiniCache on GitHub. |
selected/recent publications
- SIGMETRICS’24: A Closer Look into IPFS: Accessibility, Content, and Performance. In ACM SIGMETRICS / IFIP Performance 2024.