
Publications by category in reverse chronological order. Generated by jekyll-scholar.


  1. Preprint
    λScale: Enabling Fast Scaling for Serverless Large Language Model Inference
    Minchen Yu, Rui Yang, Chaobo Jia, Zhaoyuan Su, Sheng Yao, Tingfeng Lan, Yuchen Yang,  Yue Cheng, Wei Wang, Ao Wang,  and Ruichuan Chen
    In 2025
  2. Preprint
    Ensuring Fair LLM Serving Amid Diverse Applications
    Redwan Ibne Seraj Khan, Kunal Jain, Haiying Shen, Ankur Mallick, Anjaly Parayil, Anoop Kulkarni, Steve Kofsky, Pankhuri Choudhary, Renée St. Amant, Rujia Wang,  Yue Cheng, Ali R. Butt, Victor Rühle, Chetan Bansal,  and Saravan Rajmohan
    In 2025
  3. ASPLOS’25
    Concurrency-Informed Orchestration for Serverless Functions
    Qichang Liu,  Yue Cheng, Haiying Shen, Ao Wang,  and Bharathan Balaji
    In 30th ACM International Conference on Architectural Support for Programming Languages and Operating Systems 2025
  4. WWW’25
    Centralization in Decentralized Web: Challenges and Opportunities in IPFS’ Data Management
    Ruizhe Shi, Ruizhi Cheng, Yuqi Fu, Bo Han,  Yue Cheng,  and Songqing Chen
    In The 2025 ACM Web Conference 2025
  5. SDM’25
    Staleness-Alleviated Distributed GNN Training via Online Dynamic-Embedding Prediction
    Guangji Bai, Ziyang Yu, Zheng Chai,  Yue Cheng,  and Liang Zhao
    In Proceedings of SIAM International Conference on Data Mining 2025


  1. SoCC’24
    FedCaSe: Enhancing Federated Learning with Heterogeneity-aware Caching and Scheduling
    Redwan Ibne Seraj Khan, Arnab K. Paul,  Yue Cheng, Xun Jian,  and Ali R. Butt
    In Proceedings of the ACM Symposium on Cloud Computing 2024
  2. VLDB’24
    Everything You Always Wanted to Know About Storage Compressibility of Pre-Trained ML Models but Were Afraid to Ask
    Zhaoyuan Su, Ammar Ahmed, Zirui Wang, Ali Anwar,  and Yue Cheng
    In 50th International Conference on Very Large Data Bases 2024
  3. VLDB’24
    Algorithmic Complexity Attacks on Dynamic Learned Indexes
    Rui Yang, Evgenios M. Kornaropoulos,  and Yue Cheng
    In 50th International Conference on Very Large Data Bases 2024
  4. USENIX ATC’24
    ALPS: An Adaptive Learning, Priority OS Scheduler for Serverless Functions
    Yuqi Fu, Ruizhe Shi, Haoliang Wang, Songqing Chen,  and Yue Cheng
    In 2024 USENIX Annual Technical Conference (USENIX ATC 24) 2024
  5. SIGMETRICS’24
    A Closer Look into IPFS: Accessibility, Content, and Performance
    Ruizhe Shi, Ruizhi Cheng, Bo Han,  Yue Cheng,  and Songqing Chen
    In ACM SIGMETRICS / IFIP Performance 2024
  6. arXiv
    Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models
    Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang,  Yue Cheng,  and Liang Zhao
    In 2024


  1. Dissertation
    Towards Elastic and Cost-effective Stateful Serverless Systems
    Jingyuan Zhang
    PhD Dissertation 2023
  2. BigData’23
    Towards Cost-effective and Resource-aware Aggregation at Edge for Federated Learning
    Ahmad Khan, Yuze Li, Xinran Wang, Sabaat Haroon, Haider Ali,  Yue Cheng, Ali R. Butt,  and Ali Anwar
    In 2023 IEEE International Conference on Big Data 2023
  3. ASPLOS’23
    λFS: A Scalable and Elastic Distributed File System Metadata Service using Serverless Functions
    Benjamin Carver, Runzhou Han, Jingyuan Zhang, Mai Zheng,  and Yue Cheng
    In 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems 2023
  4. VLDB’23
    InfiniStore: Elastic Serverless Cloud Storage
    Jingyuan Zhang, Ao Wang, Xiaolong Ma, Benjamin Carver, Nicholas John Newman, Ali Anwar, Lukas Rupprecht, Dimitrios Skourtis, Vasily Tarasov, Feng Yan,  and Yue Cheng
    In 49th International Conference on Very Large Data Bases 2023
  5. FAST’23
    SHADE: Enable Fundamental Cacheability for Distributed Deep Learning Training
    Redwan Ibne Seraj Khan, Ahmad Hossein Yazdani, Yuqi Fu, Arnab K. Paul, Bo Ji, Xun Jian,  Yue Cheng,  and Ali R. Butt
    In 21st USENIX Conference on File and Storage Technologies (FAST 23) 2023


  1. DRBSD-8 ’22
    Understanding Impact of Lossy Compression on Derivative-related Metrics in Scientific Datasets
    Zhaoyuan Su, Sheng Di, Ali Murat Gok,  Yue Cheng,  and Franck Cappello
    In Proceedings of the 8th International Workshop on Data Analysis and Reduction for Big Scientific Data 2022
  2. Preprint
    InfiniStore: Elastic Serverless Cloud Storage
    Jingyuan Zhang, Ao Wang, Xiaolong Ma, Benjamin Carver, Nicholas John Newman, Ali Anwar, Lukas Rupprecht, Dimitrios Skourtis, Vasily Tarasov, Feng Yan,  and Yue Cheng
    In Preprint 2022
  3. SC’22
    SFS: Smart OS Scheduling for Serverless Functions
    Yuqi Fu, Li Liu, Haoliang Wang,  Yue Cheng,  and Songqing Chen
    In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis 2022


  1. SoCC’21
    Mind the Gap: Broken Promises of CPU Reservations in Containerized Multi-Tenant Clouds
    Li Liu, Haoliang Wang, An Wang, Mengbai Xiao,  Yue Cheng,  and Songqing Chen
    In Proceedings of the ACM Symposium on Cloud Computing 2021
  2. SC’21
    FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers
    Zheng Chai, Yujing Chen, Ali Anwar, Liang Zhao,  Yue Cheng,  and Huzefa Rangwala
    In Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis 2021
  3. USENIX ATC’21
    FaaSNet: Scalable and Fast Provisioning of Custom Serverless Container Runtimes at Alibaba Cloud Function Compute
    Ao Wang, Shuai Chang, Huangshi Tian, Hongqi Wang, Haoran Yang, Huiba Li, Rui Du,  and Yue Cheng
    In 2021 USENIX Annual Technical Conference (USENIX ATC 21) 2021
  4. OPT’21
    Community-based Layerwise Distributed Training of Graph Convolutional Networks
    Hongyi Li, Junxiang Wang, Yongchao Wang,  Yue Cheng,  and Liang Zhao
    In The 13th International OPT Workshop on Optimization for Machine Learning (OPT’21) 2021
  5. Thesis
    Wukong: A Fast, Cost-Effective, and Easy-to-Use Serverless DAG Engine
    Benjamin Carver
    Master’s Thesis 2021


  1. ICDM’20
    Toward Model Parallelism for Deep Neural Network Based on Gradient-Free ADMM Framework
    Junxiang Wang, Zheng Chai,  Yue Cheng,  and Liang Zhao
    In 2020 IEEE International Conference on Data Mining (ICDM) 2020
  2. SoCC’20
    Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing
    Benjamin Carver, Jingyuan Zhang, Ao Wang, Ali Anwar, Panruo Wu,  and Yue Cheng
    In Proceedings of the 11th ACM Symposium on Cloud Computing 2020
  3. OPTML-ICML’20
    Tunable Subnetwork Splitting for Model-parallelism of Neural Network Training
    Junxiang Wang, Zheng Chai,  Yue Cheng,  and Liang Zhao
    In Beyond First Order Methods in ML Systems 2020
  4. HPDC’20
    TiFL: A Tier-Based Federated Learning System
    Zheng Chai, Ahsan Ali, Syed Zawad, Stacey Truex, Ali Anwar, Nathalie Baracaldo, Yi Zhou, Heiko Ludwig, Feng Yan,  and Yue Cheng
    In Proceedings of the 29th International Symposium on High-Performance Parallel and Distributed Computing 2020
  5. FAST’20
    InfiniCache: Exploiting Ephemeral Serverless Functions to Build a Cost-Effective Memory Cache
    Ao Wang, Jingyuan Zhang, Xiaolong Ma, Ali Anwar, Lukas Rupprecht, Dimitrios Skourtis, Vasily Tarasov, Feng Yan,  and Yue Cheng
    In 18th USENIX Conference on File and Storage Technologies (FAST 20) 2020


  1. TPDS
    MOANA: Modeling and Analyzing I/O Variability in Parallel System Experimental Design
    Kirk W. Cameron, Ali Anwar,  Yue Cheng, Li Xu, Bo Li, Uday Ananth, Jon Bernard, Chandler Jearls, Thomas Lux, Yili Hong, Layne T. Watson,  and Ali R. Butt
    IEEE Transactions on Parallel and Distributed Systems 2019
  2. NSDI’19
    HyperFaaS: A Truly Elastic Serverless Computing Framework
    Jingyuan Zhang, Ao Wang, Min Li, Yuan Chen,  and Yue Cheng
    In USENIX Symposium on Networked Systems Design and Implementation 2019
  3. PDSW’19
    In Search of a Fast and Efficient Serverless DAG Engine
    Benjamin Carver, Jingyuan Zhang, Ao Wang,  and Yue Cheng
    In 2019 IEEE/ACM Fourth International Parallel Data Systems Workshop (PDSW) 2019
  4. Cloud’19
    Bolt: Towards a Scalable Docker Registry via Hyperconvergence
    Michael Littley, Ali Anwar, Hannan Fayyaz, Zeshan Fayyaz, Vasily Tarasov, Lukas Rupprecht, Dimitrios Skourtis, Mohamed Mohamed, Heiko Ludwig,  Yue Cheng,  and Ali R. Butt
    In 2019 IEEE 12th International Conference on Cloud Computing (CLOUD) 2019
  5. USENIX OpML’19
    Towards Taming the Resource and Data Heterogeneity in Federated Learning
    Zheng Chai, Hannan Fayyaz, Zeshan Fayyaz, Ali Anwar, Yi Zhou, Nathalie Baracaldo, Heiko Ludwig,  and Yue Cheng
    In 2019 USENIX Conference on Operational Machine Learning (OpML 19) 2019
  6. VEE’19
    VCPU as a Container: Towards Accurate CPU Allocation for VMs
    Li Liu, Haoliang Wang, An Wang, Mengbai Xiao,  Yue Cheng,  and Songqing Chen
    In Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments 2019


  1. Book Chapter
    SDN helps Big Data to optimize Storage
    Ali R. Butt, Ali Anwar,  and Yue Cheng
    Book Chapter, Big Data and Software Defined Networks, Editor: Javid Taheri. IET, ISBN 978-1-78561-304-3. 2018
  2. BigData’18
    Analyzing Alibaba’s Co-located Datacenter Workloads
    Yue Cheng, Ali Anwar,  and Xuejing Duan
    In 2018 IEEE International Conference on Big Data (Big Data) 2018
  3. SC’18
    BESPOKV: Application Tailored Scale-Out Key-Value Stores
    Ali Anwar,  Yue Cheng, Hai Huang, Jingoo Han, Hyogi Sim, Dongyoon Lee, Fred Douglis,  and Ali R. Butt
    In SC18: International Conference for High Performance Computing, Networking, Storage and Analysis 2018
  4. APSys’18
    Characterizing Co-Located Datacenter Workloads: An Alibaba Case Study
    Yue Cheng, Zheng Chai,  and Ali Anwar
    In Proceedings of the 9th Asia-Pacific Workshop on Systems 2018
  5. FAST’18
    Improving Docker Registry Design Based on Production Workload Analysis
    Ali Anwar, Mohamed Mohamed, Vasily Tarasov, Michael Littley, Lukas Rupprecht,  Yue Cheng, Nannan Zhao, Dimitrios Skourtis, Amit S. Warke, Heiko Ludwig, Dean Hildebrand,  and Ali R. Butt
    In 16th USENIX Conference on File and Storage Technologies (FAST 18) 2018
  6. IPDPS’18
    Chameleon: An Adaptive Wear Balancer for Flash Clusters
    Nannan Zhao, Ali Anwar,  Yue Cheng, Mohammed Salman, Daping Li, Jiguang Wan, Changsheng Xie, Xubin He, Feiyi Wang,  and Ali Butt
    In 2018 IEEE International Parallel and Distributed Processing Symposium (IPDPS) 2018


  1. Doctoral Dissertation
    Workload-aware Efficient Storage Systems
    Yue Cheng
    PhD Dissertation 2017


  1. Internet Computing
    Provider versus Tenant Pricing Games for Hybrid Object Stores in the Cloud
    Yue Cheng, M. Safdar Iqbal, Aayush Gupta,  and Ali R. Butt
    IEEE Internet Computing 2016
  2. USENIX ATC’16
    Erasing Belady’s Limitations: In Search of Flash Cache Offline Optimality
    Yue Cheng, Fred Douglis, Philip Shilane, Grant Wallace, Peter Desnoyers,  and Kai Li
    In 2016 USENIX Annual Technical Conference (USENIX ATC 16) 2016
  3. HotStorage’16
    ClusterOn: Building Highly Configurable and Reusable Clustered Data Services Using Simple Data Nodes
    Ali Anwar,  Yue Cheng, Hai Huang,  and Ali R. Butt
    In 8th USENIX Workshop on Hot Topics in Storage and File Systems (HotStorage 16) 2016
  4. HPDC’16
    MOS: Workload-Aware Elasticity for Cloud Object Stores
    Ali Anwar,  Yue Cheng, Aayush Gupta,  and Ali R. Butt
    In Proceedings of the 25th ACM International Symposium on High-Performance Parallel and Distributed Computing 2016
  5. VarSys’16
    Towards Managing Variability in the Cloud
    Ali Anwar,  Yue Cheng,  and Ali R. Butt
    In 2016 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW) 2016


  1. PDSW’15
    Taming the Cloud Object Storage with MOS
    Ali Anwar,  Yue Cheng, Aayush Gupta,  and Ali R. Butt
    In Proceedings of the 10th Parallel Data Storage Workshop 2015
  2. HotCloud’15
    Pricing Games for Hybrid Object Stores in the Cloud: Provider vs. Tenant
    Yue Cheng, M. Safdar Iqbal, Aayush Gupta,  and Ali R. Butt
    In 7th USENIX Workshop on Hot Topics in Cloud Computing (HotCloud 15) 2015
  3. HPDC’15
    CAST: Tiering Storage for Data Analytics in the Cloud
    Yue Cheng, M. Safdar Iqbal, Aayush Gupta,  and Ali R. Butt
    In Proceedings of the 24th International Symposium on High-Performance Parallel and Distributed Computing 2015
  4. EuroSys’15
    An In-Memory Object Caching Framework with Adaptive Load Balancing
    Yue Cheng, Aayush Gupta,  and Ali R. Butt
    In Proceedings of the Tenth European Conference on Computer Systems 2015


  1. SoCC’13
    High Performance In-Memory Caching through Flexible Fine-Grained Services
    Yue Cheng, Aayush Gupta, Anna Povzner,  and Ali R. Butt
    In Proceedings of the 4th Annual Symposium on Cloud Computing 2013