Course Schedule
The course schedule is tentative and subject to change; dates further out are less concrete.
Introduction
Function-as-a-Service platforms & workloads
Week 3
- 09/08
  - Peeking Behind the Curtains of Serverless Platforms
- 09/10
  - Firecracker: Lightweight Virtualization for Serverless Applications
Cold starts
Week 4
- 09/17
  - Catalyzer: Sub-millisecond Startup for Serverless Computing with Initialization-less Booting
  - Benchmarking, Analysis, and Optimization of Serverless Function Snapshots
- Project Proposal due
Stateful serverless computing
Week 5
Serverless parallel computing & programming
Week 6
Week 7
- 10/06
  - Occupy the Cloud: Distributed Computing for the 99%
  - From Laptop to Lambda: Outsourcing Everyday Jobs to Thousands of Transient Functional Containers
- 10/08
  - Shuffling, Fast and Slow: Scalable Analytics on Serverless Infrastructure
  - Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing
Serverless applications
Week 8
- 10/13
  - Reading day (no class)
- 10/15
  - Encoding, Fast and Slow: Low-Latency Video Processing Using Thousands of Tiny Threads
Serverless storage
Week 9
- 10/22
  - Pocket: Elastic Ephemeral Storage for Serverless Analytics
  - InfiniCache: Exploiting Ephemeral Serverless Functions to Build a Cost-Effective Memory Cache
  - Boki: Stateful Serverless Computing with Shared Logs (optional)
  - Splinter: Bare-Metal Extensions for Multi-Tenant Low-Latency Storage (optional)
- Project Checkpoint due
LLM serving
Week 10
- 10/27
  - Efficient Memory Management for Large Language Model Serving with PagedAttention
  - InfiniGen: Efficient Generative Inference of Large Language Models with Dynamic KV Cache Management
  - The Ultra-Scale Playbook: Training LLMs on GPU Clusters (optional but highly recommended)
- 10/29
  - Orca: A Distributed Serving System for Transformer-Based Generative Models
  - DistServe: Disaggregating Prefill and Decoding for Goodput-optimized Large Language Model Serving
Week 11
- 11/03
  - Taming Throughput-Latency Tradeoff in LLM Inference with Sarathi-Serve
  - AlpaServe: Statistical Multiplexing with Model Parallelism for Deep Learning Serving
- 11/05
  - SpotServe: Serving Generative Large Language Models on Preemptible Instances
  - Serving DNNs like Clockwork: Performance Predictability from the Bottom Up (optional)
Serverless AI
Week 12
- 11/10
  - ServerlessLLM: Low-Latency Serverless Inference for Large Language Models
- 11/12
  - Towards Swift Serverless LLM Cold Starts with ParaServe
Week 13
- 11/17
  - Parrot: Efficient Serving of LLM-based Applications with Semantic Variable
  - S-LoRA: Serving Thousands of Concurrent LoRA Adapters
  - Punica: Multi-Tenant LoRA Serving (optional)
- 11/19
  - BlitzScale: Fast and Live Large Model Autoscaling with O(1) Host Caching
  - PhoenixOS: Concurrent OS-level GPU Checkpoint and Restore with Validated Speculation
Week 14
- 11/24
  - Hack day (no class)
- 11/26
  - Thanksgiving recess (no class)
Wrapping up
Week 15
- 12/01
  - Project presentation I
- 12/03
  - Project presentation II
Week 16
- 12/08
  - Project presentation III
- 12/10
  - All project deliverables due