Cs6501_ai_sys_f26
📢 New Course Alert: I’ll be teaching CS6501 AI Systems in Fall 2026! Do you want to understand why LLM inference can be nondeterministic, what scaling laws predict, how FlashAttention and vLLM work, how the Roofline model reveals whether an AI workload is compute-bound or memory-bound, and how to choose checkpoint intervals that balance fault tolerance, performance, and cost? Do you also want to design, build, and evaluate a practical AI system that addresses a real-world problem? If any of these questions spark your curiosity, you should take this course!