
Lecture 15: AWS AI

Learning objectives:

In this lecture, you will:

  • get exposed to what it takes to excel in the role of Applied Scientist at AWS
  • learn how checkpointing helps provide fault tolerance for large-scale LLM training
  • see how replication helps prevent in-memory data loss

Lecture slides


© 2024 Yue Cheng. Released under the CC BY-SA license