
Lecture 4: Apache Spark and RDDs

Learning objectives:

In this lecture, you will learn:

  • Resilient Distributed Datasets (RDDs), Spark's in-memory cluster computing abstraction, and how they differ from other in-memory data structures.
  • the motivation, design, and architecture of Apache Spark.
  • how Spark is different from MapReduce.
  • the APIs of Spark and the workflows of basic Spark applications (e.g., log debugging, PageRank).
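
As a rough preview of the log debugging workflow named in the objectives, here is a minimal single-process sketch in plain Python. It is not Spark's API: the `ToyRDD` class, its method names, and the sample log lines are all hypothetical, invented only to illustrate two ideas commonly associated with RDDs. Transformations such as `filter` and `map` are lazy and only record how to derive a dataset, while actions such as `count` and `collect` trigger the computation by replaying that recorded lineage.

```python
# Toy stand-in for an RDD. This is an illustrative assumption, NOT Spark's API:
# it runs in one process, has no partitions, and does no caching.
class ToyRDD:
    def __init__(self, compute, lineage):
        self._compute = compute  # zero-argument function that (re)builds the data
        self.lineage = lineage   # names of the steps that produced this dataset

    @staticmethod
    def from_list(items):
        data = list(items)
        return ToyRDD(lambda: iter(data), ["from_list"])

    def filter(self, pred):
        # Transformation: nothing is computed here, only the recipe is recorded.
        return ToyRDD(
            lambda: (x for x in self._compute() if pred(x)),
            self.lineage + ["filter"],
        )

    def map(self, fn):
        # Transformation: also lazy.
        return ToyRDD(
            lambda: (fn(x) for x in self._compute()),
            self.lineage + ["map"],
        )

    def count(self):
        # Action: replays the lineage from the source and counts the results.
        return sum(1 for _ in self._compute())

    def collect(self):
        # Action: replays the lineage and returns the results as a list.
        return list(self._compute())


# Hypothetical tab-separated log lines: level, component, message.
logs = ToyRDD.from_list([
    "INFO\tstartup\tok",
    "ERROR\tmysql\tconnection refused",
    "ERROR\tphp\tundefined index",
    "INFO\trequest\t200",
    "ERROR\tmysql\ttimeout",
])

# Build the pipeline; no line has been examined yet.
errors = logs.filter(lambda line: line.startswith("ERROR"))
mysql_messages = (
    errors.filter(lambda line: "mysql" in line)
          .map(lambda line: line.split("\t")[2])
)

# Actions force evaluation.
error_count = errors.count()
mysql_list = mysql_messages.collect()
```

Because each `ToyRDD` stores only the recipe for producing its data, any derived dataset can be rebuilt from the source by replaying `lineage`. A real system would additionally partition the data across machines and keep reused datasets in memory, neither of which this sketch attempts.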

Lecture slides

Readings

Paper review forms

Lecture videos


© 2023 Yue Cheng. Released under the CC BY-SA license