Background

PySpark is the Python interface for Apache Spark. It lets you write Spark applications using Python APIs, and it also provides the PySpark shell for interactively analyzing your data in a distributed environment [1].

Goals

To learn how PySpark is used to implement machine learning workflows.

Results

Visit here to view the notebooks.