This repository is a PySpark tutorial that shows how to read and manage big data.

Primary language: Jupyter Notebook

PySpark Tutorial

Overview

  • PySpark is an interface for Apache Spark in Python. It not only allows you to write Spark applications using Python APIs, but also provides the PySpark shell for interactively analyzing your data in a distributed environment.
  • This repository walks through PySpark step by step, including how to read data and handle missing values.

Installation

pip install pyspark

Credits