Install Spark https://spark.apache.org/downloads.html
Install Jupyter https://jupyter.org/install
PySpark and Jupyter together: https://www.sicara.ai/blog/2017-05-02-get-started-pyspark-jupyter-notebook-3-minutes
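To make the pyspark command open a Jupyter notebook, add these to ~/.bashrc (or your shell's equivalent):
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS='notebook'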
Alternatively, to load PySpark from a regular notebook or script, install findspark:
pip install findspark
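A minimal sketch of how findspark might be used once installed, assuming Spark is on the machine and SPARK_HOME points at it. The helper names spark_available and count_range and the app name are made up for illustration; only findspark.init() and the SparkSession builder calls come from the libraries themselves.

```python
import importlib.util


def spark_available() -> bool:
    """Return True if findspark can locate Spark and pyspark becomes importable."""
    if importlib.util.find_spec("findspark") is None:
        return False  # findspark itself is not installed
    import findspark

    try:
        # Looks up SPARK_HOME (or a few common install locations) and adds
        # Spark's Python libraries to sys.path.
        findspark.init()
    except Exception:
        return False  # no Spark installation could be located
    return importlib.util.find_spec("pyspark") is not None


def count_range(n: int = 5) -> int:
    """Start a local session, count the rows of a tiny DataFrame, then stop."""
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")           # run locally on all cores
        .appName("findspark-check")   # hypothetical app name
        .getOrCreate()
    )
    try:
        return spark.range(n).count()
    finally:
        spark.stop()
```

In a notebook cell, something like `if spark_available(): print(count_range())` should print 5 when the setup works. If spark_available() returns False, check that SPARK_HOME is set and that findspark is installed in the same Python environment as the notebook kernel.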
Parquet file format: https://www.youtube.com/watch?v=1j8SdS7s_NY