Step 1: Open your terminal and navigate to the project root directory (SCP_Project-x21171203). Change to this directory before starting PySpark.
cd ~/SCP_Project-x21171203
Step 2: Start PySpark by running the command below, then open Jupyter notebook in your browser using the 'URL:port' printed in the command's output.
pyspark
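Step 2 relies on the pyspark command starting Jupyter rather than the plain PySpark shell. PySpark decides this from the PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS environment variables. If they are not already set on your machine, the sketch below shows one way to set them for a single launch. The helper names jupyter_driver_env and start_pyspark are hypothetical, and the sketch assumes that both pyspark and jupyter are on your PATH.

```python
import os
import subprocess


def jupyter_driver_env(base_env=None):
    """Return a copy of the environment that makes pyspark launch Jupyter notebook."""
    env = dict(os.environ if base_env is None else base_env)
    # These two variables are read by the pyspark launcher script.
    env["PYSPARK_DRIVER_PYTHON"] = "jupyter"
    env["PYSPARK_DRIVER_PYTHON_OPTS"] = "notebook"
    return env


def start_pyspark():
    """Run pyspark with the Jupyter settings; blocks until the server is stopped."""
    # Assumption: the 'pyspark' executable is available on PATH.
    return subprocess.run(["pyspark"], env=jupyter_driver_env(), check=False).returncode
```

Calling start_pyspark() from the project root should behave like typing pyspark in Step 2, with the notebook URL printed in the terminal.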
Step 3: When Jupyter notebook starts, you will see the 5 .ipynb files listed below. Click any of them to open it in a new tab.
Question_2.ipynb Question_6.ipynb Question_12.ipynb Question_16.ipynb Question_17.ipynb
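If one of these files does not appear in the Jupyter file list, a quick check from the project root can confirm whether it is actually missing. This is a minimal sketch; the function name missing_notebooks is hypothetical, and the file names are copied from the list above.

```python
from pathlib import Path

# Notebook names as listed in Step 3.
EXPECTED = [
    "Question_2.ipynb",
    "Question_6.ipynb",
    "Question_12.ipynb",
    "Question_16.ipynb",
    "Question_17.ipynb",
]


def missing_notebooks(project_root="."):
    """Return the expected notebook names that are not present in project_root."""
    root = Path(project_root)
    return [name for name in EXPECTED if not (root / name).is_file()]
```

Running missing_notebooks() from inside SCP_Project-x21171203 returns an empty list when all five notebooks are in place.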
Step 4: Click 'Cell' in the menu bar and select 'Run All'. This runs every cell in that .ipynb file and produces the analysis results and graphs.
--- Do 'Step 3' and 'Step 4' for each file to run all the questions ---
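As an alternative to clicking 'Run All' in each tab, the five notebooks could also be executed from the command line with jupyter nbconvert. The sketch below is not part of the original workflow. It assumes nbconvert is installed and that each notebook creates its own Spark session, since variables that the PySpark shell predefines (such as spark or sc) may not exist when a notebook is run this way. The names build_commands and run_all are hypothetical.

```python
import subprocess

# Notebook names as listed in Step 3.
NOTEBOOKS = [
    "Question_2.ipynb",
    "Question_6.ipynb",
    "Question_12.ipynb",
    "Question_16.ipynb",
    "Question_17.ipynb",
]


def build_commands(notebooks):
    """Build one 'jupyter nbconvert' command per notebook."""
    return [
        # --execute runs every cell; --inplace writes the outputs back to the same file.
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace", nb]
        for nb in notebooks
    ]


def run_all(notebooks=NOTEBOOKS):
    """Execute the notebooks one after another, stopping at the first failure."""
    for cmd in build_commands(notebooks):
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)  # raises CalledProcessError on failure
```

Calling run_all() from the project root is roughly equivalent to repeating Step 3 and Step 4 for every file; the graphs are stored inside each notebook and can be viewed by opening it afterwards.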