/pySpark_tutorial

Spark code implemented in Jupyter notebooks using PySpark. Topics include RDDs and DataFrames, exploratory data analysis (EDA), handling multiple DataFrames, visualization, and machine learning.

Primary language: Jupyter Notebook

pySpark_tutorial

List of contents

  • RDDs and DataFrames
  • Exploratory data analysis
  • Handling multiple DataFrames
  • Visualization
  • Machine learning