
Practising PySpark through exercises such as email classification, data clustering, and PySpark equivalents of pandas operations.

Primary Language: Jupyter Notebook

pySpark-learn

Simple exercises to improve my knowledge of PySpark. Exercises include:

  1. Classifying emails as spam or ham
  2. Finding out how many clusters are in a dataset
  3. Writing PySpark equivalents of pandas operations