/spark-analytics

Spark analytics using PySpark, Spark DataFrames, and Spark SQL: parsing user logs and handling unstructured data.

Primary language: Jupyter Notebook. License: MIT.

  • To run the project, clone the repository.
  • Install the requirements from requirements.txt.
  • Make sure spark-3.0.1-bin-hadoop2.7 is installed on your system (see the Spark installation instructions).
  • Run the project using the Spark-exercises notebook.
  • Experiment with the project by writing your own queries.
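
As a starting point for writing your own queries, here is a minimal sketch of parsing raw log lines and querying them with Spark SQL. It is not taken from the notebook. The log layout (`<timestamp> <user_id> <action>`), the helper names `parse_line` and `actions_per_user`, the view name `user_logs`, and the sample lines are all assumptions made for illustration; adapt them to the actual log files in the repository.

```python
import re
from typing import Optional, Tuple

# Assumed (hypothetical) log layout: "<timestamp> <user_id> <action>".
# The repository's real logs may use a different format.
LOG_PATTERN = re.compile(r"^(\S+)\s+(\S+)\s+(\S+)$")


def parse_line(line: str) -> Optional[Tuple[str, ...]]:
    """Return (timestamp, user_id, action), or None if the line does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groups() if match else None


def actions_per_user(spark, lines):
    """Build a DataFrame from raw lines and count actions per user with Spark SQL."""
    # Drop lines that fail to parse instead of raising an error.
    rows = [parsed for parsed in map(parse_line, lines) if parsed is not None]
    df = spark.createDataFrame(rows, ["timestamp", "user_id", "action"])
    df.createOrReplaceTempView("user_logs")
    return spark.sql(
        "SELECT user_id, COUNT(*) AS n_actions "
        "FROM user_logs GROUP BY user_id ORDER BY user_id"
    )


def main():
    # Imported here so parse_line can be used without Spark installed.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder.master("local[*]")
        .appName("spark-analytics-sketch")
        .getOrCreate()
    )
    # Made-up sample lines, including one malformed line that is skipped.
    sample = [
        "2020-11-01T10:00:00 u1 login",
        "2020-11-01T10:05:00 u2 click",
        "malformed line without the expected three fields",
        "2020-11-01T10:07:00 u1 logout",
    ]
    actions_per_user(spark, sample).show()
    spark.stop()
```

Calling `main()` from a notebook cell starts a local Spark session and prints the per-user counts. Keeping the parsing in a plain Python function makes it easy to check the pattern against a few lines before running it through Spark.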