Spark

Assignment3

## Part 1

Import the CSV file in the Spark shell and, using RDD computations, answer the following questions:

  1. Which crime occurs most often in Sacramento?
  1. Give the 3 days with the highest crime count.
  1. Calculate the average number of occurrences of each crime per day.
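
The three questions above could be approached with RDDs roughly as follows. This is only a sketch meant for pasting into `spark-shell`: the column layout (`day,district,crime`) and the sample rows are assumptions made for illustration, not the real dataset, and "average per day" is interpreted here as a crime's total count divided by the number of distinct days in the data.

```scala
import org.apache.spark.sql.SparkSession

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder().appName("Assignment3").master("local[*]").getOrCreate()
val sc = spark.sparkContext

// Hypothetical sample rows in the assumed layout "day,district,crime".
// With the real file, use sc.textFile(<path>) and filter out the header line.
val lines = sc.parallelize(Seq(
  "1/1/06,3,BURGLARY",
  "1/1/06,5,BURGLARY",
  "1/1/06,3,VANDALISM",
  "1/2/06,3,BURGLARY",
  "1/2/06,2,PETTY THEFT",
  "1/3/06,1,BURGLARY"
))

// Keep only the (day, crime) pair of each row.
val dayAndCrime = lines.map(_.split(",")).map(f => (f(0), f(2)))

// 1. Most frequent crime: count per crime, then take the largest count.
val crimeCounts = dayAndCrime.map { case (_, crime) => (crime, 1) }.reduceByKey(_ + _)
val mostFrequentCrime = crimeCounts.sortBy(_._2, ascending = false).first()

// 2. The 3 days with the highest crime count.
val topDays = dayAndCrime.map { case (day, _) => (day, 1) }
  .reduceByKey(_ + _)
  .sortBy(_._2, ascending = false)
  .take(3)

// 3. Average occurrences of each crime per day
//    (total count of the crime divided by the number of distinct days).
val numDays = dayAndCrime.keys.distinct().count()
val avgPerDay = crimeCounts.mapValues(_.toDouble / numDays).collectAsMap()

println(mostFrequentCrime)
topDays.foreach(println)
avgPerDay.foreach(println)
```

A different reading of question 3 (for example, averaging only over the days on which a crime occurred at least once) would change the divisor, so the interpretation is worth stating in the answer.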

Reproduce the same computations using DataFrames.
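
A DataFrame version might look like the sketch below. As before, the column names (`day`, `district`, `crime`) and the in-memory sample rows are assumptions for illustration; with the real file the DataFrame would typically come from `spark.read.option("header", "true").csv(<path>)`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, desc, lit}

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder().appName("Assignment3").master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical sample data with assumed column names.
val crimes = Seq(
  ("1/1/06", "3", "BURGLARY"),
  ("1/1/06", "5", "BURGLARY"),
  ("1/1/06", "3", "VANDALISM"),
  ("1/2/06", "3", "BURGLARY"),
  ("1/2/06", "2", "PETTY THEFT"),
  ("1/3/06", "1", "BURGLARY")
).toDF("day", "district", "crime")

// 1. Most frequent crime.
val crimeCounts = crimes.groupBy("crime").count()
val mostFrequent = crimeCounts.orderBy(desc("count")).first()

// 2. The 3 days with the highest crime count.
val topDays = crimes.groupBy("day").count().orderBy(desc("count")).limit(3)

// 3. Average occurrences of each crime per day
//    (total count of the crime divided by the number of distinct days).
val numDays = crimes.select("day").distinct().count()
val avgPerDay = crimeCounts.withColumn("avg_per_day", col("count") / lit(numDays))

println(mostFrequent)
topDays.show()
avgPerDay.show()
```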

Build.sbt

Part1 Scala file

## Part 2

Using either DataFrames or RDDs, export a CSV file that contains the average number of crimes per day for each district.
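
One possible DataFrame approach is sketched below, again on hypothetical sample rows with assumed column names. It first counts crimes per (district, day) and then averages those daily counts per district, so days on which a district recorded no crime are not counted in its average. Note that Spark's CSV writer produces a directory of part files rather than a single file; `coalesce(1)` only ensures there is one part file inside it. The temporary output directory used here is an illustrative choice.

```scala
import java.nio.file.Files
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.avg

// In spark-shell a session already exists; getOrCreate() simply reuses it.
val spark = SparkSession.builder().appName("Assignment3").master("local[*]").getOrCreate()
import spark.implicits._

// Hypothetical sample data with assumed column names.
val crimes = Seq(
  ("1/1/06", "3", "BURGLARY"),
  ("1/1/06", "5", "BURGLARY"),
  ("1/1/06", "3", "VANDALISM"),
  ("1/2/06", "3", "BURGLARY"),
  ("1/2/06", "2", "PETTY THEFT"),
  ("1/3/06", "1", "BURGLARY")
).toDF("day", "district", "crime")

// Number of crimes for each (district, day) pair.
val perDistrictDay = crimes.groupBy("district", "day").count()

// Average of the daily counts for each district.
val avgPerDistrict = perDistrictDay
  .groupBy("district")
  .agg(avg("count").as("avg_crimes_per_day"))
  .orderBy("district")

// Write to a fresh temporary directory (illustrative location).
val outputDir = Files.createTempDirectory("assignment3").resolve("out").toString
avgPerDistrict.coalesce(1)
  .write
  .mode("overwrite")
  .option("header", "true")
  .csv(outputDir)

println(s"CSV written under $outputDir")
```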

Same Scala

out.csv