distributed-computing-spark
Slides and samples used in the Distributed Computing with Spark talk.
Sample1: Most retweeted
The first example is a simple snippet that finds the most retweeted tweet in a collection of tweets. It also explores some options for deploying an embedded Spark cluster, along with some basic Spark features.
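As a rough illustration of the idea, and not the code shipped in this repository, the following sketch starts an embedded Spark context in local mode and reduces a small RDD to the tweet with the highest retweet count. The Tweet case class, its fields, the sample data, and the application name are all hypothetical and chosen only for this example.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object MostRetweetedSketch {
  // Hypothetical record type; the real samples may model tweets differently.
  final case class Tweet(id: Long, text: String, retweets: Int)

  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark embedded in this JVM, using all available cores.
    val conf = new SparkConf()
      .setAppName("most-retweeted-sketch")
      .setMaster("local[*]")
    val sc = new SparkContext(conf)

    try {
      // Made-up sample data standing in for a real tweet collection.
      val tweets = sc.parallelize(Seq(
        Tweet(1L, "hello spark", 3),
        Tweet(2L, "distributed computing", 42),
        Tweet(3L, "sbt run", 7)
      ))

      // Keep whichever tweet of each pair has more retweets.
      val top = tweets.reduce((a, b) => if (a.retweets >= b.retweets) a else b)

      println(s"Most retweeted: ${top.text} (${top.retweets} retweets)")
    } finally {
      sc.stop()
    }
  }
}
```

Using reduce avoids sorting the whole dataset: each partition only has to keep its current best candidate, and the partial results are then combined.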
Sample2: Most retweeted (with SparkSQL)
The same example as before, but using SparkSQL syntax.
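A minimal sketch of what a SQL-based version could look like is shown here. It assumes Spark 2.x or later, where SparkSession is the entry point; the talk's own sample may target an older API. The Tweet case class, the "tweets" view name, and the sample data are hypothetical.

```scala
import org.apache.spark.sql.SparkSession

object MostRetweetedSqlSketch {
  // Hypothetical record type, defined at object level so Spark can derive an encoder.
  final case class Tweet(id: Long, text: String, retweets: Int)

  def main(args: Array[String]): Unit = {
    // Embedded local-mode session (assumes Spark 2.x or later).
    val spark = SparkSession.builder()
      .appName("most-retweeted-sql-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    try {
      // Made-up sample data standing in for a real tweet collection.
      val tweets = Seq(
        Tweet(1L, "hello spark", 3),
        Tweet(2L, "distributed computing", 42),
        Tweet(3L, "sbt run", 7)
      ).toDF()

      // Register the DataFrame under a name that SQL queries can refer to.
      tweets.createOrReplaceTempView("tweets")

      // Sort by retweet count and keep only the top row.
      spark.sql(
        "SELECT text, retweets FROM tweets ORDER BY retweets DESC LIMIT 1"
      ).show(truncate = false)
    } finally {
      spark.stop()
    }
  }
}
```

The query expresses the result declaratively and leaves the execution plan to Spark's optimizer.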
How to run
sbt run