These are the materials I used to lead the OCRUG book club session on modeling and pipelines in Spark R. The repository contains my teaching slides and exercises. The exercises cover supervised learning with a decision tree regressor (house_price dataset) and unsupervised learning with k-means clustering (iris dataset).
- Data: 'house_price.csv'
- Slides: 'chap_4-5_exercise_slides.Rmd' and 'chap_4-5_exercise_slides.html'
- Exercise: 'chap_4-5_exercise_template.Rmd'
- Exercise key: 'chap_4-5_exercise_answer.Rmd'
- Full code with output: 'Supervised_and_Unsupervised_Learning_Models_Demo.pdf'