
Data & Code for the Global Big Data Conference R Workshop


Conference speaker URL: http://globalbigdataconference.com/43/santa-clara/big-data-bootcamp/speaker-details/567/krishna-sankar.html

Slides on SlideShare: http://goo.gl/2rijLz

Abstract:

A hands-on workshop with R, data science, algorithms and interesting data. We might even try a couple of Kaggle competitions.

Outline:

  1. Intro, Goals, Logistics, Setup [10] [9:00-9:10)
  2. Introduction to R [30] [9:10-9:40)
  3. Who will win Super Bowl XLIX? The Art of Elo Ranking [30] [9:40-10:10)
  4. Anatomy of a Kaggle Competition [40] [10:10-10:50)
    • Competition Mechanics
    • Register, download data, create sub directories
    • Trial Run : Submit Titanic
  5. Break [20] [10:50-11:10)
  6. Algorithms for the Amateur Data Scientist [20] [11:10-11:30)
    • Algorithms, Tools & frameworks in perspective
    • “Folk Wisdom”
  7. Model Evaluation & Interpretation [30] [11:30-12:00)
    • Confusion Matrix, ROC Graph
  8. Questions/Discussions
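
As a taste of item 3, here is a minimal sketch of the standard Elo rating update in R. It is not the workshop's code: the function names `elo_expected` and `elo_update`, the K-factor of 20 and the starting rating of 1500 are all illustrative assumptions.

```r
# Expected score of team A against team B under the standard Elo logistic curve.
elo_expected <- function(rating_a, rating_b) {
  1 / (1 + 10 ^ ((rating_b - rating_a) / 400))
}

# Update both ratings after one game.
# score_a: 1 if A won, 0 if A lost, 0.5 for a tie.
# k: K-factor (step size); 20 is an assumed value, not one taken from the workshop.
elo_update <- function(rating_a, rating_b, score_a, k = 20) {
  expected_a <- elo_expected(rating_a, rating_b)
  delta <- k * (score_a - expected_a)
  c(a = rating_a + delta, b = rating_b - delta)
}

# Two hypothetical teams start level at 1500 and team A wins.
elo_update(1500, 1500, score_a = 1)
# a = 1510, b = 1490
```

Running this over a season of results, game by game, gives each team a rating whose difference feeds back into `elo_expected` as a win probability for the next matchup.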

Requirements for the attendees:

This will be a hands-on workshop, so it would be most beneficial if you come prepared.

  1. Install R

  2. Install RStudio

  3. Clone this GitHub repository (and update before going to bed on Saturday ;o) for the latest files)

  4. Download data from Kaggle. I cannot distribute Kaggle data.

  5. We will also be using the 2014 NFL boxscore data