/gads

materials for GA intro to data science class (2013)

=========== syllabus

INTRO

  • setup (installing python & scientific libraries)
  • intro to DS/ML/data exploration with UNIX
  • logistic regression
  • intro to ML theory (train/test/out-of-sample, cross-validation, overfitting, etc.)
  • regularization, bias/variance tradeoff, sample complexity, VC dimension

MODELS

  • kNN, greedy algos
  • probability, classical statistics, naive Bayes
  • decision trees & random forests
  • ensemble methods
  • clustering
  • SVMs

ETC

  • visualization w/ d3
  • intro to NoSQL
  • MapReduce
  • dimensionality reduction
  • recommender systems
  • site visit?

PROJECTS

  • working session
  • presentations I
  • presentations II
  • (open, see below)

===========

additional (1/2 lectures):

  • project progress check
  • industry stuff
  • 3x guest speakers

===========

possible weekend sessions:

  • pandas
  • open data (incl. APIs)