Class Materials for General Assembly's Data Science course held in San Francisco

GA_DS_SF_2014_01

IMPORTANT DATES

  • 1/27: First Project Proposals
  • 2/3: Formal Project Proposals (including data and methods chosen)
  • 2/17: NO CLASS
  • 2/19: Project live on GitHub
  • 3/3: Peer Feedback

SYLLABUS

A. OVERVIEW

  1. TOOLS OF THE TRADE
    • INTRODUCTION TO DATA SCIENCE; LAB ON PYTHON

B. DATA

  1. BIG DATA
    • MAP-REDUCE AND HADOOP
    • DISTRIBUTED COMPUTING AND IPYTHON.PARALLEL
  2. SEMI-STRUCTURED DATA: REST APIS, MONGO, ETC.
    • WORKING WITH MONGO, API REQUESTS, AND JSON
  3. STRUCTURED DATA: RELATIONAL DATABASES AND DATAFRAMES
    • RDBMS AND PANDAS

C. SCIENCE

  1. R
    • EXPLORATORY DATA ANALYSIS: R & GGPLOT
    • MACHINE LEARNING IN R
  2. INFORMATION RETRIEVAL (Guest Lecture)
  3. MACHINE LEARNING IN R
    • KNN CLASSIFICATION
    • NAIVE BAYES CLASSIFICATION
    • LOGISTIC REGRESSION
    • DIMENSIONALITY REDUCTION
    • ARTIFICIAL NEURAL NETWORKS
    • CLUSTERING AND K-MEANS
  4. MACHINE LEARNING IN PYTHON
    • NATURAL LANGUAGE PROCESSING
    • DECISION TREES AND RANDOM FORESTS
    • SUPPORT VECTOR MACHINES
    • ENSEMBLE METHODS
    • RECOMMENDATION SYSTEMS