Class Materials for General Assembly's Data Science course held in San Francisco
- 1/27: First Project Proposals
- 2/3: Formal Project Proposals (including data and methods chosen)
- 2/17: NO CLASS
- 2/19: Project live on GitHub
- 3/3: Peer Feedback
A. OVERVIEW
- TOOLS OF THE TRADE
- Introduction to Data Science; Lab on Python
B. DATA
- BIG DATA
- MAP-REDUCE AND HADOOP
- DISTRIBUTED COMPUTING AND IPYTHON.PARALLEL
- SEMI-STRUCTURED DATA: REST APIS, MONGO, ETC.
- WORKING WITH MONGO, API REQUESTS, AND JSON
- STRUCTURED DATA: RELATIONAL DATABASES AND DATAFRAMES
- RDBMS AND PANDAS
C. SCIENCE
- R
- EXPLORATORY DATA ANALYSIS: R & GGPLOT
- MACHINE LEARNING IN R
- INFORMATION RETRIEVAL (Guest Lecture)
- MACHINE LEARNING IN R
- KNN CLASSIFICATION
- NAIVE BAYES CLASSIFICATION
- LOGISTIC REGRESSION
- DIMENSIONALITY REDUCTION
- ARTIFICIAL NEURAL NETWORKS
- CLUSTERING AND K-MEANS
- MACHINE LEARNING IN PYTHON
- NATURAL LANGUAGE PROCESSING
- DECISION TREES AND RANDOM FORESTS
- SUPPORT VECTOR MACHINES
- ENSEMBLE METHODS
- RECOMMENDATION SYSTEMS