We meet on alternating Wednesdays from 3-5pm at D-Lab (Barrows 356). No prior machine learning experience is expected: each meeting covers one algorithm, with about 30 minutes each in R and Python. We also incorporate lightning talks and other guest presentations into our meetings.
Fall 2018 - unsupervised methods
- Sept. 5: Principal component analysis (PCA)
- Sept. 19: K-means clustering
- Oct. 3: Hierarchical clustering
- Oct. 17: Medoid partitioning
- Oct. 31: t-SNE
- Nov. 14: UMAP
- Nov. 28: Latent class analysis
- Dec. 12: Lightning talks
We are always looking for student/staff/faculty presenters. Please contact us if you are interested!
More information on the D-Lab MLWG website
- Spring 2018
  - k-nearest neighbors
  - decision tree
  - random forest
  - gradient boosting
  - elastic net
- Fall 2017
  - basics of neural networks for image processing
- Spring 2017
  - k-nearest neighbors
  - stepwise regression
  - linear and polynomial regression, smoothing splines
  - multivariate adaptive regression splines and generalized additive models
  - support vector machines
  - neural networks
- Fall 2016
  - decision trees, random forests, penalized regression, and boosting
Books:
- An Introduction to Statistical Learning by James et al. (free pdf) (Amazon)
- Applied Predictive Modeling by Max Kuhn and Kjell Johnson (Amazon)
- Python Data Science Handbook by Jake VanderPlas (online version)
- The Elements of Statistical Learning by Hastie et al. (free pdf) (Amazon)
- Modern Multivariate Statistical Techniques by Alan Izenman (Amazon)
- Differential Equations and Linear Algebra by Stephen Goode and Scott Annin (Amazon)
Other:
- R Markdown: The Definitive Guide
- Introduction to Probability and Statistics Using R, 3rd ed.
- R for Data Science
- The tidyverse style guide
- bookdown
- Quick Intro to Parallel Computing in R
Help:
Courses at Berkeley:
- Stat 154 - Statistical Learning
- CS 189 / CS 289A - Machine Learning
- COMPSCI x460 - Practical Machine Learning with R [UC Berkeley Extension]
- PH 252D - Causal Inference
- PH 295 - Big Data
- PH 295 - Targeted Learning for Biomedical Big Data
- Data 8, CS 61A, CS 61B, CS 61C, CS 70/Math 55, CS 188, CS 189, Math 53, Math 54, Math 110, Stat 28, Stat 20/21, Stat 133, Stat 134/140, Data 100
Online classes:
- D-Lab's Introduction to Data Science for Social Scientists
- Data 8X: Foundations of Data Science
- Tibshirani and Hastie's Statistical Learning Free Course
- Coursera Data Science Specialization
- edX - Principles of Machine Learning
- edX - Applied Machine Learning
- Coursera - Machine Learning
Other Campus Groups: