# Data-Science-Fall2023-JHU

Data Science Course

Data Science Fall 2023 at JHU AMS

### Section Week 1

01 - Intro

02 - Distributions and Python

03 - Samples

### Section Week 2 

04 - Optimization

05 - Least Squares

06 - PCA

### Section Week 3 

07 - PCA app

08 - PCA scree

09 - Bayes

10 - BayesCont

### Section Week 4

### Section Week 5

11 - Classification

### Practice Exam 1 

12 - NaiveBayes

### Section Week 6 

13 - Cross-Validation

### Section Week 7 

14 - DecisionTree

15 - Random Forest

16 - Logistic Regression and Support Vector Machines

### Section Week 8 

17 - Clustering

### Section Week 10 

18 - Gaussian Mixtures

This is a tutorial on clustering, a statistical learning technique for grouping a set of objects into clusters of similar items. The notebook focuses on two main clustering algorithms: k-means clustering and Gaussian mixture models. k-means is a simple (flat) algorithm that groups data points into clusters based on their similarity, while Gaussian mixture models cluster data points based on their probability density. The notebook also discusses the limitations of clustering algorithms and how to determine the number of clusters to use, and it provides examples of clustering applied to real-world problems, such as discovering different species of birds from their photographs, segmenting an image based on pixel colors, and grouping news articles that cover the same story.
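As a rough sketch of the two algorithms described above (not the notebook's own code), the snippet below assumes scikit-learn and synthetic blob data; the dataset, the cluster count of 3, and the random seeds are illustrative choices:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Synthetic 2-D data with three well-separated groups (illustrative only).
X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

# k-means: hard assignment of each point to its nearest centroid.
km_labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# Gaussian mixture: fit three Gaussian components by EM; each point gets a
# posterior probability of belonging to each component.
gmm = GaussianMixture(n_components=3, random_state=0).fit(X)
gmm_labels = gmm.predict(X)       # hard labels via the most probable component
gmm_probs = gmm.predict_proba(X)  # soft (probabilistic) memberships

print(km_labels[:10], gmm_labels[:10])
print(gmm_probs[0].round(3))      # membership probabilities for one point
```

The contrast worth noticing is that k-means returns only hard labels, while the Gaussian mixture also exposes soft memberships through `predict_proba`, which is what clustering by probability density provides.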

19 - Spectral

### Section Week 11

The notebook provides an introduction to spectral embedding and spectral clustering. It first reviews statistical learning and the four general categories of methods. Next, it explains graphs, similarity graphs, and adjacency matrices: it gives a simple implementation of vertices and edges, then shows how an adjacency matrix encodes whether two vertices are connected. The notebook then explains spectral clustering and introduces the graph Laplacian, which helps in cutting the graph into smaller pieces with minimal damage. Finally, it presents a useful property of the graph Laplacian: its quadratic form satisfies $x^\top L x = \frac{1}{2} \sum_{i,j} a_{ij} (x_i - x_j)^2$, so it measures how much $x$ varies across connected vertices.
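To make the Laplacian property concrete, here is a minimal NumPy sketch (an illustration on a made-up graph, not the notebook's implementation). It builds the adjacency matrix of two triangles joined by a single bridge edge, forms the Laplacian $L = D - A$, checks the quadratic-form identity numerically, and then cuts the graph using the eigenvector with the second-smallest eigenvalue (the Fiedler vector):

```python
import numpy as np

# Two triangles, vertices {0,1,2} and {3,4,5}, joined by one bridge edge (2,3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
n = 6
A = np.zeros((n, n))
for i, j in edges:
    A[i, j] = A[j, i] = 1  # symmetric adjacency matrix

D = np.diag(A.sum(axis=1))  # degree matrix
L = D - A                   # (unnormalized) graph Laplacian

# Check the quadratic-form identity numerically:
# x^T L x = 1/2 * sum_ij a_ij (x_i - x_j)^2
x = np.random.default_rng(0).normal(size=n)
lhs = x @ L @ x
rhs = 0.5 * np.sum(A * (x[:, None] - x[None, :]) ** 2)
assert np.isclose(lhs, rhs)

# Spectral bisection: the eigenvector of L with the second-smallest
# eigenvalue (the Fiedler vector) changes sign across the weak cut.
eigvals, eigvecs = np.linalg.eigh(L)   # eigenvalues in ascending order
fiedler = eigvecs[:, 1]
labels = (fiedler > 0).astype(int)
print(labels)  # vertices {0,1,2} and {3,4,5} land in different groups
```

Because $x^\top L x$ penalizes differences across edges, the first nontrivial eigenvector varies as little as possible between well-connected vertices, so thresholding it at zero separates the two triangles along the weak bridge, which is the "minimal damage" cut mentioned above.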

20 - Embedding

21 - IsomapLLE

### Practice Exam 2