This repository contains code to train and deploy PCA and k-means models with Amazon SageMaker to segment the US population based on US census data. The workflow covers the following tasks:
- Data loading and exploration
- Data cleaning and pre-processing
- Dimensionality reduction with PCA
- Feature engineering and data transformation
- Clustering transformed data with k-means
- Extracting trained model attributes and visualizing the k clusters
These tasks make up a complete machine learning workflow from data loading and cleaning to model deployment.
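
To make the two modeling steps concrete, here is a minimal local sketch of PCA followed by k-means, written with plain NumPy. It is not the repository's SageMaker code and does not call the SageMaker built-in algorithms. The data is synthetic, the min-max scaling step is an assumption, and the helper names `pca_reduce` and `kmeans` are hypothetical, chosen only for illustration.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return X_centered @ Vt[:n_components].T, explained[:n_components]


def assign_labels(X, centroids):
    """Return the index of the nearest centroid for each row of X."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    rng = np.random.default_rng(seed)
    # Initialize centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign_labels(X, centroids)
        # Keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return assign_labels(X, centroids), centroids


# Synthetic stand-in for the cleaned census features (assumption: 6 numeric columns).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(loc=c, scale=0.5, size=(100, 6)) for c in (0.0, 3.0, 6.0)])

# Assumed pre-processing step: scale every feature to the [0, 1] range.
X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))

X_reduced, explained = pca_reduce(X_scaled, n_components=2)
labels, centroids = kmeans(X_reduced, k=3)

print("explained variance ratio:", explained)
print("cluster sizes:", np.bincount(labels, minlength=3))
```

In the actual workflow the same sequence (scale, reduce, cluster, inspect centroids) would run against the census data with models trained and deployed on SageMaker; the sketch only shows how the steps fit together.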