Population Segmentation using PCA and K-Means with Amazon SageMaker

This repository contains code to train and deploy PCA and k-means models with Amazon SageMaker in order to segment the US population based on US census data.

General Project Outline

  1. Data loading and exploration
  2. Data cleaning and pre-processing
  3. Dimensionality reduction with PCA
  4. Feature engineering and data transformation
  5. Clustering transformed data with k-means
  6. Extracting trained model attributes and visualizing k clusters

Together, these tasks make up a complete machine learning workflow, from data loading and cleaning to model deployment.
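To make the outline concrete, here is a minimal local sketch of steps 2 through 6 in plain NumPy. It is not the repository's SageMaker code: the helper names (`min_max_scale`, `pca_fit`, `kmeans`) are hypothetical, the input is synthetic data standing in for the census features, and the hand-written PCA and k-means routines are simple stand-ins for SageMaker's built-in algorithms.

```python
import numpy as np


def min_max_scale(X):
    """Scale every column to the [0, 1] range (pre-processing, step 2)."""
    X = np.asarray(X, dtype=float)
    col_min = X.min(axis=0)
    col_range = X.max(axis=0) - col_min
    col_range[col_range == 0] = 1.0  # avoid dividing by zero on constant columns
    return (X - col_min) / col_range


def pca_fit(X, n_components):
    """Return column means, top principal axes, and explained-variance ratios (step 3)."""
    mean = X.mean(axis=0)
    # Rows of vt are the principal axes, ordered by decreasing singular value.
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = (s ** 2) / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]


def assign_labels(X, centroids):
    """Assign each row of X to its nearest centroid (Euclidean distance)."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm (step 5); returns centroids and cluster labels."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        labels = assign_labels(X, centroids)
        # Keep the old centroid if a cluster ends up empty.
        new_centroids = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return centroids, assign_labels(X, centroids)


# Synthetic stand-in for the census features: two well-separated groups
# of 50 rows with 5 numeric columns each (an assumption for illustration).
rng = np.random.default_rng(42)
X = np.vstack([
    rng.normal(0.0, 0.1, size=(50, 5)),
    rng.normal(5.0, 0.1, size=(50, 5)),
])

scaled = min_max_scale(X)                       # step 2
mean, axes, explained = pca_fit(scaled, 2)      # step 3
reduced = (scaled - mean) @ axes.T              # step 4: project onto top components
centroids, labels = kmeans(reduced, k=2)        # step 5
cluster_sizes = np.bincount(labels, minlength=2)  # step 6: inspect model attributes
```

In this sketch, `explained` plays the role of the explained-variance information used to choose how many components to keep, and `centroids` corresponds to the cluster centers one would extract from a trained k-means model for visualization.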