
An application of the KMeans algorithm to identify the countries most in need by clustering them on health and socio-economic factors


Clustering Countries


Course : COCSC16 | Data Mining | Fall 2021

About

  • This project applies unsupervised learning to group similar countries using the K-Means++ clustering algorithm. Countries are clustered on health-based and socio-economic factors, and the subset of most deprived countries is then extracted.
  • The countries most in need were grouped together using factors such as income per capita, government health expenditure per capita, child mortality rate, and other indicators of a country's overall development.
  • All the relevant score plots, histograms and EDA plots are included in the repository.
  • The final notebook is Clustering Countries.ipynb. Click here if the notebook is not rendered properly.
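
The approach described above can be sketched roughly as follows. This is a minimal illustration using scikit-learn, not the notebook's actual code: the toy feature values, the helper name cluster_countries, the choice of three clusters, the standardization step, and the rule of picking the cluster with the highest child-mortality centroid are all assumptions made for the example.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def cluster_countries(features, n_clusters=3, seed=0):
    """Standardize the features, then cluster rows with k-means++ initialization.

    Hypothetical helper for illustration; the notebook may preprocess differently.
    """
    # Put all indicators on a comparable scale so income does not dominate.
    scaled = StandardScaler().fit_transform(features)
    model = KMeans(
        n_clusters=n_clusters,
        init="k-means++",  # k-means++ seeding of the initial centroids
        n_init=10,
        random_state=seed,
    )
    labels = model.fit_predict(scaled)
    return labels, model


# Made-up toy data, one row per country.
# Columns: income per capita, health spending per capita, child mortality rate.
toy = np.array([
    [500.0, 20.0, 110.0],
    [700.0, 30.0, 95.0],
    [600.0, 25.0, 100.0],
    [12000.0, 600.0, 20.0],
    [15000.0, 700.0, 15.0],
    [13000.0, 650.0, 18.0],
    [45000.0, 4000.0, 4.0],
    [50000.0, 4500.0, 3.0],
    [48000.0, 4200.0, 5.0],
])

labels, model = cluster_countries(toy, n_clusters=3)

# Assumed selection rule: the "most in need" cluster is the one whose
# centroid has the highest (standardized) child mortality.
neediest_cluster = int(np.argmax(model.cluster_centers_[:, 2]))
most_in_need = np.where(labels == neediest_cluster)[0]
print("Row indices of the most deprived countries:", most_in_need)
```

In practice the number of clusters would be chosen from score plots such as the ones included in the repository, rather than fixed in advance as it is here.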