
This GitHub repository contains a project on K-means clustering. It explores insights and inferences by performing exploratory data analysis (EDA) on the project data (airlines data).

Primary Language: Jupyter Notebook

K-means Clustering Project


Project: Identify clusters of passengers with similar characteristics so that different segments can be targeted with different types of mileage offers.


Insights and inferences were drawn by performing EDA on the given data. Relevant graphs were plotted with the seaborn package, and the model was fitted with K-means clustering from the scikit-learn (sklearn) package.

Python Libraries Used:

  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • scikit-learn
  • Joblib


  1. Data copying and cleaning:

    • read the CSV file
    • copy the data
    • check for null values and other information (data types, shape)
    • drop the duplicate rows
  2. Exploratory Data Analysis:

    • conduct the necessary EDA on the dataset using various graphs
    • interpret the graphs
    • check for outliers and correlation among the columns
    • perform one-hot encoding on any categorical columns
  3. Scaling of data:

    • scale the data using StandardScaler and Normalizer
  4. Modelling of data:

    • import K-means clustering from scikit-learn and initialize it
    • determine the value of k using the elbow method
    • fit the model
    • predict the cluster label for each passenger
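
The four steps above can be sketched end to end as follows. This is only a minimal illustration, not the notebook's actual code: the DataFrame is synthetic, the column names (balance, bonus_miles, card_type) are hypothetical stand-ins for the airlines data, only StandardScaler is shown, and the elbow values are printed rather than plotted.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical stand-in for the airlines CSV. With the real file this
# would be a pd.read_csv(...) call and the real column names.
rng = np.random.default_rng(0)
raw = pd.DataFrame({
    "balance": np.concatenate([rng.normal(20_000, 3_000, 100),
                               rng.normal(90_000, 8_000, 100)]),
    "bonus_miles": np.concatenate([rng.normal(5_000, 1_000, 100),
                                   rng.normal(40_000, 5_000, 100)]),
    "card_type": rng.choice(["basic", "gold"], size=200),
})
# Append five duplicate rows so the cleaning step has something to drop.
raw = pd.concat([raw, raw.iloc[:5]], ignore_index=True)

# 1. Data copying and cleaning
df = raw.copy()
null_counts = df.isnull().sum()                   # nulls per column
df = df.drop_duplicates().reset_index(drop=True)  # remove duplicate rows

# 2. EDA-style checks: correlation and one-hot encoding
corr = df[["balance", "bonus_miles"]].corr()
encoded = pd.get_dummies(df, columns=["card_type"],
                         drop_first=True, dtype=float)

# 3. Scaling: zero mean and unit variance per column
X = StandardScaler().fit_transform(encoded)

# 4. Modelling: elbow method, i.e. inertia (within-cluster sum of
#    squares) for a range of k values
inertias = {}
for k in range(1, 7):
    km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)
    inertias[k] = km.inertia_
    print(k, round(km.inertia_, 1))

# Fit with the chosen k (2 is assumed here for the synthetic data)
# and assign a cluster label to every row.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
df["cluster"] = model.fit_predict(X)
print(df["cluster"].value_counts())
```

With real data, the inertia values would normally be plotted against k, and the k at which the curve bends would be passed to the final KMeans call.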


Still learning, so feel free to contribute anything you like.