Feature-Selection-and-Label-Encoding

Data Visualisation, Categorical, Nominal, Ordinal, Numerical features, Data Imbalance, Missing data handling, Feature Selection, Binary Classification, Evaluation

Primary language: Jupyter Notebook

EDA, Label Encoding, Feature Selection and Binary Classification

Dataset: https://www.kaggle.com/c/cat-in-the-dat-ii/data

Standard approach to solving a classification problem in machine learning

- Data visualisation
- Handling numerical, ordinal, categorical and nominal features
- Handling class imbalance (if any)
- Encoding features so they can be used to train the model
- Feature selection
- Classification
- Model evaluation
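The steps above can be sketched end to end with scikit-learn. This is a minimal illustration, not the notebook's actual code: the data is synthetic, the column names only mimic the Kaggle naming, and the specific choices (most-frequent imputation, one-hot and ordinal encoding, chi-squared selection, class-weighted logistic regression, ROC AUC) are assumptions made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, OrdinalEncoder

# Synthetic stand-in for the Kaggle data (all values are made up).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "bin_0": rng.integers(0, 2, n).astype(float),
    "nom_0": rng.choice(["Red", "Green", "Blue"], n),
    "ord_1": rng.choice(["Novice", "Expert", "Master"], n),
})
df["target"] = ((df["ord_1"] == "Master") | (rng.random(n) < 0.1)).astype(int)
# Knock out a few values to mimic missing data.
df.loc[rng.random(n) < 0.05, "nom_0"] = np.nan
df.loc[rng.random(n) < 0.05, "bin_0"] = np.nan

# Impute and encode each feature type separately.
preprocess = ColumnTransformer([
    ("binary", SimpleImputer(strategy="most_frequent"), ["bin_0"]),
    ("nominal", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        ("onehot", OneHotEncoder(handle_unknown="ignore")),
    ]), ["nom_0"]),
    ("ordinal", Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # The category order is given explicitly so the codes respect it.
        ("encode", OrdinalEncoder(categories=[["Novice", "Expert", "Master"]])),
    ]), ["ord_1"]),
])

model = Pipeline([
    ("preprocess", preprocess),
    # chi2 needs non-negative inputs, which the encodings above produce.
    ("select", SelectKBest(chi2, k=3)),
    # class_weight="balanced" is one simple way to handle imbalance.
    ("clf", LogisticRegression(class_weight="balanced", max_iter=1000)),
])

X = df.drop(columns="target")
y = df["target"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
model.fit(X_train, y_train)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(f"ROC AUC: {auc:.3f}")
```

Keeping imputation, encoding and selection inside one `Pipeline` means they are fitted on the training split only, which avoids leaking information from the held-out rows.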

Dataset Overview

[Figure: Data]

  • The columns from bin_0 to month are the independent variables; the target column is the dependent variable (binary)
  • Refer to the EDA notebook for more details about each feature
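As a rough sketch of how the features and target described above might be separated, the helpers below split a DataFrame and group its columns by name prefix. The `id` column and the `bin_`/`nom_`/`ord_` prefixes are assumptions based on the Kaggle column naming, and both function names are hypothetical; the tiny frame is made up for illustration.

```python
import pandas as pd


def split_features_target(df, target="target", id_col="id"):
    """Return (X, y), dropping the target and the id column if present."""
    to_drop = [c for c in (target, id_col) if c in df.columns]
    return df.drop(columns=to_drop), df[target]


def group_by_prefix(columns):
    """Group column names by their prefix (bin, nom, ord, or other)."""
    groups = {"bin": [], "nom": [], "ord": [], "other": []}
    for col in columns:
        prefix = col.split("_")[0]
        groups[prefix if prefix in groups else "other"].append(col)
    return groups


# Made-up rows that only mimic the shape of the real data.
demo = pd.DataFrame({
    "id": [0, 1],
    "bin_0": [0, 1],
    "nom_0": ["Red", "Blue"],
    "ord_0": [1, 3],
    "day": [6, 7],
    "month": [3, 12],
    "target": [0, 1],
})
X, y = split_features_target(demo)
groups = group_by_prefix(X.columns)
print(groups)
```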

Missing values visualisation in the dataset

[Figure: Missing]
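The numbers behind a missing-values plot can be computed with pandas alone. The following is a minimal sketch on a made-up frame; `missing_summary` is a hypothetical helper, not a function from the notebook.

```python
import numpy as np
import pandas as pd


def missing_summary(df):
    """Count and percentage of missing values per column, worst first."""
    summary = pd.DataFrame({
        "n_missing": df.isna().sum(),
        "pct_missing": df.isna().mean() * 100,
    })
    return summary.sort_values("pct_missing", ascending=False)


# Made-up rows for illustration only.
demo = pd.DataFrame({
    "bin_0": [0.0, np.nan, 1.0, 1.0],
    "nom_0": ["Red", None, None, "Blue"],
    "month": [1.0, 2.0, 3.0, 4.0],
})
summary = missing_summary(demo)
print(summary)
```

If matplotlib is installed, `summary["pct_missing"].plot.bar()` would turn the same table into a bar chart.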

Data Distribution

[Figure: Distribution]
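One distribution worth checking early, given the imbalance step in the approach, is that of the target itself. The sketch below assumes a binary `target` column and uses made-up labels; `class_balance` is a hypothetical helper name.

```python
import pandas as pd


def class_balance(y):
    """Share of each class in the target, indexed by class label."""
    return y.value_counts(normalize=True).sort_index()


# Made-up labels: four negatives and one positive.
y = pd.Series([0, 0, 0, 0, 1], name="target")
balance = class_balance(y)
print(balance)
```

The same `value_counts(normalize=True)` call can be applied to any categorical feature column to inspect its distribution.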

Correlation among numerical features present in the dataset

[Figure: Corr]
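A correlation matrix over the numerical features can be obtained as sketched below. This assumes Pearson correlation (the pandas default) and a made-up frame; whether the notebook uses the same method is not stated, and `numeric_correlation` is a hypothetical helper.

```python
import pandas as pd


def numeric_correlation(df):
    """Pearson correlation matrix over the numeric columns only."""
    return df.select_dtypes(include="number").corr()


# Made-up rows; nom_0 is non-numeric and is therefore excluded.
demo = pd.DataFrame({
    "bin_0": [0, 1, 0, 1],
    "bin_1": [0, 1, 0, 1],
    "day": [1, 2, 3, 4],
    "nom_0": ["Red", "Blue", "Red", "Green"],
})
corr = numeric_correlation(demo)
print(corr)
```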