
Gender classification on categorical survey data using one-hot encoding. Two models are trained: Linear Regression and a Decision Tree Classifier.

Primary Language: Jupyter Notebook



Gender is predicted from survey answers about favorite color, music, beverage, and soft drink.

The dataset is openly available on Kaggle.

This code shows how one-hot encoding can be used in pandas to handle categorical data.
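As a minimal sketch of one-hot encoding in pandas, the snippet below uses `pd.get_dummies` on a tiny hand-made table. The rows, the column names, and the category values are illustrative assumptions, not taken from the Kaggle dataset or from the notebook.

```python
import pandas as pd

# Hypothetical survey rows; column names and values are illustrative only.
survey = pd.DataFrame({
    "Favorite Color": ["Cool", "Warm", "Neutral", "Cool"],
    "Favorite Beverage": ["Beer", "Wine", "Vodka", "Beer"],
    "Gender": ["F", "M", "M", "F"],
})

# Separate the categorical features from the target column.
features = survey.drop(columns=["Gender"])

# One-hot encode: each category becomes its own 0/1 indicator column,
# named "<original column>_<category>".
encoded = pd.get_dummies(features, dtype=int)

print(encoded.columns.tolist())
print(encoded)
```

Each row ends up with exactly one `1` per original column, so no artificial ordering is imposed on the categories.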

Two machine learning algorithms are used: Linear Regression and a Decision Tree Classifier.
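A rough sketch of fitting both models with scikit-learn might look like the following. The toy data, the 0/1 target encoding, and the 0.5 threshold are assumptions made for illustration; the notebook's actual preprocessing, train/test split, and scoring may differ.

```python
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier

# Hypothetical, perfectly separable toy data (not the Kaggle dataset).
survey = pd.DataFrame({
    "Favorite Color": ["Cool", "Warm", "Cool", "Warm", "Cool", "Warm"],
    "Favorite Soft Drink": ["Cola", "Fanta", "Cola", "Fanta", "Cola", "Fanta"],
    "Gender": ["F", "M", "F", "M", "F", "M"],
})

# One-hot encode the features and map the target to 0/1 (assumed encoding).
X = pd.get_dummies(survey.drop(columns=["Gender"]), dtype=int)
y = (survey["Gender"] == "M").astype(int)

# Fit both models on the same encoded features.
tree = DecisionTreeClassifier(random_state=0).fit(X, y)
linear = LinearRegression().fit(X, y)

# Scored on the training rows only, to keep the sketch short.
tree_accuracy = tree.score(X, y)   # classifier score: mean accuracy
linear_r2 = linear.score(X, y)     # regressor score: R^2, not accuracy

# To compare like with like, threshold the regression output at 0.5.
linear_accuracy = ((linear.predict(X) >= 0.5).astype(int) == y).mean()

print(tree_accuracy, linear_r2, linear_accuracy)
```

Note that `LinearRegression.score` returns the coefficient of determination (R^2) rather than classification accuracy, so a regression score and a classifier score are not directly comparable unless the regression output is thresholded as above.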

Model Accuracies:

Linear Regression: 26.80% accurate

Decision Tree: 95.45% accurate
