/hiring-prediction

binary classification on unbalanced data


Hiring prediction on unbalanced data

The topic is hiring prediction from data representing job applications. Each row of the dataset is one application, and the features are:

  • date: date of application,
  • hair: color of hair,
  • age,
  • experience: number of years of experience,
  • salary: salary expectation,
  • gender,
  • diploma,
  • speciality,
  • note: score on the technical test,
  • availability,
  • hiring: target variable.
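
As a minimal sketch of how the columns above split into features and target, the snippet below uses only the column names listed in this README. The helper `split_features_target` and the placeholder row values are hypothetical and are not taken from the notebook or the dataset.

```python
# Column names as listed above; the column order is an assumption.
COLUMNS = (
    "date", "hair", "age", "experience", "salary", "gender",
    "diploma", "speciality", "note", "availability", "hiring",
)
TARGET = "hiring"
FEATURES = tuple(name for name in COLUMNS if name != TARGET)


def split_features_target(row):
    """Split one application (a dict keyed by column name) into (features, target).

    Hypothetical helper for illustration only.
    """
    missing = set(COLUMNS) - set(row)
    if missing:
        raise KeyError(f"missing columns: {sorted(missing)}")
    features = {name: row[name] for name in FEATURES}
    return features, row[TARGET]


# Placeholder row: every value is a dummy, not real data.
example_row = {name: "<value>" for name in FEATURES}
example_row[TARGET] = "no"

features, target = split_features_target(example_row)
```

Keeping the target name in a single constant makes it harder to leak `hiring` into the feature set by accident.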

The goal is to predict the hiring variable, which is either 'yes' or 'no', so the problem is a binary classification task. Moreover, the features are uncorrelated and the classes are unbalanced.
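
To illustrate why class imbalance matters here, the following sketch encodes the 'yes'/'no' target as 1/0, computes "balanced" class weights (n_samples / (n_classes * class_count), the same heuristic scikit-learn uses for `class_weight="balanced"`), and compares accuracy with precision, recall and F1 for a trivial baseline. It uses only the standard library. The function names and the 9-to-1 label ratio are assumptions made for illustration and do not reflect the actual dataset or the notebook's code.

```python
from collections import Counter


def encode_target(labels):
    """Map 'yes'/'no' hiring labels to 1/0."""
    mapping = {"yes": 1, "no": 0}
    return [mapping[label] for label in labels]


def balanced_class_weights(y):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for the positive class (0.0 when undefined)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Illustrative labels only: nine 'no' for one 'yes'.
labels = ["no"] * 9 + ["yes"]
y = encode_target(labels)

# The minority class receives the larger weight.
weights = balanced_class_weights(y)

# A baseline that always predicts 'no' looks accurate but never finds a hire.
always_no = [0] * len(y)
accuracy = sum(t == p for t, p in zip(y, always_no)) / len(y)
precision, recall, f1 = precision_recall_f1(y, always_no)
```

With these illustrative labels the always-'no' baseline reaches 90% accuracy while its recall on the 'yes' class is zero, which is why metrics such as recall or F1 are usually more informative than accuracy on unbalanced data.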

Outline:

    1. Exploratory Data Analysis
    2. Statistical analysis
    3. Model selection
    4. Conclusion

Python modules:

  • mydata_stats.py
  • mydata_processing.py
  • mymodeling.py