Hiring prediction on unbalanced data
The task is hiring prediction from job application data. Each record in the dataset represents one job application, and the features are:
- date: date of application,
- hair: color of hair,
- age,
- experience: number of years of experience,
- salary: salary expectation,
- gender,
- diploma,
- speciality,
- note: score on the technical test,
- availability,
- hiring: target variable.
The goal is to predict the hiring variable, which is either 'yes' or 'no', so the problem is a binary classification task. Moreover, the data are uncorrelated, and the classes are unbalanced.
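Because the classes are unbalanced, it helps to quantify the imbalance before modeling. The sketch below is only an illustration, not the project's actual code: the `hiring` list is a hypothetical sample (the real values come from the dataset), and the helper names `class_proportions` and `balanced_class_weights` are made up for this example. The weight formula, `n_samples / (n_classes * class_count)`, is the common "balanced" heuristic for class weighting.

```python
from collections import Counter

# Hypothetical sample of the 'hiring' target (assumption, not real data).
hiring = ["no", "no", "no", "no", "no", "no", "no", "no", "yes", "yes"]

# Encode the target as 0/1 for binary classification.
y = [1 if label == "yes" else 0 for label in hiring]


def class_proportions(labels):
    """Return the share of each class in the label list."""
    counts = Counter(labels)
    total = len(labels)
    return {cls: n / total for cls, n in counts.items()}


def balanced_class_weights(labels):
    """Weight each class inversely to its frequency:
    n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * n) for cls, n in counts.items()}


proportions = class_proportions(y)
weights = balanced_class_weights(y)

# Accuracy obtained by always predicting the majority class.
majority_baseline_accuracy = max(proportions.values())

print("class proportions:", proportions)
print("balanced class weights:", weights)
print("majority-class baseline accuracy:", majority_baseline_accuracy)
```

On this toy sample, always predicting 'no' already reaches 80% accuracy, which is why accuracy alone can be misleading on unbalanced data and why the minority class receives the larger weight.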
Outline:
- Exploratory Data Analysis
- Statistical analysis
- Model selection
- Conclusion
Python modules:
- mydata_stats.py
- mydata_processing.py
- mymodeling.py