This repository contains an exploratory data analysis of the Human Resources Analytics dataset from Kaggle.
To learn more about the dataset, follow this link: https://www.kaggle.com/ludobenistant/hr-analytics
I am using Python 3.6.1 for the project. You need to install the following Python libraries:
- NumPy (for documentation: http://www.numpy.org/)
- Pandas (for documentation: http://pandas.pydata.org/)
- Matplotlib (for documentation: https://matplotlib.org/)
- Seaborn (for documentation: https://seaborn.pydata.org/)
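As a quick check before opening the notebook, the sketch below reports which of the libraries listed above cannot be found in the current environment. It is not part of the project; the helper name `missing_packages` is hypothetical, and the import names (`numpy`, `pandas`, `matplotlib`, `seaborn`) are the usual ones for these libraries.

```python
import importlib.util

# Import names of the libraries listed above.
REQUIRED = ["numpy", "pandas", "matplotlib", "seaborn"]


def missing_packages(names=REQUIRED):
    """Return the names that cannot be found in the current environment."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are installed.")
```

Any package reported as missing can be installed with your usual package manager (for example pip or conda).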
I used Jupyter Notebook for the data exploration.
The complete code is in the 'HR_notebook.ipynb' file.
The data is in the 'HR_comma_sep.csv' file, which you can download from:
https://www.kaggle.com/ludobenistant/hr-analytics/downloads/human-resources-analytics.zip
The download is in .zip format. Please extract the archive to get the .csv file out of it.
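If you prefer to extract the archive from Python rather than by hand, a minimal sketch using only the standard library could look like this. The function name `extract_csv` is hypothetical, and the archive name in the usage comment is taken from the download URL above; adjust it if your file is named differently.

```python
import zipfile
from pathlib import Path


def extract_csv(zip_path, dest_dir="."):
    """Extract every .csv member of the archive into dest_dir and return their paths."""
    dest = Path(dest_dir)
    extracted = []
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Skip anything in the archive that is not a CSV file.
            if member.lower().endswith(".csv"):
                archive.extract(member, dest)
                extracted.append(dest / member)
    return extracted


# Example usage (assumes the archive sits in the current directory):
# extract_csv("human-resources-analytics.zip")
```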
Note: This dataset is simulated.
Why are our best and most experienced employees leaving prematurely? Have fun with this database and try to predict which valuable employees will leave next. Fields in the dataset include:
- Satisfaction Level
- Last evaluation
- Number of projects
- Average monthly hours
- Time spent at the company
- Whether they have had a work accident
- Whether they have had a promotion in the last 5 years
- Departments
- Salary
- Whether the employee has left
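To get a first look at these fields, the following sketch loads the CSV with Pandas and collects a few basic facts about it. It is an illustration rather than the code from the notebook; the helper names `load_hr_data` and `overview` are hypothetical, and the sketch deliberately avoids assuming specific column names, since the headers in the file may not match the field descriptions above word for word.

```python
import pandas as pd


def load_hr_data(csv_path="HR_comma_sep.csv"):
    """Read the extracted CSV file into a DataFrame."""
    return pd.read_csv(csv_path)


def overview(df):
    """Collect a few basic facts: shape, column dtypes, and missing-value counts."""
    return {
        "shape": df.shape,
        "dtypes": df.dtypes.astype(str).to_dict(),
        "missing": df.isnull().sum().to_dict(),
    }


# Example usage (assumes HR_comma_sep.csv is in the current directory):
# df = load_hr_data()
# print(overview(df))
# print(df.head())
```

From here, `df.describe()` gives summary statistics for the numeric columns, which is a common starting point for the kind of exploration done in the notebook.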
Hope it helps, Regards - Nilay.