/IA-Data-Analysis-Project

Employee Attrition Prediction AI Project - CESI EXIA 4th Year

This repository contains the AI project I completed during my fourth year of Computer Science studies at CESI EXIA. The project predicts employee attrition using data wrangling and machine learning techniques, and the code is implemented in Jupyter Notebook.

Project Background

During my fourth year at CESI EXIA, I undertook a data-driven AI project aimed at addressing employee attrition within an organization. Employee attrition, or the rate at which employees leave a company, can significantly impact productivity and organizational success. My goal was to build a predictive model that could identify potential attrition risks and enable proactive measures to retain valuable employees.

Data Wrangling and Preparation

The project started with data wrangling and preparation, where I explored, structured, and cleaned the dataset. I imported the data from CSV files, merging relevant information to create a comprehensive dataset for analysis. Dealing with missing values, converting date columns to usable datetime objects, and calculating the average working time per employee were some of the essential steps in this phase.
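The steps above can be sketched with pandas as follows. This is a minimal illustration, not the notebook's actual code: the column names (`EmployeeID`, `clock_in`, `clock_out`, `Age`), the median fill strategy, and the in-memory frames standing in for the CSV files are all assumptions.

```python
import pandas as pd

# Hypothetical stand-ins for the CSV files (the project would load them with pd.read_csv).
general = pd.DataFrame({
    "EmployeeID": [1, 2, 3],
    "Age": [34, None, 45],            # one missing value to handle
    "Attrition": ["No", "Yes", "No"],
})
badges = pd.DataFrame({
    "EmployeeID": [1, 1, 2, 2, 3, 3],
    "clock_in": ["2015-01-02 09:00:00", "2015-01-05 10:00:00",
                 "2015-01-02 09:30:00", None,
                 "2015-01-02 08:00:00", "2015-01-05 08:00:00"],
    "clock_out": ["2015-01-02 17:00:00", "2015-01-05 17:00:00",
                  "2015-01-02 18:30:00", None,
                  "2015-01-02 14:00:00", "2015-01-05 16:00:00"],
})

# Convert the date columns to datetime objects; missing entries become NaT.
badges["clock_in"] = pd.to_datetime(badges["clock_in"])
badges["clock_out"] = pd.to_datetime(badges["clock_out"])

# Hours worked per day, then the average per employee (days without data are skipped).
badges["hours"] = (badges["clock_out"] - badges["clock_in"]).dt.total_seconds() / 3600
avg_hours = (
    badges.groupby("EmployeeID")["hours"].mean().rename("AvgWorkingHours").reset_index()
)

# Merge into one dataset and fill the missing numeric value with the column median.
dataset = general.merge(avg_hours, on="EmployeeID", how="left")
dataset["Age"] = dataset["Age"].fillna(dataset["Age"].median())

print(dataset)
```

Filling with the median is only one reasonable choice here; the right strategy depends on how the values are distributed and why they are missing.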

Data Analysis and Transformation

To make the data suitable for machine learning algorithms, I created pipelines for handling missing values and scaling numerical features using StandardScaler. Additionally, I employed OneHotEncoder to transform categorical data into a numerical format, ensuring the models could effectively process the information.
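A preprocessing setup of this kind might look like the sketch below, which uses scikit-learn's `Pipeline` and `ColumnTransformer`. The feature names, the sample values, and the imputation strategies are assumptions chosen for illustration; the notebook may configure these differently.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Hypothetical sample with missing numeric values and one categorical column.
df = pd.DataFrame({
    "Age": [34.0, np.nan, 45.0, 29.0],
    "MonthlyIncome": [5000.0, 4200.0, np.nan, 3900.0],
    "Department": ["Sales", "R&D", "R&D", "HR"],
})
numeric_cols = ["Age", "MonthlyIncome"]
categorical_cols = ["Department"]

# Numeric features: fill missing values, then standardize to zero mean and unit variance.
numeric_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="median")),
    ("scaler", StandardScaler()),
])

# Categorical features: fill missing values, then one-hot encode.
categorical_pipeline = Pipeline([
    ("imputer", SimpleImputer(strategy="most_frequent")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])

# Apply each pipeline to its own columns; sparse_threshold=0 forces a dense array.
preprocessor = ColumnTransformer(
    [
        ("num", numeric_pipeline, numeric_cols),
        ("cat", categorical_pipeline, categorical_cols),
    ],
    sparse_threshold=0,
)

X = preprocessor.fit_transform(df)
print(X.shape)
```

Keeping the imputer and scaler inside a pipeline means they are fitted on the training data only when the pipeline is later combined with a model and cross-validation, which avoids leaking test-set statistics.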

Correlation Analysis and Feature Selection

To understand how the various attributes relate to employee attrition, I conducted a correlation analysis. This helped me identify the features most strongly correlated with attrition, which in turn guided feature selection for the predictive model. Because correlation does not establish causation, I treated these results as indicators of which features to keep rather than as proof of what drives attrition.
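As an illustration, a correlation ranking against the target can be computed with pandas as in the sketch below. The data is made up and the column names are assumptions; the idea is simply to encode attrition as 0/1 and rank numeric features by the absolute value of their correlation with it.

```python
import pandas as pd

# Hypothetical sample; real values would come from the prepared dataset.
df = pd.DataFrame({
    "Age": [25, 32, 41, 29, 50, 36],
    "AvgWorkingHours": [10.5, 7.2, 7.0, 9.8, 6.9, 7.5],
    "YearsAtCompany": [1, 5, 10, 2, 20, 7],
    "Attrition": ["Yes", "No", "No", "Yes", "No", "No"],
})

# Encode the target as 0/1 so it can take part in the correlation matrix.
df["AttritionFlag"] = (df["Attrition"] == "Yes").astype(int)

# Pearson correlation of every numeric feature with the target.
correlations = (
    df.select_dtypes("number").corr()["AttritionFlag"].drop("AttritionFlag")
)

# Rank features by strength of correlation, regardless of sign.
ranked = correlations.reindex(correlations.abs().sort_values(ascending=False).index)
print(ranked)
```

Pearson correlation only captures linear relationships, so a feature with a low score may still be useful to a non-linear model such as a random forest.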

Model Training and Evaluation

I then trained and evaluated multiple machine learning models, including Logistic Regression, Perceptron, a linear classifier trained with Stochastic Gradient Descent, and a Random Forest Classifier. To improve model performance, I tuned the hyperparameters using GridSearchCV. I assessed each model's predictive capability using the confusion matrix together with recall and precision, which are more informative than accuracy alone when one class (employees who leave) is much smaller than the other.
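The tuning and evaluation loop can be sketched as follows for one of the models. This is an illustrative example on synthetic data generated with `make_classification`, not the project's dataset; the parameter grid, the choice of recall as the scoring metric, and the class imbalance are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, precision_score, recall_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic, imbalanced binary data standing in for the attrition dataset.
X, y = make_classification(
    n_samples=400, n_features=10, weights=[0.84, 0.16], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Small hypothetical grid; a real search would usually cover more values.
param_grid = {"n_estimators": [50, 100], "max_depth": [None, 5]}

# Cross-validated grid search, selecting the combination with the best recall.
search = GridSearchCV(
    RandomForestClassifier(random_state=42), param_grid, scoring="recall", cv=3
)
search.fit(X_train, y_train)

# Evaluate the best model on the held-out test set.
y_pred = search.predict(X_test)
cm = confusion_matrix(y_test, y_pred)
precision = precision_score(y_test, y_pred, zero_division=0)
recall = recall_score(y_test, y_pred, zero_division=0)

print("Best parameters:", search.best_params_)
print("Confusion matrix:\n", cm)
print("Precision:", precision, "Recall:", recall)
```

The same pattern applies to `LogisticRegression`, `Perceptron`, and `SGDClassifier` by swapping the estimator and its parameter grid. Scoring on recall favors catching employees who are likely to leave, at the cost of more false alarms.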

Conclusion

Completing this AI project was a significant accomplishment during my fourth year at CESI EXIA. It allowed me to apply my data science and machine learning skills to address a real-world problem faced by organizations. The predictive model I developed could provide valuable insights to employers, enabling them to take proactive measures to retain talented employees and reduce attrition rates.