The goal of this project is to build a machine learning classification model that predicts whether an employee's annual income is above or below 50K. An individual’s annual income results from various factors. Intuitively, it is influenced by the individual’s education level, age, gender, occupation, and so on.
An app was built using Streamlit and deployed on Heroku. The app lets a user enter the listed features and select an algorithm to run, then shows the income class predicted by the chosen classification model.
A pipeline was built to collect the data, train machine learning models to predict whether an individual's income exceeds 50K, and deploy an app to show the results.
The dataset contains 16 columns.
- Target field: income. The income is divided into two classes: <=50K and >50K.
- Number of attributes: 14. These are the demographics and other features that describe a person.
Attribute Information:
- age
- workclass
- fnlwgt
- education
- educational-num
- marital-status
- occupation
- relationship
- race
- gender
- capital-gain
- capital-loss
- hours-per-week
- native-country
- income
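As a rough illustration of how columns like these could be prepared for modeling, the sketch below encodes the income classes as a binary target and one-hot encodes the categorical features with Pandas. The sample rows are made up, only a subset of the columns is used, and the `target` column name is an assumption; the project's actual cleaning steps may differ.

```python
import pandas as pd

# Hypothetical sample rows using a subset of the listed columns.
# The values are invented for illustration, not taken from the real dataset.
rows = [
    {"age": 39, "workclass": "Private", "education": "Bachelors",
     "hours-per-week": 40, "income": "<=50K"},
    {"age": 52, "workclass": "Self-emp", "education": "Masters",
     "hours-per-week": 50, "income": ">50K"},
    {"age": 28, "workclass": "Private", "education": "HS-grad",
     "hours-per-week": 35, "income": "<=50K"},
]
df = pd.DataFrame(rows)

# Encode the two income classes as a binary target: 1 for >50K, 0 for <=50K.
# "target" is an assumed column name, not one from the dataset.
df["target"] = (df["income"].str.strip() == ">50K").astype(int)

# One-hot encode the categorical features; numeric columns pass through unchanged.
features = pd.get_dummies(
    df.drop(columns=["income", "target"]),
    columns=["workclass", "education"],
)

print(features.columns.tolist())
print(df["target"].tolist())
```

One-hot encoding is only one option here; a tree-based model could also work with ordinal codes for the categorical columns.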
Machine Learning classification algorithms:
- Logistic Regression
- Decision Tree
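A minimal sketch of how the two algorithms above might be trained and compared with scikit-learn follows. It uses synthetic stand-in data with two numeric features and a made-up labelling rule, so the scores say nothing about the real dataset; the hyperparameters (`max_iter`, `max_depth`) and the train/test split are assumptions, not the project's settings.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for two numeric features (age, hours-per-week).
rng = np.random.default_rng(0)
n = 400
age = rng.integers(18, 70, size=n)
hours = rng.integers(10, 70, size=n)
X = np.column_stack([age, hours])

# Made-up labelling rule, only so the models have a signal to learn.
y = ((age + hours + rng.normal(0, 5, size=n)) > 85).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Logistic regression benefits from scaled inputs; the tree does not need them.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(max_depth=4, random_state=0),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy = {scores[name]:.3f}")
```

Keeping the models in a dictionary keyed by display name maps naturally onto an app where the user selects which algorithm to run.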
The following tools were used in this project:
- SQL, Python & Pandas to clean, explore and generate the final modeling data.
- Matplotlib and Seaborn to generate visualizations.
- scikit-learn (SKLearn) to build machine learning classification models and measure their metrics.
- Streamlit to develop the app.
- Heroku to deploy the app.
- Docker to create a smooth pipeline.
The findings and slide deck accompanying this project's presentation are accessible in this GitHub repository.