Assignment

Step 1- EDA

1. Check the data types of the features and whether any feature contains NaN values.
2. Compute descriptive statistics that summarize the central tendency, dispersion, and shape of the dataset's distribution.
3. Check for multicollinearity.
4. Check the distribution of the features.
5. Check whether the features contain outliers.
6. Check the frequency of the target variable to determine whether this is an imbalanced classification problem.
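The checks above can be sketched with pandas. This is a minimal illustration on synthetic data, not the assignment's actual code: the column names (`f1`, `f2`, `f3`, `target`) and the 1.5 * IQR outlier rule are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the assignment data (hypothetical column names).
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "f1": rng.normal(size=200),
    "f2": rng.normal(size=200),
})
df["f3"] = 2 * df["f1"] + rng.normal(scale=0.01, size=200)  # nearly collinear with f1
df["target"] = (rng.random(200) < 0.1).astype(int)          # imbalanced binary target
df.loc[0, "f2"] = np.nan                                     # one missing value

features = df.drop(columns="target")

# 1. Data types and missing values per column
dtypes = df.dtypes
nan_counts = df.isna().sum()

# 2. Descriptive statistics (count, mean, std, quartiles, min, max)
summary = df.describe()

# 3. Multicollinearity: absolute pairwise Pearson correlations
corr = features.corr().abs()

# 4. Shape of each feature's distribution, summarized by skewness
skewness = features.skew()

# 5. Outliers per feature, using the 1.5 * IQR rule (an assumed criterion)
q1, q3 = features.quantile(0.25), features.quantile(0.75)
iqr = q3 - q1
outlier_counts = ((features < q1 - 1.5 * iqr) | (features > q3 + 1.5 * iqr)).sum()

# 6. Relative frequency of each target class
class_freq = df["target"].value_counts(normalize=True)
```

A correlation close to 1 between two features (here `f1` and `f3`) and a strongly uneven `class_freq` are the signals that motivate the preprocessing in Step 2.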

Step 2- Preprocessing

Remove the highly correlated features.
Remove the outliers from the features.
Scale the features.
Apply an oversampling technique to address the class imbalance.
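One possible way to carry out these four steps is sketched below on synthetic data. The specifics are assumptions, since this README does not name them: a 0.9 correlation threshold, the 1.5 * IQR rule for outliers, `StandardScaler` for scaling, and random oversampling through `sklearn.utils.resample` standing in for whichever oversampling technique was actually used.

```python
import numpy as np
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

# Synthetic stand-in data (hypothetical column names).
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({"f1": rng.normal(size=n), "f2": rng.normal(size=n)})
df["f3"] = 2 * df["f1"] + rng.normal(scale=0.01, size=n)  # nearly collinear with f1
df["target"] = (rng.random(n) < 0.1).astype(int)          # imbalanced target

X, y = df.drop(columns="target"), df["target"]

# 1. Drop one feature from each highly correlated pair (assumed threshold: 0.9).
corr = X.corr().abs()
upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
to_drop = [col for col in upper.columns if (upper[col] > 0.9).any()]
X = X.drop(columns=to_drop)

# 2. Remove rows with any value outside 1.5 * IQR (assumed outlier rule).
q1, q3 = X.quantile(0.25), X.quantile(0.75)
iqr = q3 - q1
inlier = ((X >= q1 - 1.5 * iqr) & (X <= q3 + 1.5 * iqr)).all(axis=1)
X, y = X[inlier], y[inlier]

# 3. Scale the features to zero mean and unit variance.
X_scaled = pd.DataFrame(
    StandardScaler().fit_transform(X), columns=X.columns, index=X.index
)

# 4. Randomly oversample the minority class until both classes are the same size.
data = X_scaled.assign(target=y)
counts = data["target"].value_counts()
majority = data[data["target"] == counts.idxmax()]
minority = data[data["target"] == counts.idxmin()]
minority_upsampled = resample(
    minority, replace=True, n_samples=len(majority), random_state=0
)
balanced = pd.concat([majority, minority_upsampled])
```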

Step 3- Splitting, Modeling and Feature Importance

Split the dataset into training and test sets with a 4:1 ratio.
Fit a Random Forest model.
List the top 10 most important features using the model's feature importances.
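A minimal sketch of this step with scikit-learn follows. The data comes from `make_classification` purely for illustration, and the feature names, `n_estimators=100`, the fixed `random_state`, and the stratified split are assumptions rather than settings taken from the assignment.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with hypothetical feature names f0..f19.
X_arr, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, random_state=0
)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(20)])

# A 4:1 split means 80% training data and 20% test data.
# Stratifying on y is an assumption; it keeps class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Fit the Random Forest on the training portion only.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(X_train, y_train)

# Impurity-based importances, one value per feature, summing to 1.
importances = pd.Series(rf.feature_importances_, index=X.columns)
top10 = importances.sort_values(ascending=False).head(10).index.tolist()
```

`top10` then holds the column names used to narrow the dataset in the next step.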

Step 4- Final Modeling

Select the top 10 most important features from train_dataset.
Apply the preprocessing steps to the selected dataset.
Split the dataset with a 4:1 ratio.
Train a Random Forest on the training data.
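The final model could be assembled roughly as follows. This sketch makes several assumptions: `top10` is a hypothetical hard-coded list standing in for the ranking from Step 3, the data is synthetic, and only scaling is shown as preprocessing. It also departs slightly from the order listed above by wrapping the scaler and the model in a scikit-learn `Pipeline`, so the scaler is fitted on the training split only.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data with hypothetical feature names f0..f19.
X_arr, y = make_classification(
    n_samples=1000, n_features=20, n_informative=5, random_state=0
)
X = pd.DataFrame(X_arr, columns=[f"f{i}" for i in range(20)])

# Hypothetical placeholder for the top-10 list produced in Step 3.
top10 = [f"f{i}" for i in range(10)]
X_selected = X[top10]

# 4:1 split of the reduced dataset.
X_train, X_test, y_train, y_test = train_test_split(
    X_selected, y, test_size=0.2, random_state=0
)

# Scaling and the Random Forest are fitted together on the training data.
final_model = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=100, random_state=0),
)
final_model.fit(X_train, y_train)

# Mean accuracy on the held-out 20%.
test_accuracy = final_model.score(X_test, y_test)
```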

Step 5- Evaluation

The model was evaluated using the following metrics:

ROC_AUC Score
Confusion Matrix
Precision
Recall
F1-Score
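All five metrics are available in `sklearn.metrics`. The sketch below computes them for a Random Forest trained on synthetic data; the dataset and model settings are illustrative assumptions, not the assignment's results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic binary classification data as a stand-in.
X, y = make_classification(n_samples=1000, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

y_pred = model.predict(X_test)               # hard class labels
y_score = model.predict_proba(X_test)[:, 1]  # probability of the positive class

# ROC AUC is computed from scores; the other metrics use the predicted labels.
roc_auc = roc_auc_score(y_test, y_score)
cm = confusion_matrix(y_test, y_pred)        # rows: true class, columns: predicted class
precision = precision_score(y_test, y_pred)  # TP / (TP + FP)
recall = recall_score(y_test, y_pred)        # TP / (TP + FN)
f1 = f1_score(y_test, y_pred)                # harmonic mean of precision and recall
```

On an imbalanced problem, precision, recall, and F1 for the minority class are usually more informative than accuracy, which is why they are listed here alongside ROC AUC.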

Step 6- Hyperparameter Optimization

Tried different parameters using Grid Search to check whether the model's performance could be improved.
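A grid search of this kind can be sketched with `GridSearchCV`. The parameter grid, the `roc_auc` scoring choice, the 3-fold cross-validation, and the synthetic data are all assumptions for illustration; the values actually searched in the assignment are not stated here.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic binary classification data as a stand-in.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Hypothetical grid: every combination is evaluated with cross-validation.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="roc_auc",
    cv=3,
)
search.fit(X_train, y_train)

best_params = search.best_params_        # best combination found on the training folds
best_cv_auc = search.best_score_         # mean cross-validated ROC AUC for that combination
test_auc = search.score(X_test, y_test)  # ROC AUC of the refitted best model on the test set
```

Comparing `test_auc` with the score of the untuned model from Step 5 shows whether the search improved performance.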