Machine Learning & Data Analysis Implementations

Machine-Learning-POCs

  1. Fraud Detection Data Analysis -

Fraud detection dataset project that aims to apply various data analysis and machine learning modeling techniques. The notebook contains the following components:

  • Data Glimpse - Distributions of some of the continuous variables, target variable distribution alongside other variables, and other EDA.
  • Data Preparation - Balancing the imbalanced dataset, data chunking.
  • Data Preprocessing - Missing value treatment, outlier detection and capping, feature engineering and data encoding.
  • Data Modeling - Baseline Random Forest, AdaBoost (ensemble), Linear SVC. Evaluation measures - ROC curve with AUC, Classification Matrix.

Model inferences and conclusions are given in the end notes of the notebook.
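
The balancing and outlier-capping steps listed above can be sketched roughly as follows. This is a minimal illustration, not the notebook's code: it assumes random undersampling of the majority class and IQR-based capping, neither of which is confirmed by this README, and the names `undersample`, `cap_outliers`, `amount`, and `is_fraud` are hypothetical.

```python
import random
import statistics


def undersample(rows, label_key, seed=0):
    """Randomly drop rows until every class has as many rows as the rarest one.

    Assumption: random undersampling; the notebook may balance differently.
    """
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    n_min = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, n_min))
    rng.shuffle(balanced)
    return balanced


def cap_outliers(values, k=1.5):
    """Clip values to the range [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: IQR fences; the notebook may use other capping limits.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]


# Synthetic toy data (hypothetical column names): 10% of rows are fraud.
rows = [{"amount": float(i), "is_fraud": 1 if i % 10 == 0 else 0}
        for i in range(100)]
balanced = undersample(rows, "is_fraud")
print(len(balanced))  # 20 rows: 10 fraud, 10 non-fraud

capped = cap_outliers([1, 2, 3, 4, 5, 6, 7, 8, 1000])
print(capped[-1])  # the extreme value 1000 is pulled down to the upper fence
```

Undersampling discards majority-class rows, so it trades data volume for balance; oversampling the minority class is the usual alternative when the dataset is small.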

  2. Gradient Descent From Scratch -

Implementation of logistic regression trained with gradient descent, written from scratch, for predictions. Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a function; it is also known as steepest descent. The data can be changed in the "main.py" file, and the same file can be run to produce predictions.
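
To show the idea, here is a minimal sketch of batch gradient descent on the logistic (log loss) objective, using only the standard library. It is not the contents of "main.py": the function names `fit_logistic` and `predict`, the learning rate, the iteration count, and the toy data are all assumptions made for this example.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, n_iters=1000):
    """Fit weights and bias by batch gradient descent on the mean log loss."""
    n_samples, n_features = len(X), len(X[0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(n_iters):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Gradient of the log loss w.r.t. the linear output is (p - y).
            error = sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) - yi
            for j in range(n_features):
                grad_w[j] += error * xi[j]
            grad_b += error
        # Step against the averaged gradient.
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict(X, weights, bias, threshold=0.5):
    """Return 0/1 labels by thresholding the predicted probability."""
    return [
        1 if sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) >= threshold else 0
        for xi in X
    ]


# Toy, linearly separable data: negative inputs are class 0, positive are class 1.
X = [[-3.0], [-2.0], [-1.0], [1.0], [2.0], [3.0]]
y = [0, 0, 0, 1, 1, 1]
weights, bias = fit_logistic(X, y)
print(predict(X, weights, bias))  # [0, 0, 0, 1, 1, 1]
```

Each iteration moves the parameters a step of size `lr` against the gradient, which is the "steepest descent" direction; a learning rate that is too large can make the updates diverge, while one that is too small slows convergence.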