mrinal1704

Senior Data Scientist at Capital One

Capital One, New Delhi

Pinned Repositories

awesome-readme
A curated list of awesome READMEs
Credit-Card-Application-Fraud-Detection-using-Supervised-machine-learning-models
The dataset for this project contained application (identity) fraud cases. It posed a supervised problem because it included a column with each application’s fraud label (whether the application was fraudulent or not). It also contained several identifying fields about the applicant, such as SSN, address, and phone number. The dataset had 1,000,000 records and 10 data fields. We first described and visualized each of the 10 fields and treated all frivolous values. Then we created 634 candidate variables and used feature selection to reduce them to 30. Finally, we used several machine learning algorithms (both linear and nonlinear) to predict fraudulent application records.
Language: Jupyter Notebook
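The description above does not say which feature selection method reduced the 634 candidate variables to 30. As one possible illustration, the sketch below ranks variables with a univariate filter based on the two-sample Kolmogorov-Smirnov (KS) statistic and keeps the top k. The choice of KS, the function names `ks_statistic` and `select_top_k`, and the toy data are all assumptions made for this example, not details taken from the repository.

```python
from bisect import bisect_right


def ks_statistic(goods, bads):
    """Two-sample KS statistic: the largest gap between the empirical CDFs
    of the non-fraud (goods) and fraud (bads) values of one variable."""
    goods, bads = sorted(goods), sorted(bads)
    best = 0.0
    for x in set(goods) | set(bads):
        cdf_good = bisect_right(goods, x) / len(goods)
        cdf_bad = bisect_right(bads, x) / len(bads)
        best = max(best, abs(cdf_good - cdf_bad))
    return best


def select_top_k(columns, labels, k):
    """Rank candidate variables by KS and keep the k best (hypothetical helper).

    columns: dict mapping variable name -> list of numeric values
    labels:  list of 0/1 fraud labels aligned with those values
    """
    scores = {}
    for name, values in columns.items():
        goods = [v for v, y in zip(values, labels) if y == 0]
        bads = [v for v, y in zip(values, labels) if y == 1]
        scores[name] = ks_statistic(goods, bads)
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:k], scores


# Toy data, made up for illustration: "velocity" separates the two classes,
# "noise" has the same distribution in both.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
columns = {
    "velocity": [1, 2, 1, 2, 8, 9, 7, 9],
    "noise": [5, 3, 4, 6, 5, 3, 4, 6],
}
selected, scores = select_top_k(columns, labels, k=1)
print(selected, scores)
```

A filter like this is cheap enough to run over hundreds of candidates. A wrapper method (for example, stepwise selection around a model) could then trim the shortlist further, though whether the project did so is not stated.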
Credit-Card-Transaction-Fraud-Detection-using-Supervised-Machine-learning-with-an-Imbalanced-dataset
Credit card fraud is a burden for organizations across the globe. According to shiftprocessing.com, $24.26 billion was lost to credit card fraud worldwide in 2018. In this project, our goal was to build an effective and efficient model to predict fraud. We analyzed a real-world dataset of government-related credit card transactions from the 2010 calendar year. The data posed a supervised problem because it included a column with each transaction’s fraud label (whether the transaction was fraudulent or not). It also contained identifying information about each transaction, such as the credit card number, merchant, and merchant state. The dataset had 96,753 records and 10 data fields. We first described and visualized each of the 10 fields, cleaned the dataset, and filled in missing values. Then we created many variables and performed feature selection. Finally, we built a variety of machine learning models (both linear and nonlinear) and summarized our results.
Language: Jupyter Notebook
Disaster-Response-Pipeline-using-Natural-language-processing
This project contains a web app that takes a message from a person in danger during a disaster and classifies it into a category such as aid-related, weather-related, fire, and many more, using natural language processing and an AdaBoost classifier.
Language: Jupyter Notebook
Dog-Breed-Classifier-using-CNN
This repo contains a dog breed classifier built with deep learning. If a dog is detected in the image, the algorithm estimates the dog's breed. If a human is detected, it estimates the dog breed that the person most resembles.
Language: Jupyter Notebook
ETL-Pipelines
Contains machine learning pipeline and ETL pipeline notebooks for practicing and learning how they work.
Language: Jupyter Notebook
My-First-Python-Package
This repo contains the files of a Python package I developed, which automates the task of getting basic insights from data.
Language: Python
NYC-Property-Tax-Record-Fraud-Detection-using-Unsupervised-learning-models
This project analyzes New York City’s (NYC’s) real estate data to identify property tax fraud. The main indicators of property tax fraud were property tax assessments that were too high or too low. Given a property dataset of 1,070,994 records and 32 data fields, we first described, visualized, and filled in missing values for each variable. Second, we created 45 additional variables to make the algorithm as accurate as possible. Next, we used dimensionality reduction techniques to refine the dataset. Finally, we used principal component analysis (PCA) and an autoencoder to obtain two separate fraud scores. The scores were combined and then ranked to get a final fraud score.
Language: Jupyter Notebook
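The description above says the PCA-based and autoencoder-based fraud scores were combined and ranked, but not how. One common approach, shown here purely as an assumption, is to convert each score to a rank (so the two scales become comparable) and average the ranks. The function names `rank_scale` and `combine_scores` and the sample numbers are hypothetical.

```python
def rank_scale(scores):
    """Map raw scores to ranks in 1..n (1 = least anomalous).
    Tied scores share the average of the ranks they span."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole group of equal scores.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average 1-based rank of the tied group
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def combine_scores(pca_scores, autoencoder_scores):
    """Average the two rank-scaled scores (assumed combination rule)."""
    r1 = rank_scale(pca_scores)
    r2 = rank_scale(autoencoder_scores)
    return [(a + b) / 2 for a, b in zip(r1, r2)]


# Made-up scores for four properties; higher means more anomalous.
pca_scores = [0.2, 3.5, 0.9, 0.9]
ae_scores = [0.01, 0.40, 0.05, 0.30]

final = combine_scores(pca_scores, ae_scores)
# Record indices ordered from most to least suspicious.
most_suspicious = sorted(range(len(final)), key=lambda i: final[i], reverse=True)
print(final, most_suspicious)
```

Rank averaging ignores how far apart the raw scores are, which keeps one method's scale from dominating the other. If the raw magnitudes matter, standardizing each score before averaging would be an alternative.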
Recommendation-Engine-with-IBM
This repo contains my first hands-on experience developing a recommendation engine, using an IBM Watson Studio dataset. The goal is to recommend articles to users with the various types of recommendation engines I studied while pursuing my Data Science Nanodegree from Udacity.
Language: Jupyter Notebook
SQL-Leetcode-Challenge
Contains all 117 LeetCode questions with their solutions in MySQL, ranging from Easy to Hard.
Language: TSQL

mrinal1704's Repositories

mrinal1704/SQL-Leetcode-Challenge
mrinal1704/Credit-Card-Transaction-Fraud-Detection-using-Supervised-Machine-learning-with-an-Imbalanced-dataset
mrinal1704/My-First-Python-Package
mrinal1704/Dog-Breed-Classifier-using-CNN
mrinal1704/NYC-Property-Tax-Record-Fraud-Detection-using-Unsupervised-learning-models
mrinal1704/awesome-readme
mrinal1704/Credit-Card-Application-Fraud-Detection-using-Supervised-machine-learning-models
mrinal1704/ETL-Pipelines
mrinal1704/Job_Salary_Prediction
A web application that predicts the salary of a job posting from job description and location using machine learning algorithms.
Language: Python
mrinal1704/python_scraping
clone: https://github.com/REMitchell/python-scraping
mrinal1704/Disaster-Response-Pipeline-using-Natural-language-processing
mrinal1704/Recommendation-Engine-with-IBM
mrinal1704/AIPND
Code and associated files for the AI Programming with Python Nanodegree Program
Language: Jupyter Notebook
mrinal1704/course-collaboration-travel-plans
mrinal1704/INF-552
Homework submission for INF 552 (Machine Learning for Data Science)
mrinal1704/Job-Skills-Extraction
mrinal1704/machine-learning
Content for Udacity's Machine Learning curriculum
mrinal1704/mlflow-spark-summit-2019
MLFlow Spark Summit 2019 Presentation
mrinal1704/salary-predictor
Deep learning model using NLP to predict job salary based on Indeed job postings
mrinal1704/SaralDyeChems-FE
mrinal1704/size-limit
Calculate the real cost to run your JS app or lib to keep good performance. Show error in pull request if the cost exceeds the limit.
mrinal1704/tensorflow
An Open Source Machine Learning Framework for Everyone
Language: C++
mrinal1704/usc-dso-570
Course material for the class DSO 570: The Analytics Edge (Data, Models and Effective Decisions) at USC Marshall School of Business
