Pinned Repositories
awesome-readme
A curated list of awesome READMEs
Credit-Card-Application-Fraud-Detection-using-Supervised-machine-learning-models
The provided dataset contained application (identity) fraud cases. It was a supervised problem because the data included a column with each application’s fraud label (whether an application was fraudulent or not). It also contained several identifying data fields about the applicant, such as SSN, address, and phone number. The dataset had 1,000,000 records and 10 data fields. We first described and visualized each of the 10 data fields and treated all frivolous values. Then we created 634 candidate variables and performed feature selection to reduce them to 30. Finally, we used a few different machine learning algorithms (both linear and nonlinear) to predict fraudulent application records.
Credit-Card-Transaction-Fraud-Detection-using-Supervised-Machine-learning-with-an-Imbalanced-dataset
Credit card fraud is a burden for organizations across the globe. Specifically, $24.26 billion was lost to credit card fraud worldwide in 2018, according to shiftprocessing.com. In this project, our goal was to build an effective and efficient model to predict fraud. We analyzed a real-world dataset of government-related credit card transactions over the 2010 calendar year. The data presented a supervised problem because it included a column with each transaction’s fraud label (whether a transaction was fraudulent or not). It also contained identifying information about each transaction, such as the credit card number, merchant, and merchant state. The dataset had 96,753 records and 10 data fields. We first described and visualized each of the 10 data fields, cleaned the dataset, and filled in missing values. Then we created many variables and performed feature selection. Finally, we built a variety of machine learning models (both linear and nonlinear) and highlighted our results.
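The repository name mentions an imbalanced dataset. In that setting, plain accuracy can look strong for a model that never flags fraud. The following is a minimal sketch of that effect, not the project's evaluation code: the labels and predictions are made-up toy values, and `confusion_counts` and `summarize` are hypothetical helper names.

```python
from typing import Dict, Sequence, Tuple


def confusion_counts(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[int, int, int, int]:
    """Return (tp, fp, fn, tn) for binary labels where 1 means fraud."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def summarize(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Compute accuracy, precision, and recall; empty denominators give 0.0."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


# Toy data (illustrative only): 2 fraud cases among 100 transactions.
y_true = [1, 1] + [0] * 98
never_flags_fraud = [0] * 100  # a "model" that predicts no fraud at all

print(summarize(y_true, never_flags_fraud))
```

On this toy data the do-nothing model is right about 98 of 100 transactions yet catches no fraud, which is why recall and precision on the fraud class are more informative than accuracy here.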
Disaster-Response-Pipeline-using-Natural-language-processing
This project contains a web app that accepts a message from a user who is in danger during a disaster and classifies it into a category such as aid-related, weather-related, or fire, among many others, using natural language processing and an AdaBoost classifier.
Dog-Breed-Classifier-using-CNN
This repo contains a dog breed classifier built with deep learning. If a dog is detected in the image, the algorithm provides an estimate of the dog's breed. If a human is detected, it provides an estimate of the dog breed that the person most resembles.
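The branching described above can be sketched as follows. This is only an illustration of the decision logic, not the repository's code: `describe_image` and its three callable parameters are hypothetical names, and the detectors are passed in as plain functions so that the sketch runs without any deep learning library.

```python
from typing import Callable


def describe_image(
    image_path: str,
    is_dog: Callable[[str], bool],
    is_human: Callable[[str], bool],
    predict_breed: Callable[[str], str],
) -> str:
    """Apply the dog/human branching to a single image path."""
    if is_dog(image_path):
        return f"Dog detected; estimated breed: {predict_breed(image_path)}"
    if is_human(image_path):
        return f"Human detected; most resembling breed: {predict_breed(image_path)}"
    return "Neither a dog nor a human was detected."


# Stub detectors for demonstration only; a real system would call trained models.
message = describe_image(
    "photo.jpg",
    is_dog=lambda path: False,
    is_human=lambda path: True,
    predict_breed=lambda path: "Beagle",
)
print(message)
```

Passing the detectors in as arguments keeps the branching testable on its own, independent of whichever models are used.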
ETL-Pipelines
Contains machine learning pipeline and ETL pipeline notebooks for practicing and learning how they work.
My-First-Python-Package
This repo contains the files of a Python package I developed that automates getting basic insights from data.
NYC-Property-Tax-Record-Fraud-Detection-using-Unsupervised-learning-models
This project analyzes New York City’s (NYC’s) real estate data to identify property tax fraud. The main indicators of property tax fraud were property tax assessments that were too high or too low. Given a property dataset of 1,070,994 records and 32 data fields, we first described, visualized, and filled in missing values for each variable. Second, we created 45 additional variables to make the algorithm as accurate as possible. Next, we used dimensionality reduction techniques to refine our dataset. Finally, we used principal component analysis (PCA) and an autoencoder to obtain two separate fraud scores. The scores were combined and then ranked to get a final fraud score.
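The description says the two fraud scores were combined and ranked but does not say how. One common option, shown here purely as an assumption, is to convert each score to a rank order and average the two ranks. The function names `rank_order` and `combine_scores` and the toy scores are hypothetical.

```python
from typing import List, Sequence


def rank_order(scores: Sequence[float]) -> List[float]:
    """Rank scores from 1 (lowest) to n (highest); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the window over all records tied with the current score.
        while end + 1 < len(order) and scores[order[end + 1]] == scores[order[pos]]:
            end += 1
        average_rank = (pos + end) / 2 + 1
        for j in range(pos, end + 1):
            ranks[order[j]] = average_rank
        pos = end + 1
    return ranks


def combine_scores(
    pca_scores: Sequence[float], autoencoder_scores: Sequence[float]
) -> List[float]:
    """Average the rank orders of two scores (an assumed combination rule)."""
    if len(pca_scores) != len(autoencoder_scores):
        raise ValueError("both score lists must cover the same records")
    pca_ranks = rank_order(pca_scores)
    ae_ranks = rank_order(autoencoder_scores)
    return [(a + b) / 2 for a, b in zip(pca_ranks, ae_ranks)]


# Toy scores for four records (illustrative only, not project data).
pca_scores = [0.2, 3.1, 0.5, 9.7]
autoencoder_scores = [10.0, 40.0, 12.0, 35.0]

final_scores = combine_scores(pca_scores, autoencoder_scores)
# Record indices sorted from most to least suspicious under the combined score.
most_suspicious_first = sorted(
    range(len(final_scores)), key=lambda i: final_scores[i], reverse=True
)
print(final_scores, most_suspicious_first)
```

Averaging ranks rather than raw values avoids having to put the two scores on a common scale, which matters when they come from different methods.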
Recommendation-Engine-with-IBM
This repo contains my first hands-on experience developing a recommendation engine using an IBM Watson Studio dataset. The goal is to recommend articles to users using the various types of recommendation engines that I studied while pursuing my Data Science Nanodegree from Udacity.
SQL-Leetcode-Challenge
Contains all 117 LeetCode questions, ranging from Easy to Hard, with their solutions in MySQL.
mrinal1704's Repositories
mrinal1704/SQL-Leetcode-Challenge
Contains all 117 LeetCode questions, ranging from Easy to Hard, with their solutions in MySQL.
mrinal1704/Credit-Card-Transaction-Fraud-Detection-using-Supervised-Machine-learning-with-an-Imbalanced-dataset
Credit card fraud is a burden for organizations across the globe. Specifically, $24.26 billion was lost to credit card fraud worldwide in 2018, according to shiftprocessing.com. In this project, our goal was to build an effective and efficient model to predict fraud. We analyzed a real-world dataset of government-related credit card transactions over the 2010 calendar year. The data presented a supervised problem because it included a column with each transaction’s fraud label (whether a transaction was fraudulent or not). It also contained identifying information about each transaction, such as the credit card number, merchant, and merchant state. The dataset had 96,753 records and 10 data fields. We first described and visualized each of the 10 data fields, cleaned the dataset, and filled in missing values. Then we created many variables and performed feature selection. Finally, we built a variety of machine learning models (both linear and nonlinear) and highlighted our results.
mrinal1704/My-First-Python-Package
This repo contains the files of a Python package I developed that automates getting basic insights from data.
mrinal1704/Dog-Breed-Classifier-using-CNN
This repo contains a dog breed classifier built with deep learning. If a dog is detected in the image, the algorithm provides an estimate of the dog's breed. If a human is detected, it provides an estimate of the dog breed that the person most resembles.
mrinal1704/NYC-Property-Tax-Record-Fraud-Detection-using-Unsupervised-learning-models
This project analyzes New York City’s (NYC’s) real estate data to identify property tax fraud. The main indicators of property tax fraud were property tax assessments that were too high or too low. Given a property dataset of 1,070,994 records and 32 data fields, we first described, visualized, and filled in missing values for each variable. Second, we created 45 additional variables to make the algorithm as accurate as possible. Next, we used dimensionality reduction techniques to refine our dataset. Finally, we used principal component analysis (PCA) and an autoencoder to obtain two separate fraud scores. The scores were combined and then ranked to get a final fraud score.
mrinal1704/awesome-readme
A curated list of awesome READMEs
mrinal1704/Credit-Card-Application-Fraud-Detection-using-Supervised-machine-learning-models
The provided dataset contained application (identity) fraud cases. It was a supervised problem because the data included a column with each application’s fraud label (whether an application was fraudulent or not). It also contained several identifying data fields about the applicant, such as SSN, address, and phone number. The dataset had 1,000,000 records and 10 data fields. We first described and visualized each of the 10 data fields and treated all frivolous values. Then we created 634 candidate variables and performed feature selection to reduce them to 30. Finally, we used a few different machine learning algorithms (both linear and nonlinear) to predict fraudulent application records.
mrinal1704/ETL-Pipelines
Contains machine learning pipeline and ETL pipeline notebooks for practicing and learning how they work.
mrinal1704/Job_Salary_Prediction
A web application that predicts the salary of a job posting from job description and location using machine learning algorithms.
mrinal1704/python_scraping
Clone of https://github.com/REMitchell/python-scraping
mrinal1704/Disaster-Response-Pipeline-using-Natural-language-processing
This project contains a web app that accepts a message from a user who is in danger during a disaster and classifies it into a category such as aid-related, weather-related, or fire, among many others, using natural language processing and an AdaBoost classifier.
mrinal1704/Recommendation-Engine-with-IBM
This repo contains my first hands-on experience developing a recommendation engine using an IBM Watson Studio dataset. The goal is to recommend articles to users using the various types of recommendation engines that I studied while pursuing my Data Science Nanodegree from Udacity.
mrinal1704/AIPND
Code and associated files for the AI Programming with Python Nanodegree Program
mrinal1704/course-collaboration-travel-plans
mrinal1704/INF-552
Homework submission for INF 552 (Machine Learning for Data Science)
mrinal1704/Job-Skills-Extraction
mrinal1704/machine-learning
Content for Udacity's Machine Learning curriculum
mrinal1704/mlflow-spark-summit-2019
MLFlow Spark Summit 2019 Presentation
mrinal1704/salary-predictor
Deep learning model using NLP to predict job salary based on Indeed job postings
mrinal1704/SaralDyeChems-FE
mrinal1704/size-limit
Calculate the real cost to run your JS app or lib to keep good performance. Show error in pull request if the cost exceeds the limit.
mrinal1704/tensorflow
An Open Source Machine Learning Framework for Everyone
mrinal1704/usc-dso-570
Course material for the class DSO 570: The Analytics Edge (Data, Models and Effective Decisions) at USC Marshall School of Business