Pa355
I am a Data Scientist with recent experience in fintech. My formal education is in analytics. When I am not building models, I am at theatre rehearsal.
Drexel University
Princeton, NJ
Pinned Repositories
getting-started
Getting started with Docker
Kidney-Disease-Prediction
Chronic Kidney Disease Prediction using Logistic Regression
Loan-Delinquincy
Helps banks identify loan approval criteria for individual customers that minimize the likelihood of loan delinquency. Also derives the factors that drive loan delinquency.
MSBAnCapstone
Behavioral segmentation and forecasting based on natural gas consumption
Term-Deposit-Subscription-using-Hadoop---PySpark-
The goal is to predict whether a client will subscribe to a term deposit. The data set provides client details including age, marital status, education, account balance, loan and so on. Built a machine learning pipeline using StringIndexer for categorical variables, one-hot encoding for numerical variables, and VectorAssembler. Implemented Gradient Boosting, Random Forest and Logistic Regression models and compared their accuracy.
trustpage-data-challenge
This data science exercise was performed in Python 3 as part of the SDS challenge for Trustpage. The task is to develop an unsupervised method that can determine whether a movie review is positive or negative. The labels provided in the data set are used only to evaluate the method.
Pa355's Repositories
Pa355/Kidney-Disease-Prediction
Chronic Kidney Disease Prediction using Logistic Regression
Pa355/MSBAnCapstone
Behavioral segmentation and forecasting based on natural gas consumption
Pa355/Term-Deposit-Subscription-using-Hadoop---PySpark-
The goal is to predict whether a client will subscribe to a term deposit. The data set provides client details including age, marital status, education, account balance, loan and so on. Built a machine learning pipeline using StringIndexer for categorical variables, one-hot encoding for numerical variables, and VectorAssembler. Implemented Gradient Boosting, Random Forest and Logistic Regression models and compared their accuracy.
Pa355/getting-started
Getting started with Docker
Pa355/Loan-Delinquincy
Helps banks identify loan approval criteria for individual customers that minimize the likelihood of loan delinquency. Also derives the factors that drive loan delinquency.
Pa355/trustpage-data-challenge
This data science exercise was performed in Python 3 as part of the SDS challenge for Trustpage. The task is to develop an unsupervised method that can determine whether a movie review is positive or negative. The labels provided in the data set are used only to evaluate the method.