sawankumar94

Data Scientist

Mumbai, India

Pinned Repositories

-AI-CL-688-Course-Project
Contains the code for the course project.
Language: R
Coursera-Course-Certificates
Creating-a-Credit-Scoring-Model-to-obtain-the-probability-of-default
We have baseline and loan performance information for approximately 6,000 loans. The target variable (BAD) is binary and indicates whether an applicant eventually defaulted or became seriously delinquent. Twelve variables are recorded for each applicant. Given this information, we want a predictive model that outputs the probability of default. The model should be interpretable and statistically sound so that we can give reasons for rejections.
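The description does not say which model the repository uses. As a minimal sketch of an interpretable model that outputs a probability of default, the example below assumes logistic regression fitted by gradient descent. The feature names and the toy data are hypothetical and are not taken from the loan dataset.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights (bias first) by batch gradient descent on the log loss."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi)))
            err = p - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def predict_proba(w, x):
    """Probability that BAD = 1 for one applicant."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

# Hypothetical features: [debt_to_income, delinquent_accounts]
X = [[0.1, 0], [0.2, 0], [0.3, 1], [0.6, 2], [0.7, 3], [0.9, 4]]
y = [0, 0, 0, 1, 1, 1]  # BAD flag

weights = fit_logistic(X, y)
low_risk = predict_proba(weights, [0.15, 0])
high_risk = predict_proba(weights, [0.8, 3])
print("weights (bias, debt_to_income, delinquent_accounts):", weights)
print("P(default), low-risk applicant:", low_risk)
print("P(default), high-risk applicant:", high_risk)
```

Each fitted weight is the change in the log-odds of default per unit of its feature, which is what makes it possible to state a reason for a rejection.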
flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Language: Python
Key-Frames-Extraction-from-Video
Extracts key frames from a video using color histograms, SVD, and a dynamic clustering method. The analysis can be used to identify the frames that make up a shot. The code is well documented.
Language: Jupyter Notebook
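A rough sketch of the histogram and dynamic clustering steps follows. It is not the repository's code: the SVD step is left out, and the cosine-similarity threshold, the bin count, and the choice of the first frame of each cluster as the key frame are all assumptions made for illustration. Frames are represented as plain lists of (r, g, b) pixels.

```python
import math

def color_histogram(frame, bins=4):
    """Per-channel histogram of a frame, each channel normalized to sum to 1."""
    hist = [0.0] * (3 * bins)
    for pixel in frame:
        for channel, value in enumerate(pixel):
            hist[channel * bins + min(value * bins // 256, bins - 1)] += 1
    return [h / len(frame) for h in hist]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def key_frames(frames, threshold=0.9):
    """Group consecutive similar frames; return the first frame index of each group."""
    clusters = []  # each cluster: {"centroid": [...], "members": [frame indices]}
    for idx, frame in enumerate(frames):
        feat = color_histogram(frame)
        if clusters and cosine(clusters[-1]["centroid"], feat) >= threshold:
            c = clusters[-1]
            n = len(c["members"])
            # Running mean keeps the centroid up to date as the cluster grows.
            c["centroid"] = [(m * n + f) / (n + 1) for m, f in zip(c["centroid"], feat)]
            c["members"].append(idx)
        else:
            clusters.append({"centroid": feat, "members": [idx]})
    return [c["members"][0] for c in clusters]

red = [(250, 10, 10)] * 16
blue = [(10, 10, 250)] * 16
video = [red, red, red, blue, blue]
print(key_frames(video))  # [0, 3]
```

The clustering is "dynamic" in the sense that the number of clusters is not fixed in advance: a new cluster starts whenever a frame is too dissimilar from the current one.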
ML-Model-to-identify-Churning-Customer-
The challenge is to obtain a ten-fold cross-validation AUC score above 0.893 on telecom data with 'Churn' as the target variable.
Language: R
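For readers unfamiliar with the metric, here is a small sketch of how a ten-fold cross-validation AUC score can be computed. The repository itself is written in R; this Python version, the round-robin fold assignment, and the threshold "model" are illustrative assumptions only. A real evaluation would normally shuffle and stratify the folds.

```python
def auc(labels, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def k_fold_indices(n, k=10):
    """Yield (train, test) index lists; sample i is assigned to fold i % k."""
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        yield train, test

def cross_val_auc(X, y, fit, score, k=10):
    """Mean AUC over k held-out folds."""
    fold_aucs = []
    for train, test in k_fold_indices(len(X), k):
        model = fit([X[i] for i in train], [y[i] for i in train])
        preds = [score(model, X[i]) for i in test]
        fold_aucs.append(auc([y[i] for i in test], preds))
    return sum(fold_aucs) / k

# Hypothetical one-feature data: larger values mean more likely to churn.
X = [float(i) for i in range(40)]
y = [1 if i >= 20 else 0 for i in range(40)]

def fit(X_train, y_train):
    # Toy "model": midpoint between the two class means.
    pos = [x for x, l in zip(X_train, y_train) if l == 1]
    neg = [x for x, l in zip(X_train, y_train) if l == 0]
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2

def score(threshold, x):
    return x - threshold

print(auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
print(cross_val_auc(X, y, fit, score))           # 1.0
```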
Practical-Machine-Learning-Course-Project
Course project for Practical Machine Learning.
Language: HTML
Sentiment-Analysis-of-Twitter-Data-using-DTM-SVD-and-ML
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. My approach is to clean the tweets, correct spelling, lemmatize, remove stop words, create a document-term matrix (since all frequent words have already been removed), reduce dimensionality, and finally fit an ML algorithm. These approaches are fairly naive. With them I reached a 10-fold cross-validation AUC score of 0.775.
Language: Jupyter Notebook
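The document-term matrix step in the pipeline above can be sketched as follows. The stop-word list and the tweets are made up for illustration; spelling correction, lemmatization, and the dimensionality-reduction step are omitted, and none of this is the repository's actual code.

```python
from collections import Counter

STOP_WORDS = {"the", "is", "a", "this", "i"}  # tiny illustrative list

def tokenize(text):
    """Lowercase, split on whitespace, keep alphabetic non-stop words."""
    return [w for w in text.lower().split() if w.isalpha() and w not in STOP_WORDS]

def document_term_matrix(docs):
    """Return (vocabulary, rows), where rows[d][t] counts term t in document d."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({w for doc in tokenized for w in doc})
    rows = []
    for doc in tokenized:
        counts = Counter(doc)
        rows.append([counts[w] for w in vocab])
    return vocab, rows

tweets = ["I love this phone", "This phone is bad bad", "love the camera"]
vocab, dtm = document_term_matrix(tweets)
print(vocab)  # ['bad', 'camera', 'love', 'phone']
for row in dtm:
    print(row)
```

Each row of the matrix is the feature vector for one tweet; a truncated SVD of this matrix would then give the lower-dimensional representation that the classifier is fitted on.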
Sentiment-Analysis-of-Twitter-Data-using-Pre-trained-Vector-and-ML-Algo
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. After basic cleaning and spelling correction, I used pre-trained GloVe vectors to get a 200-dimensional representation of each word in a tweet that appears in the GloVe vocabulary. I then summed the matching vectors to obtain a 200-dimensional feature vector for each tweet. Finally, I fitted a Random Forest and obtained a 10-fold cross-validation AUC score of 0.793.
Language: Jupyter Notebook
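The vector-summing step can be illustrated with a short sketch. The three-dimensional toy embeddings below stand in for the 200-dimensional GloVe vectors and are invented for the example; the function name is hypothetical as well.

```python
def tweet_vector(tokens, embeddings, dim):
    """Sum the embeddings of the tokens that appear in the vocabulary."""
    vec = [0.0] * dim
    for tok in tokens:
        word_vec = embeddings.get(tok)
        if word_vec is None:
            continue  # out-of-vocabulary words contribute nothing
        vec = [a + b for a, b in zip(vec, word_vec)]
    return vec

# Toy stand-in for a GloVe lookup table (real vectors would have 200 entries).
toy_embeddings = {
    "good": [1.0, 0.0, 2.0],
    "movie": [0.5, 1.0, 0.0],
}

features = tweet_vector(["good", "movie", "zzz"], toy_embeddings, dim=3)
print(features)  # [1.5, 1.0, 2.0]
```

A tweet with no matching words maps to the zero vector, so every tweet gets a feature vector of the same length regardless of how many of its words are in the vocabulary.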
Sentiment-Analysis-of-Twitter-Data-using-Pre-trained-Vector-and-Neural-Network
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. After basic cleaning and spelling correction, I used pre-trained GloVe vectors to get a 200-dimensional representation of each word in a tweet that appears in the GloVe vocabulary. I then summed the matching vectors to obtain a 200-dimensional feature vector for each tweet. Finally, I fitted a neural network with one hidden layer and obtained a 10-fold cross-validation AUC score of 0.81.
Language: Jupyter Notebook
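As a sketch of what a network with one hidden layer computes for a single tweet vector, the forward pass below maps features to a sentiment probability. The description does not give the layer size or activations, so the ReLU hidden layer, the sigmoid output, and the hand-picked weights are assumptions; training is not shown.

```python
import math

def forward(x, W1, b1, w2, b2):
    """One hidden ReLU layer followed by a sigmoid output unit."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, b1)
    ]
    logit = sum(w * h for w, h in zip(w2, hidden)) + b2
    return 1.0 / (1.0 + math.exp(-logit))

# Hand-picked toy weights: 2 inputs, 2 hidden units, 1 output.
W1 = [[1.0, 0.0], [0.0, -1.0]]
b1 = [0.0, 0.0]
w2 = [2.0, 3.0]
b2 = -2.0

prob = forward([1.0, 2.0], W1, b1, w2, b2)
print(prob)  # 0.5
```

With the 200-dimensional tweet vectors from the description, `W1` would have one row of 200 weights per hidden unit.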
