Pinned Repositories
-AI-CL-688-Course-Project
Contains code for the course project.
Coursera-Course-Certificates
Creating-a-Credit-Scoring-Model-to-obtain-the-probability-of-default
We have baseline and loan performance information for approximately 6,000 loans. The target variable (BAD) is binary and indicates whether an applicant eventually defaulted or became seriously delinquent. Twelve variables are recorded for each applicant. Given this information, we want a predictive model that outputs the probability of default. The model should be interpretable and statistically sound so that we can give reasons for rejections.
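The description does not name the model. As a minimal sketch, assuming a logistic regression (a common choice when a score must be interpretable), the probability of default and the reasons behind a rejection could be derived as follows. The coefficients and feature names are hypothetical and are not taken from the repository.

```python
import math

# Hypothetical coefficients for illustration only; a real model would be
# fitted to the loan data, and these feature names are assumptions.
INTERCEPT = -2.0
COEFFICIENTS = {
    "debt_to_income": 0.04,
    "delinquent_lines": 0.7,
    "years_on_job": -0.05,
}


def probability_of_default(applicant):
    """Return P(BAD = 1) from the logistic model's linear score."""
    score = INTERCEPT + sum(
        COEFFICIENTS[name] * applicant[name] for name in COEFFICIENTS
    )
    return 1.0 / (1.0 + math.exp(-score))


def rejection_reasons(applicant, top=2):
    """Rank features by how much they push the score toward default."""
    contributions = {
        name: COEFFICIENTS[name] * applicant[name] for name in COEFFICIENTS
    }
    ranked = sorted(contributions, key=contributions.get, reverse=True)
    return [name for name in ranked[:top] if contributions[name] > 0]


applicant = {"debt_to_income": 40.0, "delinquent_lines": 2, "years_on_job": 1.0}
p_default = probability_of_default(applicant)
reasons = rejection_reasons(applicant)
```

Because each feature contributes additively to the score, the largest positive contributions can be reported directly as reasons for a rejection.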
flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
Key-Frames-Extraction-from-Video
Extracts key frames from a video using color histograms, SVD, and a dynamic clustering method. The analysis can be used to identify the frames that make up a shot. The code is well documented.
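The clustering step can be sketched roughly as below. This is an illustration under assumptions, not the repository's code: it takes per-frame color histograms as given, omits the SVD reduction, and assumes cosine similarity with a threshold of 0.9 and the middle frame of each cluster as the key frame.

```python
import math


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_frames(features, threshold=0.9):
    """Group consecutive frames; open a new cluster when similarity drops."""
    clusters = []  # each cluster is a list of frame indices
    centroid = None
    for index, vector in enumerate(features):
        if centroid is not None and cosine_similarity(vector, centroid) >= threshold:
            clusters[-1].append(index)
            size = len(clusters[-1])
            # Update the running mean of the current cluster.
            centroid = [c + (v - c) / size for c, v in zip(centroid, vector)]
        else:
            clusters.append([index])
            centroid = list(vector)
    return clusters


def key_frames(clusters):
    """Pick the middle frame of each cluster as its representative."""
    return [cluster[len(cluster) // 2] for cluster in clusters]


# Toy 3-bin color histograms for six frames: two visually distinct shots.
histograms = [
    [9, 1, 0], [8, 2, 0], [9, 1, 1],
    [0, 2, 9], [1, 1, 8], [0, 1, 9],
]
clusters = cluster_frames(histograms)
frames = key_frames(clusters)
```

The clustering is called dynamic here in the sense that the number of clusters is not fixed in advance; a new one opens whenever a frame differs enough from the current centroid.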
ML-Model-to-identify-Churning-Customer-
The challenge is to obtain a ten-fold cross-validation AUC score above 0.893 on telecom data with 'Churn' as the target variable.
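For reference, a ten-fold cross-validation AUC can be computed along these lines. This is a self-contained sketch: the fold assignment (every tenth row), the placeholder model `fit_mean_threshold`, and the toy data are assumptions for illustration. A real evaluation would normally shuffle or stratify the folds and fit an actual classifier.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs both classes in the fold")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


def k_fold_indices(n_samples, k=10):
    """Yield (train, test) index lists; fold i takes every k-th sample."""
    for fold in range(k):
        test = [i for i in range(n_samples) if i % k == fold]
        train = [i for i in range(n_samples) if i % k != fold]
        yield train, test


def cross_validated_auc(rows, labels, fit, k=10):
    """Average the held-out AUC over k folds."""
    fold_scores = []
    for train, test in k_fold_indices(len(rows), k):
        model = fit([rows[i] for i in train], [labels[i] for i in train])
        scores = [model(rows[i]) for i in test]
        fold_scores.append(auc([labels[i] for i in test], scores))
    return sum(fold_scores) / len(fold_scores)


def fit_mean_threshold(rows, labels):
    """Placeholder 'model': scores a row by its distance above the mean."""
    mean = sum(rows) / len(rows)
    return lambda row: row - mean


# Toy data: one numeric feature; the second half of the rows churned.
rows = [float(i) for i in range(20)]
labels = [0] * 10 + [1] * 10
mean_auc = cross_validated_auc(rows, labels, fit_mean_threshold)
```

The model is refitted on nine folds and scored on the held-out tenth each time, so the averaged AUC reflects performance on data the model has not seen.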
Practical-Machine-Learning-Course-Project
Course project for Practical Machine Learning.
Sentiment-Analysis-of-Twitter-Data-using-DTM-SVD-and-ML
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. My approach is to clean the tweets, correct spelling, lemmatize, remove stop words, build a document-term matrix (since all frequent words have already been removed), reduce dimensionality, and finally fit an ML algorithm. This approach is fairly naive; with it I reached a ten-fold cross-validation AUC of 0.775.
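The cleaning and document-term-matrix steps can be sketched as follows. Spelling correction, lemmatization, and the dimensionality reduction are left out, and the stop-word list and cleaning rules are assumptions made for the example rather than the ones used in the repository.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"the", "a", "is", "to", "and", "i", "this"}


def clean(tweet):
    """Lowercase, drop URLs and @mentions, keep alphabetic tokens only."""
    tweet = re.sub(r"https?://\S+|@\w+", " ", tweet.lower())
    return [t for t in re.findall(r"[a-z]+", tweet) if t not in STOP_WORDS]


def document_term_matrix(tweets):
    """Return (vocabulary, matrix) with one row of term counts per tweet."""
    token_lists = [clean(t) for t in tweets]
    vocabulary = sorted({tok for tokens in token_lists for tok in tokens})
    matrix = []
    for tokens in token_lists:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


tweets = [
    "I love this phone, love it! http://example.com",
    "@friend the battery is bad",
]
vocabulary, dtm = document_term_matrix(tweets)
```

Each row of the matrix is a sparse count vector over the vocabulary, which is the representation a dimensionality reduction step would then compress.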
Sentiment-Analysis-of-Twitter-Data-using-Pre-trained-Vector-and-ML-Algo
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. After basic cleaning and spelling correction, I used pre-trained GloVe vectors to look up a 200-dimensional representation for each word in a tweet that appears in the GloVe vocabulary. I then summed the matching vectors to obtain a 200-dimensional feature vector for each tweet. Finally, I fitted a Random Forest and obtained a ten-fold cross-validation AUC of 0.793.
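The summing step can be illustrated with a toy embedding table. The three-dimensional vectors and the words below are made up for the example and stand in for the 200-dimensional GloVe vectors; loading the real file is not shown.

```python
# Toy 3-dimensional stand-in for the 200-dimensional GloVe table; the words
# and values are made up for illustration.
EMBEDDINGS = {
    "good": [0.5, 0.1, -0.2],
    "movie": [0.0, 0.3, 0.4],
    "bad": [-0.6, 0.2, 0.1],
}
DIMENSIONS = 3


def tweet_vector(tokens, embeddings=EMBEDDINGS, dimensions=DIMENSIONS):
    """Sum the vectors of in-vocabulary tokens; unknown tokens are skipped."""
    total = [0.0] * dimensions
    for token in tokens:
        vector = embeddings.get(token)
        if vector is not None:
            total = [t + v for t, v in zip(total, vector)]
    return total


features = tweet_vector(["good", "movie", "zzz"])
```

A tweet with no in-vocabulary words maps to the zero vector, so every tweet yields a fixed-length feature vector regardless of its length.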
Sentiment-Analysis-of-Twitter-Data-using-Pre-trained-Vector-and-Neural-Network
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. After basic cleaning and spelling correction, I used pre-trained GloVe vectors to look up a 200-dimensional representation for each word in a tweet that appears in the GloVe vocabulary. I then summed the matching vectors to obtain a 200-dimensional feature vector for each tweet. Finally, I fitted a neural network with one hidden layer and obtained a ten-fold cross-validation AUC of 0.81.
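As a rough picture of what a network with one hidden layer computes on a tweet's feature vector, the forward pass might look like the sketch below. The ReLU hidden activation, the sigmoid output, the layer sizes, and the weights are all assumptions; the description does not specify them, and training is not shown.

```python
import math


def forward(features, hidden_weights, hidden_bias, output_weights, output_bias):
    """One hidden layer (ReLU) followed by a sigmoid output unit."""
    hidden = [
        max(0.0, sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(hidden_weights, hidden_bias)
    ]
    logit = sum(w * h for w, h in zip(output_weights, hidden)) + output_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Made-up weights for a 3-input, 2-hidden-unit network.
hidden_weights = [[1.0, 0.0, -1.0], [0.5, 0.5, 0.5]]
hidden_bias = [0.0, -0.5]
output_weights = [2.0, -1.0]
output_bias = 0.0
probability = forward([1.0, 2.0, 0.5], hidden_weights, hidden_bias,
                      output_weights, output_bias)
```

The output is a value between 0 and 1 that can be ranked across tweets, which is all the AUC metric requires.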
sawankumar94's Repositories
sawankumar94/Key-Frames-Extraction-from-Video
Extracts key frames from a video using color histograms, SVD, and a dynamic clustering method. The analysis can be used to identify the frames that make up a shot. The code is well documented.
sawankumar94/Sentiment-Analysis-of-Twitter-Data-using-DTM-SVD-and-ML
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. My approach is to clean the tweets, correct spelling, lemmatize, remove stop words, build a document-term matrix (since all frequent words have already been removed), reduce dimensionality, and finally fit an ML algorithm. This approach is fairly naive; with it I reached a ten-fold cross-validation AUC of 0.775.
sawankumar94/-AI-CL-688-Course-Project
Contains code for the course project.
sawankumar94/Coursera-Course-Certificates
sawankumar94/Creating-a-Credit-Scoring-Model-to-obtain-the-probability-of-default
We have baseline and loan performance information for approximately 6,000 loans. The target variable (BAD) is binary and indicates whether an applicant eventually defaulted or became seriously delinquent. Twelve variables are recorded for each applicant. Given this information, we want a predictive model that outputs the probability of default. The model should be interpretable and statistically sound so that we can give reasons for rejections.
sawankumar94/flair
A very simple framework for state-of-the-art Natural Language Processing (NLP)
sawankumar94/ML-Model-to-identify-Churning-Customer-
The challenge is to obtain a ten-fold cross-validation AUC score above 0.893 on telecom data with 'Churn' as the target variable.
sawankumar94/Practical-Machine-Learning-Course-Project
Course project for Practical Machine Learning.
sawankumar94/Sentiment-Analysis-of-Twitter-Data-using-Pre-trained-Vector-and-ML-Algo
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. After basic cleaning and spelling correction, I used pre-trained GloVe vectors to look up a 200-dimensional representation for each word in a tweet that appears in the GloVe vocabulary. I then summed the matching vectors to obtain a 200-dimensional feature vector for each tweet. Finally, I fitted a Random Forest and obtained a ten-fold cross-validation AUC of 0.793.
sawankumar94/Sentiment-Analysis-of-Twitter-Data-using-Pre-trained-Vector-and-Neural-Network
The challenge is to obtain a ten-fold cross-validation AUC score above 0.803. After basic cleaning and spelling correction, I used pre-trained GloVe vectors to look up a 200-dimensional representation for each word in a tweet that appears in the GloVe vocabulary. I then summed the matching vectors to obtain a 200-dimensional feature vector for each tweet. Finally, I fitted a neural network with one hidden layer and obtained a ten-fold cross-validation AUC of 0.81.