mkorrapati
Data science enthusiast interested in building predictive analytical models using statistical methods and neural nets.
New York
Pinned Repositories
aws-codepipeline-jenkins-aws-codedeploy_linux
Use this sample when creating a four-stage pipeline in AWS CodePipeline while following the Four Stage Pipeline Tutorial. http://docs.aws.amazon.com/codepipeline/latest/userguide/getting-started-4.html
citibike
A prediction model that estimates the number of available Citi Bikes at a given docking station at any time of day, based on demand and supply of bikes at that station up to that point. Gathered Citi Bike, weather, and social event data by web scraping. Cleaned the data by imputing missing values, resolving inconsistencies, and normalizing it for analysis. Trained a deep learning model with the H2O package, tuned by grid search. Achieved 86% model accuracy in estimating available bikes at any hour of the day for the next 7 days. Built a Shiny app so users can benefit from the model.
creditcard_fraud_detection
Detect fraudulent transactions in credit card usage
creditscore
Data-Analysis-and-Machine-Learning-Projects
Repository of teaching materials, code, and data for my data analysis and machine learning projects.
Hadoop-spark
person_posture
Identify a person's posture from data collected by sensors attached to them
predictive-leadscore
treasury-portfolio-prediction
Bond trading classification system for US Treasuries using machine learning techniques such as PCA, restricted Boltzmann machines (RBM), and deep belief networks (DBN)
twitterproject
A sentiment analysis tool that tracks the overall sentiment of tweets from a particular timeline over a period of time. Conducted data cleansing and exploratory analysis to gain insight into the user's tweeting habits (time of day, device used, frequency, etc.). Built word scores using the log odds ratio to compare high-frequency words between different sources. Calculated sentiment scores using the NRC lexicon. Compared sentiment scores and word clouds across different users.
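The log odds ratio word score mentioned in the twitterproject description can be illustrated with a minimal Python sketch. This is not the repository's code: the function name `log_odds_ratio`, the add-one smoothing, and the sample word lists are all assumptions made for illustration.

```python
import math
from collections import Counter


def log_odds_ratio(words_a, words_b):
    """Score each word by how strongly it favors source A over source B.

    Assumption: add-one smoothing on both counts and totals, so a word
    that is missing from one source does not cause a division by zero.
    Positive scores lean toward source A, negative scores toward source B.
    """
    counts_a = Counter(words_a)
    counts_b = Counter(words_b)
    total_a = len(words_a)
    total_b = len(words_b)

    scores = {}
    for word in set(counts_a) | set(counts_b):
        freq_a = (counts_a[word] + 1) / (total_a + 1)
        freq_b = (counts_b[word] + 1) / (total_b + 1)
        scores[word] = math.log(freq_a / freq_b)
    return scores


# Hypothetical tokenized tweets from two sources (for example, two devices).
source_a = ["great", "great", "sad"]
source_b = ["sad", "sad", "great"]

scores = log_odds_ratio(source_a, source_b)
for word, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{word}: {score:+.3f}")
```

Sorting by score puts the words most characteristic of the second source first and those most characteristic of the first source last, which is one way to compare high-frequency words between sources.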
mkorrapati's Repositories
mkorrapati/citibike
A prediction model that estimates the number of available Citi Bikes at a given docking station at any time of day, based on demand and supply of bikes at that station up to that point. Gathered Citi Bike, weather, and social event data by web scraping. Cleaned the data by imputing missing values, resolving inconsistencies, and normalizing it for analysis. Trained a deep learning model with the H2O package, tuned by grid search. Achieved 86% model accuracy in estimating available bikes at any hour of the day for the next 7 days. Built a Shiny app so users can benefit from the model.
mkorrapati/creditcard_fraud_detection
Detect fraudulent transactions in credit card usage
mkorrapati/aws-codepipeline-jenkins-aws-codedeploy_linux
Use this sample when creating a four-stage pipeline in AWS CodePipeline while following the Four Stage Pipeline Tutorial. http://docs.aws.amazon.com/codepipeline/latest/userguide/getting-started-4.html
mkorrapati/creditscore
mkorrapati/Data-Analysis-and-Machine-Learning-Projects
Repository of teaching materials, code, and data for my data analysis and machine learning projects.
mkorrapati/Hadoop-spark
mkorrapati/person_posture
Identify a person's posture from data collected by sensors attached to them
mkorrapati/predictive-leadscore
mkorrapati/treasury-portfolio-prediction
Bond trading classification system for US Treasuries using machine learning techniques such as PCA, restricted Boltzmann machines (RBM), and deep belief networks (DBN)
mkorrapati/twitterproject
A sentiment analysis tool that tracks the overall sentiment of tweets from a particular timeline over a period of time. Conducted data cleansing and exploratory analysis to gain insight into the user's tweeting habits (time of day, device used, frequency, etc.). Built word scores using the log odds ratio to compare high-frequency words between different sources. Calculated sentiment scores using the NRC lexicon. Compared sentiment scores and word clouds across different users.
mkorrapati/data-science-ipython-notebooks
Recently updated with 50 new notebooks! Data science Python notebooks: Deep learning (TensorFlow, Theano, Caffe, Keras), scikit-learn, Kaggle, big data (Spark, Hadoop MapReduce, HDFS), matplotlib, pandas, NumPy, SciPy, Python essentials, AWS, and various command lines.
mkorrapati/email-analysis
mkorrapati/entity-linking
mkorrapati/invoice-parser
mkorrapati/llm_sas_to_python
mkorrapati/llm_spam_filter
mkorrapati/seq2seq
Sequence-to-sequence model for neural machine translation
mkorrapati/spaCy
💫 Industrial-strength Natural Language Processing (NLP) with Python and Cython
mkorrapati/Turbofan-Engine-Degradation