This repo contains my experiments on tweet data to determine whether a tweet has a positive, negative or neutral tone. The data set is available on Kaggle; the link is given below.
Machine Learning Workflow: Problem Statement -> Data Gathering -> Data Formatting -> Algorithm Selection -> Creating Model -> Training Model -> Testing Model -> Repeat till optimum solution
The Tweet Sentiment data is cleaned using a Pandas DataFrame and then used to classify tweets as negative or positive with the Gaussian Naive Bayes algorithm. The model is tested on separate test data, and accuracy is improved by changing hyperparameters, using cross-validation, and changing the algorithm if needed.
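The train, test and score steps described above could look roughly like the sketch below. It is a minimal illustration, not the code from this repo: the tweets and labels are made up, `CountVectorizer` is an assumed choice of text features, and `test_size=0.25` is picked only to mirror the 100K : 300K testing vs training ratio.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

# Made-up example tweets (1 = positive, 0 = negative); not rows from the Kaggle data.
texts = [
    "i love this movie", "what a great day", "so happy right now", "best coffee ever",
    "i hate mondays", "this is awful", "so sad and tired", "worst service ever",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Hold out 25% of the tweets for testing, keeping both classes in each split.
train_texts, test_texts, y_train, y_test = train_test_split(
    texts, labels, test_size=0.25, random_state=0, stratify=labels
)

# Bag-of-words counts (an assumption). The vocabulary is learned from the
# training tweets only, then reused for the test tweets.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts).toarray()  # GaussianNB needs dense input
X_test = vectorizer.transform(test_texts).toarray()

# var_smoothing is the main hyperparameter of GaussianNB that can be tuned.
model = GaussianNB(var_smoothing=1e-9)
model.fit(X_train, y_train)

predictions = model.predict(X_test)
accuracy = accuracy_score(y_test, predictions)
print(f"Test accuracy: {accuracy:.2f}")
```

On the full data set a dense bag-of-words matrix may not fit in memory, so the vocabulary would likely need to be capped (for example with the `max_features` parameter of `CountVectorizer`).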
Data set (Source 1): https://www.kaggle.com/c/twitter-sentiment-analysis2/data
Testing vs Training data ratio: 100K : 300K

Columns (from Source 1):
- ItemID - id of the tweet
- Sentiment - sentiment (0 - negative, 1 - positive)
- SentimentText - text of the tweet
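As an illustration of the data formatting step, here is a minimal sketch of loading data in the ItemID / Sentiment / SentimentText layout into a Pandas DataFrame and cleaning the text. The sample rows are made up, and `clean_text` and `load_tweets` are hypothetical helpers; the cleaning rules (lowercasing, dropping URLs, @mentions and non-letters) are assumptions, not necessarily what this repo does.

```python
import io
import re

import pandas as pd

# Made-up rows in the same column layout as the Kaggle file (not real data).
SAMPLE_CSV = '''ItemID,Sentiment,SentimentText
1,1,"I love this phone!! http://example.com"
2,0,"@someone worst service ever :("
3,1,"  Great day at the beach  "
4,0,
'''


def clean_text(text: str) -> str:
    """Lowercase a tweet and strip URLs, @mentions and non-letter characters."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # drop URLs
    text = re.sub(r"@\w+", " ", text)          # drop @mentions
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    return re.sub(r"\s+", " ", text).strip()   # collapse repeated whitespace


def load_tweets(source) -> pd.DataFrame:
    """Read the CSV, drop rows without text, and clean the remaining tweets."""
    df = pd.read_csv(source)
    df = df.dropna(subset=["SentimentText"])
    df["SentimentText"] = df["SentimentText"].astype(str).map(clean_text)
    df = df[df["SentimentText"] != ""]         # drop tweets that became empty
    return df.reset_index(drop=True)


tweets = load_tweets(io.StringIO(SAMPLE_CSV))
print(tweets)
```

To run it on the real data, pass the path of the downloaded training CSV to `load_tweets` instead of the in-memory sample.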
Gaussian Naive Bayes: https://scikit-learn.org/stable/modules/naive_bayes.html#gaussian-naive-bayes
Decision Tree Classifier: https://scikit-learn.org/stable/modules/tree.html#classification
Logistic Regression: https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
(CV version of these)
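One possible reading of "CV version" is scoring each of the three algorithms with k-fold cross-validation. The sketch below does that with `cross_val_score`; the tweets are made up, the bag-of-words features and `cv=2` are assumptions chosen to keep the example tiny, and the scores it prints say nothing about performance on the real data.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Made-up example tweets (1 = positive, 0 = negative); not rows from the Kaggle data.
texts = [
    "i love this movie", "what a great day", "so happy right now", "best coffee ever",
    "i hate mondays", "this is awful", "so sad and tired", "worst service ever",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

# Simplification: the vocabulary is built from all tweets up front. A stricter
# setup would fit the vectorizer inside each cross-validation fold.
X = CountVectorizer().fit_transform(texts).toarray()

models = {
    "Gaussian Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "Logistic Regression": LogisticRegression(max_iter=1000),
}

# Mean accuracy over 2 stratified folds for each model.
scores = {
    name: cross_val_score(model, X, labels, cv=2).mean()
    for name, model in models.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.2f}")
```

scikit-learn also ships `LogisticRegressionCV`, which tunes the regularization strength by cross-validation internally; that may be what "CV version" refers to for Logistic Regression.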
-To be pushed-