
💲entiment Analysis of Tweets & predicting Stock 💹 prices based on user sentiments about # Amazon



This project is a work in progress and its features are still being implemented. The repo will be updated.

Project started : 21:50 , 28/10/2017 | PwC challenge #9, Hack2Innovate Hackathon (24 hours), IIT-G


Dataset : We created 2 datasets on our own, since we lacked properly cleaned datasets of stock data & tweets covering the same timeframe.

(1) data.json - a dictionary of key-value pairs, where the "key" is the sentiment ( 1 for positive & 0 for negative ) & the "value" is the sample tweet ( "Actual Tweet by users" ).
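As a rough illustration of reading such a file, here is a minimal sketch. The exact layout of data.json is not shown in this README, so the sketch assumes each sentiment key maps to a list of tweets; the sample tweets and the helper name `load_labelled_tweets` are hypothetical.

```python
import json

# Hypothetical stand-in for data.json. The real layout is not shown here,
# so this assumes each sentiment key ("0" or "1") maps to a list of tweets.
sample_json = """
{
  "1": ["Fast delivery, great service", "Love the new deals"],
  "0": ["My order arrived late", "Refund still pending"]
}
"""

def load_labelled_tweets(raw):
    """Return a list of (tweet, label) pairs from the assumed layout."""
    data = json.loads(raw)
    pairs = []
    for key, tweets in data.items():
        label = int(key)  # JSON object keys are always strings
        for tweet in tweets:
            pairs.append((tweet, label))
    return pairs

pairs = load_labelled_tweets(sample_json)
print(len(pairs), "labelled tweets")
```

Flattening into (tweet, label) pairs keeps the text and its sentiment together for any later preprocessing step.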

(2) data.txt - sample data file holding 2 columns & 100 rows ( each row is one case ).

 The two columns:
 column ( tweet_sent ) - holds whether the tweet was positive ( 1 ) or negative ( 0 ).
 column ( stock_sent ) - holds whether the stock price rose ( 1 ) or fell ( 0 ).
 In both columns, 0 means fall/-ve sentiment & 1 means rise/+ve sentiment.
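A small sketch of reading the two columns, using only the standard library. The delimiter of data.txt is not stated here, so the sketch assumes a header row followed by comma-separated values; the sample rows and the helper name `load_cases` are hypothetical.

```python
import csv
import io

# Hypothetical excerpt in the assumed layout: a header row, then one
# comma-separated case per line. The real data.txt may be delimited differently.
sample_txt = """tweet_sent,stock_sent
1,1
0,0
1,0
0,0
1,1
"""

def load_cases(handle):
    """Read (tweet_sent, stock_sent) integer pairs from a file-like object."""
    reader = csv.DictReader(handle)
    return [(int(row["tweet_sent"]), int(row["stock_sent"])) for row in reader]

cases = load_cases(io.StringIO(sample_txt))

# Count the cases where tweet sentiment and stock movement point the same way.
agree = sum(1 for tweet, stock in cases if tweet == stock)
print(f"{agree} of {len(cases)} cases agree")
```

For the real file, the same function would take `open("data.txt")` in place of the in-memory sample, provided the layout assumption holds.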

Note : Since the stock fluctuations & the user tweets are made up by us, they are synthetic, and we are unable to predict and match real case scenarios. For instance, during the festive season people generally shop more & feel happy about it. This trend generally tends to increase stock prices because of higher demand for the products. Since we don't have a real dataset of people's sentiments & stock fluctuations mapped over the same time period, we are unable to predict/verify whether this really happened.

Future case scenario : With proper datasets, we can train the model better & hence get better accuracy. We will then be able to extend the model & predict the same outcome for another e-commerce site during similar historic events.


Dependencies :

01. numpy
02. pandas
03. matplotlib
04. json
05. tflearn
06. tensorflow
07. sklearn
08. nltk
09. sys
10. random
11. string
12. unicodedata

To run : python tweeStock.py


Input Image


Output Image

We can see that our model performs well on the self-generated test cases. In future, we may fetch real-time tweets through an API, predict the stock fluctuations for the coming days & offer this functionality through a webapp.


The correlation found between the two columns ( tweet_sent, stock_sent ) of data.txt is : 85.7785299553 %. Its plot is : Correlation plot
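For reference, a figure like this can be obtained as a Pearson correlation coefficient expressed as a percentage. That reading of the number is an assumption, since the exact formula used is not stated here. The sketch below uses the standard library only, with hypothetical sample values rather than the contents of data.txt; the threshold of 70 is the one used in this README's check.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical sample values, not taken from data.txt.
tweet_sent = [1, 1, 0, 0, 1, 0, 1, 0]
stock_sent = [1, 1, 0, 0, 1, 0, 0, 0]

r = pearson(tweet_sent, stock_sent)
correlation = 100 * r  # expressed as a percentage

# Threshold of 70 as used in this README's check.
strong_enough = correlation > 70
print(f"correlation = {correlation:.2f} %, above threshold: {strong_enough}")
```

For two 0/1 columns this coefficient is the same quantity as the phi coefficient, so it measures how often the two flags move together.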

If ( correlation > 70 % ), we proceed with :

Model : TensorFlow Deep Neural Network (DNN).

Training : Gradient Descent.

Test data.
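To illustrate the gradient descent training step, here is a minimal stand-in in plain Python. It is deliberately not the TensorFlow DNN: it swaps in a single logistic unit so the update rule is visible in a few lines. The sample values, learning rate, epoch count and function names are all hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(xs, ys, lr=0.5, epochs=2000):
    """Fit one logistic unit with batch gradient descent on log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # derivative of log loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        # Step against the averaged gradient.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b

# Hypothetical sample values, not taken from data.txt.
tweet_sent = [1, 1, 0, 0, 1, 0, 1, 0]
stock_sent = [1, 1, 0, 0, 1, 0, 0, 0]

w, b = train(tweet_sent, stock_sent)

def predict(x):
    """Return 1 (rise) or 0 (fall) for a tweet sentiment flag."""
    return int(sigmoid(w * x + b) >= 0.5)

print(predict(1), predict(0))
```

A DNN replaces the single unit with stacked layers, but each training step still moves the weights against the gradient of the loss in the same way.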
