This repository contains a Google Colab notebook covering data visualization, data preprocessing, and building a Recurrent Neural Network (RNN) model to predict the sentiment of a tweet.
The Social Dilemma, a documentary-drama hybrid, explores the dangerous human impact of social networking, with tech experts sounding the alarm on their own creations. This dataset contains the Twitter responses posted with the #TheSocialDilemma hashtag after viewers watched the documentary "The Social Dilemma", released on an OTT platform (Netflix) on September 9th, 2020.
Attribute Information:
- user_name - The name of the user, as they’ve defined it.
- user_location - The user-defined location for this account’s profile.
- user_description - The user-defined UTF-8 string describing their account.
- user_created - Time and date when the account was created.
- user_followers - The number of followers an account currently has.
- user_friends - The number of friends an account currently has.
- user_favourites - The number of favorites an account currently has.
- user_verified - When true, indicates that the user has a verified account.
- date - UTC time and date when the Tweet was created.
- hashtags - All the other hashtags posted in the tweet along with #TheSocialDilemma.
- source - Utility used to post the Tweet. Tweets from the Twitter website have a source value of web.
- is_retweet - Indicates whether this Tweet has been Retweeted by the authenticating user.
- clean_text - Cleaned text of the tweet.
- Sentiment (target) - The sentiment of the tweet, one of three categories: Positive, Neutral, or Negative.
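For training, the three Sentiment categories have to be turned into numeric targets. A minimal sketch of one way to do that is shown here. The class ordering, the helper names `encode_sentiment` and `one_hot`, and the sample rows are illustrative assumptions, not taken from the notebook.

```python
# Hypothetical label order; the notebook may use a different one.
SENTIMENT_CLASSES = ["Negative", "Neutral", "Positive"]
CLASS_TO_ID = {name.lower(): idx for idx, name in enumerate(SENTIMENT_CLASSES)}


def encode_sentiment(label):
    """Map a Sentiment string to an integer class id, ignoring case and outer spaces."""
    key = label.strip().lower()
    if key not in CLASS_TO_ID:
        raise ValueError(f"unknown sentiment label: {label!r}")
    return CLASS_TO_ID[key]


def one_hot(class_id, num_classes=len(SENTIMENT_CLASSES)):
    """Return a one-hot list for a class id, as used with a categorical loss."""
    return [1 if i == class_id else 0 for i in range(num_classes)]


# Made-up rows using two of the columns described above.
rows = [
    {"clean_text": "eye opening documentary", "Sentiment": "Positive"},
    {"clean_text": "deleting my accounts tonight", "Sentiment": "Negative"},
    {"clean_text": "watched it yesterday", "Sentiment": "Neutral"},
]

targets = [encode_sentiment(row["Sentiment"]) for row in rows]
print(targets)                         # integer class ids
print([one_hot(t) for t in targets])   # one-hot vectors
```

Integer ids pair with a sparse categorical loss, while one-hot vectors pair with a plain categorical loss; which of the two the notebook uses is not stated here.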
Model: sequential_1

| Layer (type) | Output Shape | Param # |
|---|---|---|
| text_vectorization_1 (TextVe | (None, None) | 0 |
| embedding_1 (Embedding) | (None, None, 64) | 128000 |
| simple_rnn_1 (SimpleRNN) | (None, 64) | 8256 |
| dense_2 (Dense) | (None, 64) | 4160 |

Total params: 140,611
Trainable params: 140,611
Non-trainable params: 0
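The parameter counts in the summary can be checked by hand with the standard formulas for each layer type. The sketch below does that under two assumptions that are inferred rather than stated in the source: a vocabulary size of 2,000 (128,000 embedding parameters divided by 64 dimensions), and a final 3-unit Dense output layer. The listed rows sum to 140,416, and the remaining 195 parameters match a Dense layer mapping 64 inputs to the three sentiment classes.

```python
def embedding_params(vocab_size, embed_dim):
    # One embed_dim-sized vector per vocabulary entry.
    return vocab_size * embed_dim


def simple_rnn_params(input_dim, units):
    # Input kernel + recurrent kernel + bias.
    return input_dim * units + units * units + units


def dense_params(input_dim, units):
    # Weight matrix + bias.
    return input_dim * units + units


vocab_size = 128_000 // 64  # inferred from the summary, not stated in the source

counts = {
    "embedding_1": embedding_params(vocab_size, 64),
    "simple_rnn_1": simple_rnn_params(64, 64),
    "dense_2": dense_params(64, 64),
    "output dense (assumed, 3 classes)": dense_params(64, 3),
}
total = sum(counts.values())

for name, n in counts.items():
    print(f"{name}: {n}")
print(f"total: {total}")
```

With these assumptions the total comes to 140,611, which agrees with the reported figure.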
The model takes its input from an encoder (the text_vectorization_1 layer in the summary), which converts the raw text of each tweet into integer token sequences that the network can process.
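To make the encoding step concrete, here is a simplified pure-Python stand-in for what a text vectorization encoder does: lowercase, strip punctuation, split on whitespace, and map each token to an integer id. It is a sketch of the idea, not the notebook's code. The helper names, the sample tweets, and the fixed output length are hypothetical, and `max_tokens=2000` is inferred from the embedding size (128,000 / 64) rather than stated.

```python
import re
from collections import Counter

PAD_ID = 0  # reserved for padding
UNK_ID = 1  # reserved for out-of-vocabulary tokens


def tokenize(text):
    """Lowercase, drop punctuation, and split on whitespace."""
    return re.sub(r"[^\w\s]", "", text.lower()).split()


def build_vocab(texts, max_tokens):
    """Assign ids 2, 3, ... to the most frequent tokens in the corpus."""
    counts = Counter(tok for text in texts for tok in tokenize(text))
    kept = [tok for tok, _ in counts.most_common(max_tokens - 2)]
    return {tok: idx + 2 for idx, tok in enumerate(kept)}


def encode(text, vocab, length):
    """Turn a tweet into a fixed-length list of token ids."""
    ids = [vocab.get(tok, UNK_ID) for tok in tokenize(text)][:length]
    return ids + [PAD_ID] * (length - len(ids))


# Made-up tweets standing in for the clean_text column.
tweets = [
    "The Social Dilemma is a must watch!",
    "Social media is scary.",
]
vocab = build_vocab(tweets, max_tokens=2000)
encoded = encode("Social media is addictive", vocab, length=6)
print(encoded)
```

The word "addictive" never appears in the sample corpus, so it maps to the out-of-vocabulary id, and the sequence is padded with zeros up to the requested length. The embedding layer then looks up a 64-dimensional vector for each id.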
The training curves are fairly smooth, and the model performs well.
Accuracy of the RNN model: 86.17%