
This repository holds an analysis of the US Election 2020 Tweets dataset from Kaggle.

Primary language: Jupyter Notebook

us-election-tweets


The 2020 US presidential election just ended, and people have a lot to say!

In this repo, we look at the tweets made by people regarding the election. The dataset used is: US Election 2020 Tweets

The dataset comprises 1.72 million tweets about the US election. It was collected from October 15, 2020 to November 8, 2020 (as of version 19, the most recent version at the time of this update). The data is split across two CSV files:

  1. hashtag_donaldtrump.csv
  2. hashtag_joebiden.csv

Tweets were scraped using snscrape and the Twitter API, searching on the keywords #DonaldTrump, #Trump, #JoeBiden, and #Biden.

The data folder is empty in this repository. For the notebooks to work, download the two CSVs and place them (and nothing else) in that folder.
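As a rough sketch of how the two files could be read once they are in the data folder, the snippet below uses only Python's standard library. The `load_tweets` function, the `hashtag_file` column, and the relative `data` path are illustrative assumptions and are not taken from the notebooks, which may well use pandas instead.

```python
import csv
from pathlib import Path

# File names come from the dataset description above; the location of the
# "data" directory relative to the working directory is an assumption.
FILES = {
    "trump": "hashtag_donaldtrump.csv",
    "biden": "hashtag_joebiden.csv",
}


def load_tweets(data_dir="data"):
    """Read both CSVs and tag each row with the file it came from."""
    rows = []
    for label, name in FILES.items():
        path = Path(data_dir) / name
        if not path.exists():
            raise FileNotFoundError(f"Expected {path}; download it from Kaggle first")
        # newline="" lets the csv module handle tweets that contain line breaks.
        with path.open(newline="", encoding="utf-8") as f:
            for row in csv.DictReader(f):
                row["hashtag_file"] = label  # hypothetical helper column
                rows.append(row)
    return rows
```

Calling `load_tweets()` from the repository root would return one list of dictionaries covering both files, so later steps can filter by `hashtag_file` without re-reading the CSVs.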

We performed the following on the dataset:

  1. Exploratory data analysis
  2. Sentiment analysis of the tweets, and classification of each state as Democratic or Republican based on its tweets
  3. K-means clustering to create word clusters and a word cloud, with the number of clusters chosen using the elbow method
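To make the elbow method from the last step concrete, here is a minimal pure-Python sketch. It is not the notebooks' implementation (they presumably rely on a library such as scikit-learn), and the toy 2-D points are made-up stand-ins for word vectors. The function names `kmeans_inertia` and `elbow_curve` are hypothetical. The idea is to run K-means for several values of k, record the within-cluster sum of squares (inertia), and pick the k after which the drop flattens out.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans_inertia(points, k, iters=100, seed=0):
    """Run Lloyd's algorithm and return the within-cluster sum of squares."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # initialise from k of the data points
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: sq_dist(p, centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        updated = [centroid(c) if c else centers[i] for i, c in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return sum(min(sq_dist(p, c) for c in centers) for p in points)


def elbow_curve(points, max_k):
    """Inertia for k = 1..max_k; the 'elbow' is where the drop flattens."""
    return {k: kmeans_inertia(points, k) for k in range(1, max_k + 1)}


# Toy 2-D vectors standing in for word vectors (made-up data, three visible groups).
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
          (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
          (9.0, 0.1), (9.1, 0.0), (8.9, 0.2)]
curve = elbow_curve(points, max_k=5)
for k, inertia in curve.items():
    print(k, round(inertia, 3))
```

In practice the printed values are usually plotted against k and the elbow is read off the chart by eye. Because K-means depends on its random initialisation, a single run per k (as here) can land in a poor local optimum, so real analyses typically repeat each k several times and keep the best result.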

All notebooks are available in the ./notebooks folder.