Twitter-Data-Analysis-on-USA-Presidential-Candidates

This project was developed as part of the Principles of Big Data class at UMKC, Spring 2016.

Twitter data analysis on “USA Presidential Candidates”. 1 GB of tweets were collected using twitter4j. Nine dynamic queries, written with Apache Spark SQL and RDDs, surface top trends such as tweet source, tweet count per candidate, and top locations, and perform sentiment discovery. A dynamic web application visualizes the results.

Environment: Windows 10

Tools: Eclipse, Apache Spark

Languages and libraries: Python, Java, JavaScript, D3.js, HTML/CSS

Hosting: Bluemix

Analytical Queries:

Query 1: Tweet Count per Presidential Candidate

Query 2: Top 8 Most Frequently Tweeting Users

Query 3: Top 8 Users with the Most Followers

Query 4: Top Locations with most Tweets

Query 5: Users with More Than 150,000 Friends

Query 6: Top 8 Timestamps with the Most Tweets

Query 7: Sentiment Discovery

Query 8: Tweets from Different Types of Devices

Query 9: Tweet vs Retweet Status
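
To give a feel for the shape of these queries, here is a minimal sketch of Queries 1, 2, 5, and 9 in SQL. It is not the project's code: Python's built-in `sqlite3` stands in for Spark SQL so the example runs without a cluster, and the `tweets` table, its column names, the sample rows, the candidate list, and the substring match used to attribute a tweet to a candidate are all assumptions made for illustration.

```python
import sqlite3

# Hypothetical flattened tweet rows (not real data):
# (screen_name, followers, friends, location, text, is_retweet)
SAMPLE_TWEETS = [
    ("alice", 1200, 300, "Kansas City, MO", "Watching the Trump rally tonight", 0),
    ("bob", 560000, 151000, "New York, NY", "Clinton leads in the new poll", 0),
    ("alice", 1200, 300, "Kansas City, MO", "RT: Sanders speaks in Iowa", 1),
    ("carol", 90, 45, "Austin, TX", "Trump and Clinton debate recap", 0),
    ("alice", 1200, 300, "Kansas City, MO", "Clinton town hall was packed", 0),
]

# Assumed candidate keywords; the project's actual list is not stated.
CANDIDATES = ["Trump", "Clinton", "Sanders"]


def load_tweets(rows):
    """Load rows into an in-memory table (stand-in for a Spark SQL temp table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tweets (screen_name TEXT, followers INTEGER, "
        "friends INTEGER, location TEXT, text TEXT, is_retweet INTEGER)"
    )
    conn.executemany("INSERT INTO tweets VALUES (?, ?, ?, ?, ?, ?)", rows)
    return conn


def tweets_per_candidate(conn, candidates):
    """Query 1 sketch: count tweets whose text mentions each candidate."""
    counts = {}
    for name in candidates:
        (n,) = conn.execute(
            "SELECT COUNT(*) FROM tweets WHERE text LIKE ?", (f"%{name}%",)
        ).fetchone()
        counts[name] = n
    return counts


def top_tweeting_users(conn, limit=8):
    """Query 2 sketch: users ranked by number of tweets, ties broken by name."""
    return conn.execute(
        "SELECT screen_name, COUNT(*) AS tweet_count FROM tweets "
        "GROUP BY screen_name ORDER BY tweet_count DESC, screen_name LIMIT ?",
        (limit,),
    ).fetchall()


def users_with_many_friends(conn, threshold=150000):
    """Query 5 sketch: distinct users whose friends count exceeds the threshold."""
    return conn.execute(
        "SELECT DISTINCT screen_name, friends FROM tweets "
        "WHERE friends > ? ORDER BY friends DESC",
        (threshold,),
    ).fetchall()


def tweet_vs_retweet(conn):
    """Query 9 sketch: number of original tweets versus retweets."""
    rows = conn.execute(
        "SELECT CASE is_retweet WHEN 1 THEN 'retweet' ELSE 'tweet' END AS kind, "
        "COUNT(*) FROM tweets GROUP BY kind"
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = load_tweets(SAMPLE_TWEETS)
    print(tweets_per_candidate(conn, CANDIDATES))
    print(top_tweeting_users(conn))
    print(users_with_many_friends(conn))
    print(tweet_vs_retweet(conn))
```

The remaining queries (top followers, top locations, busiest timestamps, device source) would likely follow the same group, count, sort, and limit pattern over different columns. Sentiment discovery is left out because the method used is not described here.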