lasoya/Social_Media_Analysis

This project started from a question: can a user's risk of depression be inferred from their social media activity? People now spend much of their time online, which produces a wealth of data that can be analyzed to better understand a user's overall mood. Because medical data was unavailable, we approached the question indirectly and asked instead: can we predict a user's overall mood from features of their tweets? The working assumption is that a user on the happier side of the spectrum is less at risk for depression than a user on the sadder side. This made it a binary classification problem whose target variable is the user's overall mood (0 = happy, 1 = sad). One year of tweets, scraped from Twitter using specific keywords and hashtags, was analyzed with Natural Language Processing, specifically sentiment analysis. Several classification algorithms were tested; the final model was a Support Vector Machine (SVM) with an accuracy and an F1 score of 0.84. We concluded that the topic is worth further exploration.
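To make the two reported metrics concrete, here is a minimal sketch of how accuracy and F1 can be computed for a binary target encoded as 0 = happy, 1 = sad. The function name `accuracy_and_f1`, the choice of "sad" as the positive class, and the toy labels are assumptions for illustration only; the description does not say which F1 variant the project used, and none of the numbers below come from the project.

```python
def accuracy_and_f1(y_true, y_pred, positive=1):
    """Return (accuracy, F1) for binary labels.

    `positive` is the label treated as the positive class
    (assumed here to be 1 = sad).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one label is required")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero.
    denom = 2 * tp + fp + fn
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Hypothetical labels for ten users (0 = happy, 1 = sad); not project data.
y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_pred = [0, 0, 0, 0, 1, 1, 1, 1, 1, 0]

acc, f1 = accuracy_and_f1(y_true, y_pred)
print(f"accuracy={acc:.2f} f1={f1:.2f}")
```

On a balanced toy set like this one the two metrics coincide. On an imbalanced set they can diverge, which is one reason to report both.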

HTML
