/distance_learning

Sentiment analysis on the tweets about distance learning with TextBlob

Introduction

The COVID-19 pandemic brought about distance learning in the 2020 academic term. Some people adapted easily, while others found it inefficient. The re-opening of schools is now being discussed, and most experts suggest that at least one more semester should be online. As a student who spent the last semester in distance learning, I had plenty of time to study natural language processing, and I decided to explore what people think about distance learning. We are going to explore tweets related to distance learning to understand people's opinions (a.k.a. opinion mining) and to discover facts. I used a lexicon-based approach to determine the tweets' polarities. I also built a machine learning model to predict whether a tweet is positive or negative.

I collected 202,645 tweets related to distance learning using the Twitter API.

You can find the related story on Medium

Content

1. Data Gathering

- Twitter API
- Retrieve tweets with tweepy
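
The retrieval step could be sketched roughly as follows. This assumes tweepy 4.x and a Twitter API v2 bearer token; the notebook may have used a different tweepy version or endpoint. The helper names `build_query` and `fetch_tweets`, the search phrases, and the query operators are illustrative assumptions, not taken from the project.

```python
def build_query(phrases, lang="en"):
    """Combine search phrases into one Twitter API v2 query string."""
    quoted = " OR ".join(f'"{p}"' for p in phrases)
    # Exclude retweets so the same text is not collected many times.
    return f"({quoted}) -is:retweet lang:{lang}"


def fetch_tweets(bearer_token, query, limit=500):
    """Return up to `limit` recent tweets as plain dicts (hypothetical helper)."""
    import tweepy  # imported here so build_query works without tweepy installed

    client = tweepy.Client(bearer_token=bearer_token, wait_on_rate_limit=True)
    pages = tweepy.Paginator(
        client.search_recent_tweets,
        query=query,
        tweet_fields=["created_at"],
        max_results=100,  # largest page size this endpoint accepts
    )
    return [
        {"id": t.id, "text": t.text, "created_at": t.created_at}
        for t in pages.flatten(limit=limit)
    ]


# The phrases below are placeholders, not the project's actual search terms.
query = build_query(["distance learning", "online school"])
print(query)
```

Calling `fetch_tweets` needs valid credentials and network access, so only the query construction runs here.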

2. Preprocessing and Cleaning

- Drop duplicates
- Data type conversions
- Drop uninformative columns
- Remove stop words, hashtags, punctuation, and one- or two-letter words
- Tokenize the words
- Apply lemmatization
- Term frequency-inverse document frequency (TF-IDF) vectorization
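
The text-cleaning steps above can be illustrated with a small standard-library sketch. The stop word list is a tiny placeholder, the helper name `clean_tweet` is hypothetical, and removing URLs and @-mentions is an added assumption. Lemmatization and TF-IDF are left out because they need external libraries.

```python
import re
import string

# Tiny illustrative stop word list; a real run would use a fuller one.
STOP_WORDS = {"the", "and", "for", "with", "that", "this", "are", "was", "has", "have"}

URL_RE = re.compile(r"https?://\S+")
TAG_RE = re.compile(r"[@#]\w+")  # mentions and hashtags


def clean_tweet(text):
    """Lowercase a tweet, strip noise, and return its remaining tokens."""
    text = URL_RE.sub(" ", text.lower())
    text = TAG_RE.sub(" ", text)
    text = text.translate(str.maketrans("", "", string.punctuation))
    # Drop stop words and words of one or two letters.
    return [w for w in text.split() if len(w) > 2 and w not in STOP_WORDS]


tokens = clean_tweet("Distance learning is working well for the kids!! #school https://t.co/x")
print(tokens)  # ['distance', 'learning', 'working', 'well', 'kids']
```

Dropping words of one or two letters also removes short function words such as "is" and "my", which is why they do not need to appear in the stop word list.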

3. Exploratory Data Analysis

- Visualize the data
- Compare word counts
- Investigate the distribution of creation times
- Investigate the locations of tweets
- Look at the popular tweets and the most frequent words
- Make a word cloud
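
As a rough illustration of the "most frequent words" step, word counts can be computed with `collections.Counter`. The tokenized tweets below are made-up placeholders, not data from the project.

```python
from collections import Counter

# Hypothetical cleaned tweets, already tokenized.
tokenized = [
    ["distance", "learning", "works", "well"],
    ["distance", "learning", "feels", "lonely"],
    ["miss", "school", "distance", "hard"],
]

# Flatten all token lists and count each word once per occurrence.
counts = Counter(word for tokens in tokenized for word in tokens)
top = counts.most_common(2)
print(top)  # [('distance', 3), ('learning', 2)]
```

A frequency mapping like `counts` is also the usual input for drawing a word cloud.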

4. Sentiment Analysis
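
A minimal sketch of lexicon-based labeling with TextBlob, whose `sentiment.polarity` score ranges from -1 to 1. The helper names, the zero threshold, and the extra "neutral" class are assumptions for illustration; the notebook's exact rule may differ.

```python
def label_from_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1, 1] to a sentiment label."""
    if polarity > threshold:
        return "positive"
    if polarity < -threshold:
        return "negative"
    return "neutral"


def tweet_polarity(text):
    """Return TextBlob's lexicon-based polarity score for a tweet."""
    from textblob import TextBlob  # imported lazily; requires textblob

    return TextBlob(text).sentiment.polarity


# Labeling works on any score, wherever it comes from.
print(label_from_polarity(0.35))
print(label_from_polarity(-0.6))
print(label_from_polarity(0.0))
```

With TextBlob installed, `label_from_polarity(tweet_polarity(text))` labels a single tweet.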

5. Machine Learning
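
One way to set up such a classifier is a TF-IDF plus logistic regression pipeline in scikit-learn. This is only a sketch: the choice of model is an assumption, since the README does not say which one was used, and the training texts and labels below are invented placeholders.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical cleaned tweets with lexicon-derived labels.
texts = [
    "love distance learning so flexible",
    "online classes are great and convenient",
    "enjoying learning from home",
    "distance learning is terrible and exhausting",
    "hate online classes so boring",
    "remote school is awful for kids",
]
labels = ["positive", "positive", "positive", "negative", "negative", "negative"]

# Vectorize the text with TF-IDF, then fit a linear classifier on top.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

predictions = model.predict(["online classes are boring", "love learning from home"])
print(predictions)
```

On a real dataset the tweets would be split into training and test sets before fitting, so that the scores reflect unseen data.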

6. Summary

Some findings

Label Counts

Scores

Word cloud