Welcome to HateScan 📢

What is HateScan?

HateScan offers you the ability to analyze hate speech from real tweets and Twitter accounts, including those of celebrities and public figures.

HateScan is an AI-powered web app that classifies speech on Twitter. It integrates two Recurrent Neural Network models, trained on tens of thousands of records, to perform two primary tasks:

Classify the hate level of a tweet or account:
- Normal
- Offensive
- Hateful
Classify the topic of the tweet:
- Gender
- Religion
- Race
- Politics
- Sport
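
As a minimal sketch of how the two classification tasks above could map model output to labels, the snippet below picks the highest-scoring class from a list of probabilities. The label names come from the lists above; the index order, the `decode` helper, and the example scores are assumptions for illustration, not HateScan's actual code.

```python
# Label sets taken from the lists above; their index order is an assumption.
HATE_LABELS = ["Normal", "Offensive", "Hateful"]
TOPIC_LABELS = ["Gender", "Religion", "Race", "Politics", "Sport"]


def decode(probabilities, labels):
    """Return the label whose probability is highest (hypothetical helper)."""
    if len(probabilities) != len(labels):
        raise ValueError("expected one probability per label")
    best_index = max(range(len(labels)), key=lambda i: probabilities[i])
    return labels[best_index]


# Made-up scores standing in for real model output.
hate_label = decode([0.10, 0.65, 0.25], HATE_LABELS)
topic_label = decode([0.05, 0.05, 0.10, 0.70, 0.10], TOPIC_LABELS)
print(hate_label, topic_label)
```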

HateScan Features

1. Tweet Scan

Analyze any single tweet and our models will classify its hate label and hate topic (feature now deprecated).

2. Account Scan

Analyze any account by entering its Twitter handle and the number of tweets to analyze. The models return the account's hate label and its hate topic distribution. (This feature now only works for accounts already stored in the BigQuery database. Use handles found in the Global Scan chart by hovering over the account data points.)
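
One way an account-level result could be derived from per-tweet predictions is sketched below. It assumes a majority vote for the account's hate label and normalized counts for the topic distribution; the README does not state how HateScan aggregates, so the `summarize_account` helper and its rules are hypothetical.

```python
from collections import Counter


def summarize_account(tweet_predictions):
    """Aggregate (hate_label, topic) pairs, one per tweet, into an account summary.

    Assumption: the account label is the most frequent tweet label, and the
    topic distribution is the share of tweets per topic.
    """
    if not tweet_predictions:
        raise ValueError("no tweets to summarize")
    hate_counts = Counter(label for label, _ in tweet_predictions)
    topic_counts = Counter(topic for _, topic in tweet_predictions)
    total = len(tweet_predictions)
    account_label = hate_counts.most_common(1)[0][0]
    topic_distribution = {topic: count / total for topic, count in topic_counts.items()}
    return account_label, topic_distribution


# Made-up predictions for four tweets from one account.
predictions = [
    ("Normal", "Sport"),
    ("Offensive", "Politics"),
    ("Normal", "Politics"),
    ("Normal", "Sport"),
]
label, distribution = summarize_account(predictions)
print(label, distribution)
```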

3. Global Scan

All analyzed accounts are stored in BigQuery, a Google Cloud database. This lets us compare the accounts of public figures, politicians, artists, world-leading figures, and athletes, and visualize patterns of speech through Principal Component Analysis (PCA) and plotting.

Each account is a data point in the graph. The size of each circle represents the profile's number of followers, and its color represents the hate label assigned by HateScan. This shows which accounts are most influential based on their following.
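
The PCA step can be illustrated with a short NumPy sketch that projects per-account feature vectors onto their first two principal components, giving the x/y coordinates for a scatter plot. The feature vectors here (shares of tweets per topic) and the `pca_2d` helper are assumptions; the README does not say which features HateScan feeds into PCA.

```python
import numpy as np


def pca_2d(features):
    """Project rows of `features` onto their first two principal components."""
    X = np.asarray(features, dtype=float)
    centered = X - X.mean(axis=0)  # PCA requires mean-centered data
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T


# Hypothetical accounts: share of tweets per topic
# (Gender, Religion, Race, Politics, Sport).
accounts = [
    [0.6, 0.1, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.1, 0.6, 0.1],
    [0.1, 0.1, 0.1, 0.1, 0.6],
    [0.2, 0.2, 0.2, 0.2, 0.2],
]
coords = pca_2d(accounts)
print(coords.shape)
```

Each row of `coords` would be one circle in the chart; follower count and hate label would then set the circle's size and color.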

Methodology & Under The Hood

Analyzed our initial dataset: https://github.com/avaapm/hatespeech
Downloaded tweet text from tweet IDs using the Twitter API via RapidAPI
Found imbalanced classes, so we enriched them with additional datasets
Trained two Recurrent Neural Network models
Set up our cloud database with BigQuery to store model classification results from first-time account scans
Built our front end and app with Streamlit & FastAPI
Pitched our final project to an audience of 100 people, with a demo of our solution
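
The class imbalance mentioned in the steps above can be measured with a quick count of labels. The sketch below is illustrative only: the helper names and the sample labels are made up, and it shows how an imbalance might be detected, not how the additional datasets were merged.

```python
from collections import Counter


def class_shares(labels):
    """Return the fraction of records belonging to each class."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: count / total for label, count in counts.items()}


def imbalance_ratio(labels):
    """Return the size of the largest class divided by the smallest."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


# Made-up label column standing in for the real dataset.
sample_labels = ["Normal"] * 6 + ["Offensive"] * 3 + ["Hateful"] * 1
shares = class_shares(sample_labels)
ratio = imbalance_ratio(sample_labels)
print(shares, ratio)
```

A ratio well above 1 suggests the smaller classes need more examples before training.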

HateScan web: https://hatescan.streamlit.app/
Built by Joaquin Ortega, Elina Emsems & Santiago Rodriguez for Le Wagon Data Science Bootcamp Demo Day (Batch #1237).

joaquin-ortega84/HateScan