/arabic-nlp

Deep learning class project, Data Science for Business X-HEC 2020/2021.


Sentiment Analysis on Arabic tweets

Members

Description

In this project we apply NLP pipelines to Arabic and perform sentiment analysis and topic extraction on Arabic tweets. NLP on Arabic poses its own difficulties that stem from the nature of the language, such as detecting stopwords and reconciling its multiple dialects.

Goal

The ultimate goal is to gain insights from political tweets in Arabic-speaking countries and compare them with financial or economic indicators (markets, currencies) to see whether a historical crisis such as the Arab Spring could have been partially predicted.

Datasets

Papers

Available code

https://github.com/aub-mind/arabert

Steps

  • Exploring available Arabic tweet datasets and joining them into a single large dataset labeled by sentiment
  • Data preprocessing
  • Word Embedding
  • Topic extraction
  • Sentiment Analysis
  • Insights into people's emotions and viewpoints on a variety of products or political decisions
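
As an illustration of the data preprocessing step, here is a minimal sketch of Arabic tweet normalization using only the Python standard library. It is not the project's actual pipeline: the function names `normalize` and `tokenize`, the particular normalization rules, and the tiny stopword list are all assumptions chosen for the example. A real pipeline would likely need a much larger stopword list and dialect-aware handling.

```python
import re

# Arabic diacritics (tashkeel), U+064B..U+0652, plus superscript alef U+0670.
DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
# Tatweel (kashida), the elongation character often used for emphasis in tweets.
TATWEEL = "\u0640"
# URLs and @mentions carry no sentiment of their own in this sketch.
URL_OR_MENTION = re.compile(r"https?://\S+|www\.\S+|@\w+")
# Alef with madda / hamza above / hamza below, folded to bare alef.
ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")
# Anything that is not an Arabic letter (U+0621..U+064A) or whitespace.
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")


def normalize(text: str) -> str:
    """Return a simplified form of an Arabic tweet (illustrative rules only)."""
    text = URL_OR_MENTION.sub(" ", text)
    text = DIACRITICS.sub("", text)
    text = text.replace(TATWEEL, "")
    text = ALEF_VARIANTS.sub("\u0627", text)
    text = text.replace("\u0649", "\u064A")  # alef maqsura -> ya
    text = NON_ARABIC.sub(" ", text)         # drops digits, Latin, '#', emoji
    return " ".join(text.split())            # collapse whitespace


# Hypothetical, deliberately tiny stopword list (fi, min, ala, ila, an, wa).
# It is stored in normalized form so that it matches normalized tokens.
STOPWORDS = {
    normalize(w)
    for w in ("\u0641\u064A", "\u0645\u0646", "\u0639\u0644\u0649",
              "\u0625\u0644\u0649", "\u0639\u0646", "\u0648")
}


def tokenize(text: str) -> list[str]:
    """Normalize, split on whitespace, and drop stopwords."""
    return [tok for tok in normalize(text).split() if tok not in STOPWORDS]
```

The resulting token lists could then feed the word embedding, topic extraction, and sentiment analysis steps. Folding alef variants and alef maqsura is a common simplification that reduces spelling variation across dialects, at the cost of occasionally merging distinct words.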