Ukrainian War: A Global Opinion Analysis Using Twitter Data 🌍🐦

Overview 📜

This project performs a sentiment analysis of the war in Ukraine using a large dataset of tweets published throughout 2022. Our aim is to extract, analyze, and interpret the sentiments, opinions, and emerging trends expressed on Twitter about the ongoing conflict, giving insight into public perception and the global discourse around it.

Contributors 👥

Claudia Agromayor
Malo Langourieux
Arthur Fournon
Vincent Lefeuve
Gauthier Riquier
Nicolas Brandel

Dataset 📊

The primary dataset for this project is the "Ukraine Russian Crisis Twitter Dataset," which comprises over 1.2 million tweets. The collection was gathered to represent a wide range of perspectives and voices discussing the conflict. It is publicly available on Kaggle under that name.

Project Structure 🏗️

Data:
- The data directory contains the tweets about the war in Ukraine, taken from an online database in CSV format.
Src:
- The src directory contains the code that runs the web application, along with the rest of the project's source code.
Tests:
- The tests directory contains the unit tests and coverage tests.
ML:
- The ml directory contains the code for building the text classification models, covering both the shallow-learning and the Transformer-based approaches.
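
To give a rough idea of what a shallow-learning tweet classifier does, here is a minimal sketch of a bag-of-words naive Bayes model written with the standard library only. It is not the model stored in the ml directory: the class name, the label names (`pro_ukraine`, `pro_russia`) and the toy training sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it into alphabetic tokens."""
    return re.findall(r"[a-z']+", text.lower())


class NaiveBayesClassifier:
    """Multinomial naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.label_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocabulary = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.label_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.label_counts.items():
            counts = self.word_counts[label]
            total_words = sum(counts.values())
            # Log prior of the label plus smoothed log likelihood of each token.
            score = math.log(doc_count / total_docs)
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore words never seen during training
                score += math.log((counts[token] + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, hand-written examples (hypothetical, not taken from the real dataset).
train_texts = [
    "glory to ukraine and its defenders",
    "stand with ukraine",
    "russia is defending its borders",
    "support russia and its army",
]
train_labels = ["pro_ukraine", "pro_ukraine", "pro_russia", "pro_russia"]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = model.predict("we stand with ukraine")
print(prediction)
```

A Transformer-based classifier replaces the word counts with learned contextual representations, but the input (tweet text) and output (a label) keep the same shape.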

How to Use 🛠️

Clone the Repository:

git clone https://gitlab-cw4.centralesupelec.fr/groupe-7-les-bg/war_ukraine.git

Install the necessary packages:
```
make init
```
Download the model and place it: once you have downloaded the model archive, extract it and place the /model folder inside the ml folder.
Download the pre-processed datasets: once you have downloaded them, place the /tweets_processed folder inside /data.
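
If you want to confirm that both folders ended up in the right place before running the project, a small check like the one below may help. It is only a sketch: the function name `missing_assets` is hypothetical, and it assumes the script is run from the repository root with the folder names given above (ml/model and data/tweets_processed).

```python
from pathlib import Path


def missing_assets(repo_root):
    """Return the expected asset folders that are absent under repo_root."""
    root = Path(repo_root)
    # Folder locations as described in the setup steps of this README.
    expected = [root / "ml" / "model", root / "data" / "tweets_processed"]
    return [p.relative_to(root).as_posix() for p in expected if not p.is_dir()]


if __name__ == "__main__":
    missing = missing_assets(".")
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("Model and pre-processed tweets are in place.")
```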

If you have Python 3 installed:

Run the project:
```
make build3
```
Run unit tests:
```
make test3
```

If you only have Python installed (the python command):

Run the project:
```
make build
```
Run unit tests:
```
make test
```

Enjoy!

Requirements ✅

| Req № | Description | Importance | Current state |
|---|---|---|---|
| 1 | Pre-process the datasets and extract knowledge 📚 | Crucial | ✅ Done |
| 2 | Create data visualisations from the dataset 📊 | Crucial | ✅ Done |
| 3 | Perform sentiment analysis on the dataset 💭 | Crucial | ✅ Done |
| 4 | Create a transformer/shallow learning-based tweet classifier (pro-Russian/pro-Ukrainian) 🐦 | Important | ✅ Done |
| 5 | Make a web application using Dash 🌐 | Important | ✅ Done |
| 6 | Create word clouds ☁️ | Important | ✅ Done |
| 7 | Implement a choropleth using geographical data and the classification of the tweets 🗺️ | Important | ✅ Done |
| 8 | Provide a way for users to easily run the project (Makefile) 🏃 | Important | ✅ Done |
| 9 | Add other plots to the web application 📈 | Medium | ✅ Done |
| 10 | Add unit and coverage testing 🧪 | Medium | 🚧 Partial |
| 11 | Provide documentation with docstrings and a Sphinx wiki 📝 | Medium | 🚧 Partial |
| 12 | Compare other classification methods (rule-based, LSTMs...) 🔄 | Low | ❌ In the future |
| 13 | Put the repository in a Docker container to run it easily 🐳 | Low | ❌ In the future |
| 14 | Write a project report 📄 | Low | ❌ In the future |
| 15 | Analyse the datasets as time series ⏳ | Very Low | 🚧 Partial |
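
As an illustration of how requirement 7 combines geographical data with the tweet classification, the sketch below aggregates classified tweets into one value per country, which is the kind of input a choropleth map needs. The field names (`country`, `label`), the label values and the sample records are assumptions for the example, not the project's actual schema.

```python
from collections import defaultdict


def pro_ukraine_share_by_country(tweets):
    """Map each country code to the fraction of its tweets labelled pro_ukraine."""
    totals = defaultdict(int)
    pro_ukraine = defaultdict(int)
    for tweet in tweets:
        country = tweet.get("country")
        if not country:
            continue  # skip tweets without usable location data
        totals[country] += 1
        if tweet["label"] == "pro_ukraine":
            pro_ukraine[country] += 1
    return {country: pro_ukraine[country] / totals[country] for country in totals}


# Hypothetical sample records, not taken from the real dataset.
sample = [
    {"country": "FRA", "label": "pro_ukraine"},
    {"country": "FRA", "label": "pro_russia"},
    {"country": "DEU", "label": "pro_ukraine"},
    {"country": None, "label": "pro_ukraine"},
]
shares = pro_ukraine_share_by_country(sample)
print(shares)
```

Each country code and its share can then be handed to a mapping library as the location and colour value of the choropleth.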

Contributing 👫

If you'd like to contribute to this project, feel free to fork the repository, create a new branch, make your changes, and submit a pull request. Make sure to follow the project's coding standards and guidelines.

Contact 📪

For any questions or concerns, please contact the project maintainers:

Claudia Agromayor: claudia.agromayor@student-cs.fr
Malo Langourieux: malo.langourieux@student-cs.fr
Arthur Fournon: arthur.fournon@student-cs.fr
Vincent Lefeuve: vincent.lefeuve@student-cs.fr
Gauthier Riquier: gauthier.riquier@student-cs.fr
Nicolas Brandel: nicolas.brandel@student-cs.fr
