Welcome to the "A Graph Neural Network Framework For Post Engagement Prediction in Online Social Media" repository!
This repository collects two works on predicting post engagement in online social media with graph neural networks. The framework uses the relationships between posts, users, and other network structures to understand and predict the engagement that posts receive on online social media platforms. The works presented here illustrate the potential of graph neural networks in this field and are intended as a foundation for future research. We hope this repository serves as a useful resource for the machine learning community.
Official implementation of the paper: "Predicting Tweet Engagement with Graph Neural Networks"
Published in the ACM International Conference on Multimedia Retrieval 2023 (ICMR2023)
In this paper, we present TweetGage, a Graph Neural Network solution that predicts user engagement using a novel graph-based model of the relationships among posts.
Marco Arazzi, Marco Cotogni, Antonino Nocera and Luca Virgili
To replicate our results, create an environment with Anaconda and install the required packages with pip:
conda create -n TweetGage python=3.9
conda activate TweetGage
pip install -r req.txt
For our experiments, we considered one week of data from Twitter, from November 1st to November 7th 2021, obtained through the Twitter API.
Once the tweets have been downloaded, the graph network can be built and saved as a .pickle file with:
python3 CreateNetwork.py
The script will create the graph network as 'network_tweets.pickle'.
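As a quick sanity check after this step, the saved file can be loaded back and inspected. The snippet below is a minimal sketch, not part of the repository: the helper names `load_network` and `describe` are hypothetical, and it only assumes that 'network_tweets.pickle' was written with Python's standard `pickle` module. It reports node and edge counts only if the loaded object happens to expose `number_of_nodes()` and `number_of_edges()` (as NetworkX graphs do); the actual object type is not stated here.

```python
import pickle
from pathlib import Path


def load_network(path="network_tweets.pickle"):
    """Load the pickled graph object from disk (hypothetical helper)."""
    with Path(path).open("rb") as handle:
        return pickle.load(handle)


def describe(graph):
    """Summarize the loaded object without assuming a specific graph library."""
    summary = {"type": type(graph).__name__}
    # Assumption: graph-like objects may expose these counting methods.
    for name in ("number_of_nodes", "number_of_edges"):
        method = getattr(graph, name, None)
        if callable(method):
            summary[name] = method()
    return summary


# Only inspect the file if CreateNetwork.py has already produced it.
if Path("network_tweets.pickle").exists():
    print(describe(load_network()))
```

Since unpickling can execute arbitrary code, load only pickle files that you generated yourself.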
Once the graph network has been created, you can replicate the results of our paper by running the following command in your terminal:
python3 main.py --LOAD_CSV --EXTRACT_BERT --USE_PCA --USER_FEAT --BERT_FEAT --Model_Type 'GCN'
The following arguments can be passed to the main.py script:
- LOAD_CSV: If you have already computed the features in a csv file, you can load it with this argument. In our code, we load the file "first_week_posts_bert.csv", which contains post features and BERT-extracted text embeddings.
- EXTRACT_BERT: Computes the text embedding of the posts using BERT (valid only if LOAD_CSV is not provided).
- USE_PCA: If True, applies Principal Component Analysis to the text features, projecting them onto 48 components that cover more than 80% of their variance.
- USER_FEAT: If True, includes Post Features in the final feature set.
- BERT_FEAT: If True, includes Text Features in the final feature set.
- Model_Type: Can be one of the following: 'GCN', 'MLP', 'Conv1D', 'GAT', 'XGBOOST'. Default value is 'GCN'.
Note: If any of the boolean arguments is omitted, its default value is False; Model_Type defaults to 'GCN'.
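To make the flag semantics concrete, the arguments listed above could be declared with `argparse` roughly as follows. This is an illustrative sketch only, and the parser in main.py may be written differently: the `build_parser` function and the help strings are assumptions, while the flag names, the model choices, and the defaults are taken from the list above.

```python
import argparse

# Model names as listed for the Model_Type argument.
MODEL_TYPES = ["GCN", "MLP", "Conv1D", "GAT", "XGBOOST"]

# Boolean switches; each one is False unless passed on the command line.
BOOLEAN_FLAGS = {
    "LOAD_CSV": "load precomputed features from a csv file",
    "EXTRACT_BERT": "compute BERT text embeddings for the posts",
    "USE_PCA": "project the text features with PCA",
    "USER_FEAT": "include post features in the final feature set",
    "BERT_FEAT": "include text features in the final feature set",
}


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented arguments (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Post engagement prediction")
    for name, help_text in BOOLEAN_FLAGS.items():
        parser.add_argument(f"--{name}", action="store_true", help=help_text)
    parser.add_argument("--Model_Type", choices=MODEL_TYPES, default="GCN",
                        help="model to train and evaluate")
    return parser


# Example: parse an explicit argument list instead of sys.argv.
args = build_parser().parse_args(["--USE_PCA", "--BERT_FEAT", "--Model_Type", "GAT"])
print(args.Model_Type, args.USE_PCA, args.LOAD_CSV)  # GAT True False
```

Declaring the switches with `action="store_true"` matches the behaviour described in the note: a flag that is not passed stays False, and an unsupported Model_Type value is rejected before any training starts.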
If this repo is useful to your research, or if you want to cite our paper, please use:
@inproceedings{10.1145/3591106.3592294,
author = {Arazzi, Marco and Cotogni, Marco and Nocera, Antonino and Virgili, Luca},
title = {Predicting Tweet Engagement with Graph Neural Networks},
year = {2023},
booktitle = {Proceedings of the 2023 ACM International Conference on Multimedia Retrieval},
pages = {172–180},
numpages = {9},
location = {Thessaloniki, Greece},
series = {ICMR '23}
}