This research study was conducted by Othman EL HOUFI and Dimitris KOTZINOS. A detailed article is published here: https://bit.ly/3mZvDup
As false information and fake news propagate throughout the internet and social networks, fact-checking becomes necessary in order to maintain a truthful digital environment where general information can be reliably exploited, whether in politics, finance or other domains. The need for online claim assessment comes from the fact that fake news and false information can have a severe negative impact on politics and the economy (the 2016 US elections) as well as on public health (COVID-19).
A number of solutions, both manual and automatic, have been proposed to deal with this problem and limit the spread of false information. The manual approaches followed by websites such as PolitiFact.com, FactCheck.org and Snopes.com are undoubtedly not a viable long-term solution: the speed and scale of information propagation increase exponentially, while human fact-checkers cannot scale up at the same rate, leaving manual fact-checking limited and incapable of solving the problem.
Here, we present our contribution in this regard: an automated fact-checking solution that relies on state-of-the-art language models used today for NLP tasks (BERT, RoBERTa, XLNet...) and five well-known datasets (FEVER, MultiFC, Liar, COVID19, and ANTiVax) containing annotated claims/tweets, which are used to fine-tune each LM to classify a given claim.
We show that fine-tuning a LM with the right settings can achieve 98% accuracy and a 98% F1-score on the COVID19 and ANTiVax datasets, as well as 64% accuracy and a 63% F1-score on the FEVER dataset, results that outperform the majority of fact-checking methods that exist today.
Index Terms: Natural Language Processing, Pre-trained Language Model, Wikipedia, Text corpus, Fine-tuning, Text processing, Natural Language Inferencing, Fact-Checking, Fake-news, Twitter, Complex Networks.
In order to run this program you must first install the following packages:
$ pip3 install numpy pandas scikit-learn transformers wandb texttable
$ pip3 install torch torchvision torchaudio torchinfo
For a CPU-only installation of PyTorch, use instead:
$ pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
Or you can install all the dependencies using the requirements.txt file in this repository:
$ pip3 install -r requirements.txt
It is recommended to use a GPU for fine-tuning Transformer-based Language Models, as this is a computationally demanding task. The program can still run on a CPU, but training will be considerably slower.
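If you are unsure whether a GPU is visible to PyTorch, a quick check such as the one below (not part of the program itself) can save you from an unexpectedly slow run:

import torch

# Detect whether PyTorch can see a CUDA-capable GPU; fall back to CPU otherwise.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f'Fine-tuning will run on: {device}')
if device.type == 'cuda':
    print(f'GPU: {torch.cuda.get_device_name(0)}')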
For the deployment of the LMs we used the Python programming language and the Hugging Face library, whose Transformers API makes it easy to download and fine-tune state-of-the-art pre-trained models. Using pre-trained models reduces compute costs and carbon footprint, and saves the time needed to train a model from scratch.
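As a quick illustration of what the Transformers API provides, the sketch below loads a pre-trained checkpoint with a classification head and scores a single claim; the checkpoint name and num_labels value are example choices, not the program's defaults.

from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Download a pre-trained checkpoint and attach a fresh classification head.
# 'bert-base-uncased' and num_labels=3 (e.g. for FEVER) are example values.
model_name = 'bert-base-uncased'
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=3)

# Tokenize a claim and run a forward pass to obtain class logits.
inputs = tokenizer('The Eiffel Tower is located in Berlin.',
                   return_tensors='pt', truncation=True, max_length=128)
logits = model(**inputs).logits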
To track training metrics, validation metrics, disk usage, CPU usage and other environment changes during our experiments, we used the Weights & Biases API (wandb). At each step or epoch, all scores and changes are sent to our wandb project profile, where everything can be visualized in real time.
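In practice this boils down to initializing a wandb run and telling the Hugging Face Trainer to report to it; the snippet below is a minimal sketch (the project name matches the conf.py example further down, the entity is a placeholder) and not the exact code of the program.

import wandb
from transformers import TrainingArguments

# Start a run under your own W&B project and entity (placeholders here).
wandb.init(project='LM-for-fact-checking', entity='your-wandb-entity')

# With report_to='wandb', the Trainer pushes losses and metrics at every logging step.
training_args = TrainingArguments(output_dir='./output', logging_steps=100, report_to='wandb')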
To execute the program you must be in the ./LM_for_fact_checking/model directory and run:
$ ./language_model.py or $ python3 language_model.py
For more control you can add options to this command; here is the list of available options:
$ ./language_model.py --help
usage: language_model.py [-h] [-r] [-d DATASET] [-n NUM_LABELS] [-e EPOCH] [-t TRAIN_BATCH] [-v EVAL_BATCH]
options:
-h, --help show this help message and exit
-r, --report Possible values ["none", "wandb"]. If used all logs during training and evaluation
are reported through Wandb API (must be connected with the right credentials).
-d DATASET, --dataset DATASET
Choose dataset by specifying its name: FEVER, MultiFC, Liar...
-n NUM_LABELS, --num_labels NUM_LABELS
Specify the number of labels in the dataset. Required if dataset is manually
specified (minimum is 2).
-e EPOCH, --epoch EPOCH
Specify the number of training epochs.
-t TRAIN_BATCH, --train_batch TRAIN_BATCH
Specify the size of training batch.
-v EVAL_BATCH, --eval_batch EVAL_BATCH
Specify the size of validation batch.
For example, to run the program on the FEVER dataset:
$ ./language_model.py -d FEVER -n 3 -e 3 -t 20 -v 20
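If you need to extend the command line, the options above correspond to a standard argparse setup; the sketch below mirrors the documented flags and is only an approximation of the actual parser in language_model.py.

import argparse

# Approximate reconstruction of the documented CLI; defaults are illustrative.
parser = argparse.ArgumentParser(prog='language_model.py')
parser.add_argument('-r', '--report', action='store_const', const='wandb', default='none',
                    help='Report training and evaluation logs through the Wandb API.')
parser.add_argument('-d', '--dataset', help='Dataset name: FEVER, MultiFC, Liar...')
parser.add_argument('-n', '--num_labels', type=int, help='Number of labels in the dataset (minimum 2).')
parser.add_argument('-e', '--epoch', type=int, help='Number of training epochs.')
parser.add_argument('-t', '--train_batch', type=int, help='Training batch size.')
parser.add_argument('-v', '--eval_batch', type=int, help='Validation batch size.')
args = parser.parse_args()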
We wrote generic code in order to provide other developers/researchers with an easy plug-and-play application. When executing the program you interact with a simple interface that lets you choose the LM you would like to use, followed by a list of operations such as fine-tuning or evaluating the model.
Hi, choose a Language Model:
1 - bert-base-uncased
2 - roberta-base
3 - albert-base-v2
4 - distilbert-base-uncased
5 - xlnet-base-cased
6 - google/bigbird-roberta-base
7 - YituTech/conv-bert-base
0 - Quit program
4
**************** distilbert-base-uncased Model ****************
1 - Show dataset description
2 - Start model fine-tuning
3 - Start model predictions
4 - Show model metrics
0 - Quit program
Finally, you can edit more parameters in the file ./LM_for_fact_checking/model/conf.py. You can also add your Wandb profile and other LMs; a sketch of how these values are typically consumed follows the listing below:
DEFAULT_PARAMS = {
    # Transformer model
    'MODEL_NAME' : None,

    # Dataset name
    'DATASET_NAME' : 'FEVER',
    'DATA_NUM_LABEL' : 2, # minimum 2 labels

    # hyperparams
    'MAX_SEQ_LEN' : 128,
    'TRAIN_BATCH_SIZE' : 20,
    'EVAL_BATCH_SIZE' : 20,
    'EPOCHS' : 3,
    'LR' : 3e-5,
    'OPTIM' : 'adamw_hf',

    # Huggingface Trainer params
    'EVAL_STEPS' : 100,
    'SAVE_STEPS' : 100,
    'LOGGING_STEPS' : 100,
    'SAVE_TOTAL_LIMIT' : 1,
    'EARLY_STOPPING_PATIENCE' : 3,
    'REPORT' : 'none',
}

WANDB_PARAMS = {
    'project' : 'LM-for-fact-checking',
    'entity' : 'othmanelhoufi',
}

""" Here you can add more LMs to the list for more experiments """
MODEL_LIST = [
    'bert-base-uncased',
    'roberta-base',
    'albert-base-v2',
    'distilbert-base-uncased',
    'xlnet-base-cased',
    'google/bigbird-roberta-base',
    'YituTech/conv-bert-base'
]
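To give a rough idea of how these values are consumed, the sketch below maps them onto Hugging Face TrainingArguments; it is an approximation for orientation, not a copy of the repository's training code.

from transformers import TrainingArguments
from conf import DEFAULT_PARAMS  # assumes you run from the model/ directory

# Roughly how the config values translate into Trainer hyperparameters.
training_args = TrainingArguments(
    output_dir='./output',
    num_train_epochs=DEFAULT_PARAMS['EPOCHS'],
    per_device_train_batch_size=DEFAULT_PARAMS['TRAIN_BATCH_SIZE'],
    per_device_eval_batch_size=DEFAULT_PARAMS['EVAL_BATCH_SIZE'],
    learning_rate=DEFAULT_PARAMS['LR'],
    optim=DEFAULT_PARAMS['OPTIM'],
    evaluation_strategy='steps',
    eval_steps=DEFAULT_PARAMS['EVAL_STEPS'],
    save_steps=DEFAULT_PARAMS['SAVE_STEPS'],
    logging_steps=DEFAULT_PARAMS['LOGGING_STEPS'],
    save_total_limit=DEFAULT_PARAMS['SAVE_TOTAL_LIMIT'],
    report_to=DEFAULT_PARAMS['REPORT'],
)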
If you want to use a different dataset, just follow the same architecture used for the existing datasets in this repository.
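The exact preprocessing lives in the repository's dataset code, but in general a new dataset needs the same kind of split of annotated claims with integer labels as the existing ones; the snippet below is purely hypothetical (the file path and column names are assumptions) and only illustrates the kind of structure to mirror.

import pandas as pd

# Hypothetical example: a new dataset stored as CSV splits with a claim text and an integer label.
# The path and column names are assumptions, not the repository's actual schema.
train_df = pd.read_csv('./datasets/MyDataset/train.csv')  # columns: claim, label
print(train_df['label'].nunique())  # pass this value via --num_labels / DATA_NUM_LABEL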