FND_research: A Jupyter Notebook repository from arvinsingh

This repository contains the complete data and code used in the following research article:

“Comparative evaluation of Machine Learning algorithms for fake news detection”

Authors: Arvinder Pal Singh Bali, Mexson Fernandes, Sourabh Choubey

Special thanks to our beloved Prof. Pradosh K. Roy for all the guidance and help he provided.

Kindly contact us personally if you'd like to read about our research. It makes us happy to know that our hard work is being recognized and widely read.

Procedure to replicate our results

1. Install all the dependencies

Library Dependencies

Python <= 3.5
Jupyter Notebook
SciPy stack (numpy, scipy and pandas)
scikit-learn
XGBoost
gensim (for word2vec)
NLTK (Python NLP library)
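Before moving on, it can help to confirm that the libraries above are importable in the current environment. The following is a minimal sketch, not part of the repository: the helper name `find_missing` is hypothetical, and the list uses import names (for example `sklearn` for scikit-learn), which may differ from the package names used at install time. Jupyter Notebook itself is not checked here.

```python
import importlib.util

# Import names for the libraries listed above (assumed; the import name
# is not always the same as the package name used for installation).
REQUIRED_MODULES = ["numpy", "scipy", "pandas", "sklearn", "xgboost", "gensim", "nltk"]


def find_missing(modules):
    """Return the module names that cannot be located in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_MODULES)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All listed dependencies are importable.")
```

An empty result means every listed library was found; any names returned still need to be installed.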

2. Clone the repo.

3. Download the GloVe model trained on Wikipedia 2014 + Gigaword 5. Convert the file to word2vec.txt using convert_GloVe2Word2Vec.ipynb, OR download the datasets and model from here and save them under src/feature_generators/datasets/.
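For readers curious about what the conversion involves: the word2vec text format is the GloVe text format with one extra header line giving the vocabulary size and the vector dimension. The sketch below illustrates that idea in plain Python. It is an assumption about the general technique, not a copy of what convert_GloVe2Word2Vec.ipynb does, and the function name `glove_to_word2vec` is hypothetical.

```python
import shutil


def glove_to_word2vec(glove_path, word2vec_path):
    """Write a word2vec-format copy of a GloVe text file.

    Returns (vocab_size, vector_size), the two values placed in the header.
    """
    vocab_size = 0
    vector_size = 0
    # First pass: count the vectors and read the dimension from the first line.
    with open(glove_path, encoding="utf-8") as src:
        for line in src:
            if vocab_size == 0:
                # The first token is the word; the rest are vector components.
                vector_size = len(line.rstrip().split(" ")) - 1
            vocab_size += 1

    # Second pass: write the header, then copy the vectors unchanged.
    with open(glove_path, encoding="utf-8") as src, \
            open(word2vec_path, "w", encoding="utf-8") as dst:
        dst.write("{} {}\n".format(vocab_size, vector_size))
        shutil.copyfileobj(src, dst)
    return vocab_size, vector_size
```

A file written this way should be loadable with gensim's `KeyedVectors.load_word2vec_format(path, binary=False)`.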

In directory src/feature_generators/

4. Run prepare_data.ipynb, then gen_features.ipynb, to generate all the required features.

All the pickled files are saved under src/saved_data/.

6. Run xgb_train.py to train the model and make predictions on the test set. The output file is src/predictions_*.csv under the directory src/results.

In directory src/

7. Use Result_visualization.ipynb to study the output and test_xgb_model.ipynb to use the model.

8. For cross-validation results, see the notebook cross_validation.ipynb.

All the output files are also stored under results/. All parameters are hard-coded; their values were determined using grid search and random search.
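As a rough illustration of how the two search strategies differ, the sketch below enumerates candidate settings for each. Everything in it is an assumption made for illustration: the parameter names are ones accepted by XGBoost's scikit-learn wrapper, but the values, the helper names, and the scoring function are hypothetical and are not the ones used in the article.

```python
import itertools
import random

# Hypothetical search space; the values are illustrative only.
SEARCH_SPACE = {
    "max_depth": [3, 5, 7],
    "learning_rate": [0.01, 0.1, 0.3],
    "n_estimators": [100, 300],
}


def grid_candidates(space):
    """Yield every combination of parameter values (exhaustive grid search)."""
    names = sorted(space)
    for values in itertools.product(*(space[name] for name in names)):
        yield dict(zip(names, values))


def random_candidates(space, n_iter, seed=0):
    """Return n_iter settings, each value sampled independently at random."""
    rng = random.Random(seed)
    return [
        {name: rng.choice(values) for name, values in sorted(space.items())}
        for _ in range(n_iter)
    ]


def best_params(candidates, score_fn):
    """Return the candidate with the highest score under score_fn."""
    return max(candidates, key=score_fn)
```

In practice `score_fn` would train a model with the given settings and return a cross-validated metric. Grid search evaluates every combination, while random search evaluates a fixed number of sampled settings, which scales better as the space grows.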
