“Comparative evaluation of Machine Learning algorithms for fake news detection”
Authors: Arvinder Pal Singh Bali, Mexson Fernandes, Sourabh Choubey
Special thanks to our beloved Prof. Pradosh K. Roy for all the guidance and help he provided.
Kindly contact us personally if you'd like to read about our research. It makes us happy to know our hard work is being recognized and widely read.
1. Install all the dependencies
Library Dependencies
- Python <= 3.5
- Jupyter Notebook
- SciPy Stack (numpy, scipy and pandas)
- scikit-learn
- XGBoost
- gensim (for word2vec)
- NLTK (python NLP library)
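Before running the notebooks, it can help to confirm that the libraries listed above are importable. The following is a minimal sketch, not part of the repository: the `REQUIRED` list and the `missing_packages` helper are illustrative names, and the import names (for example `sklearn` for scikit-learn) are assumed to be the usual ones.

```python
import importlib.util

# Import names for the dependencies listed above.
# Assumption: each library is installed under its usual import name.
REQUIRED = ["numpy", "scipy", "pandas", "sklearn", "xgboost", "gensim", "nltk"]


def missing_packages(names):
    """Return the names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages: {}".format(", ".join(missing)))
    else:
        print("All listed dependencies are importable.")
```

The check only looks packages up without importing them, so it is quick to run and does not verify versions.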
2. Clone the repo
3. Download the GloVe model trained on Wikipedia 2014 + Gigaword 5. Convert the file to word2vec.txt using convert_GloVe2Word2Vec.ipynb
OR
download the datasets and model from here and save them under src/feature_generators/datasets/.
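For the first option in step 3, the difference between the two text formats is small: a word2vec text file starts with a header line giving the vocabulary size and the vector dimensionality, while a GloVe text file has no such header. The sketch below illustrates that conversion in plain Python. It is not necessarily what convert_GloVe2Word2Vec.ipynb does, and the `glove_to_word2vec` function name and the file names in the usage comment are hypothetical.

```python
def glove_to_word2vec(glove_path, out_path):
    """Copy a GloVe text file to out_path, adding the word2vec header line.

    Returns (vocab_size, dims) as written to the header.
    """
    vocab_size = 0
    dims = 0
    # First pass: count the vectors and read the dimensionality off the first row.
    with open(glove_path, encoding="utf-8") as src:
        for line in src:
            parts = line.rstrip("\n").split(" ")
            if len(parts) < 2:
                continue  # skip blank or malformed lines
            if vocab_size == 0:
                dims = len(parts) - 1  # first field is the word itself
            vocab_size += 1
    # Second pass: write the header, then copy the vector rows unchanged.
    with open(glove_path, encoding="utf-8") as src, \
            open(out_path, "w", encoding="utf-8") as dst:
        dst.write("{} {}\n".format(vocab_size, dims))
        for line in src:
            if len(line.rstrip("\n").split(" ")) >= 2:
                dst.write(line if line.endswith("\n") else line + "\n")
    return vocab_size, dims


# Hypothetical usage (the input file name is a placeholder):
# glove_to_word2vec("glove.txt", "word2vec.txt")
```

The file is read twice so that the full set of vectors never has to be held in memory at once.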
In directory src/feature_generators/
4. Use prepare_data.ipynb, then gen_features.ipynb, to generate all the required features.
All the pickled files are saved under src/saved_data/.
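To inspect the generated features outside the notebooks, the pickled files can be loaded back with the standard `pickle` module. This is a rough sketch under assumptions: the README does not name the individual files, so the `*.pkl` pattern and the `load_pickles` helper are hypothetical and may need adjusting to the actual file names.

```python
import glob
import os
import pickle


def load_pickles(directory, pattern="*.pkl"):
    """Load every file matching pattern in directory.

    Returns a dict mapping each file's base name (without extension)
    to the unpickled object. The default pattern is an assumption.
    """
    loaded = {}
    for path in sorted(glob.glob(os.path.join(directory, pattern))):
        name = os.path.splitext(os.path.basename(path))[0]
        with open(path, "rb") as handle:
            loaded[name] = pickle.load(handle)
    return loaded


# Hypothetical usage from the repository root:
# features = load_pickles("src/saved_data")
# print(sorted(features))
```

Only unpickle files you generated yourself or obtained from a trusted source, since loading a pickle can execute arbitrary code.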
6. Run xgb_train.py to train and make predictions on the test set. Output file is src/predictions_*.csv under directory src/results.
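Once training has finished, the prediction files can be collected with a short script. The sketch below is illustrative only: `read_predictions` is a hypothetical helper, the results directory is passed in rather than hard-coded, and the first row of each CSV is assumed to be a header, which the README does not confirm.

```python
import csv
import glob
import os


def read_predictions(results_dir):
    """Read every predictions_*.csv file found in results_dir.

    Returns a dict mapping file name to (header, rows).
    Assumption: the first row of each file is a header row.
    """
    predictions = {}
    pattern = os.path.join(results_dir, "predictions_*.csv")
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="", encoding="utf-8") as handle:
            rows = list(csv.reader(handle))
        header = rows[0] if rows else []
        predictions[os.path.basename(path)] = (header, rows[1:])
    return predictions


# Hypothetical usage from the repository root:
# for name, (header, rows) in read_predictions("src/results").items():
#     print(name, header, len(rows))
```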
In directory src/
7. Use Result_visualization.ipynb and test_xgb_model.ipynb to study the output and use the model, respectively.
8. For cross-validation results, check the notebook cross_validation.ipynb.
All the output files are also stored under results/. All parameters are hard-coded; they were determined using grid and random search methods.
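For readers unfamiliar with the two search methods, the following sketch shows the general idea in plain Python. It is not the code used to tune this project: the helper names, the parameter values, and the scoring function are all hypothetical, and in practice the score would come from cross-validating a model with each candidate setting.

```python
import itertools
import random


def grid_candidates(param_grid):
    """Yield every combination of values in param_grid (grid search)."""
    names = sorted(param_grid)
    for values in itertools.product(*(param_grid[name] for name in names)):
        yield dict(zip(names, values))


def random_candidates(param_grid, n_iter, seed=0):
    """Yield n_iter randomly sampled combinations (random search)."""
    rng = random.Random(seed)
    names = sorted(param_grid)
    for _ in range(n_iter):
        yield {name: rng.choice(param_grid[name]) for name in names}


def best_params(candidates, score_fn):
    """Return the candidate with the highest score."""
    return max(candidates, key=score_fn)


# Hypothetical grid; the values are placeholders, not the project's settings.
GRID = {"max_depth": [3, 5, 7], "learning_rate": [0.05, 0.1, 0.3]}


def toy_score(params):
    """Stand-in for a cross-validation score (hypothetical)."""
    return -abs(params["max_depth"] - 5) - abs(params["learning_rate"] - 0.1)


if __name__ == "__main__":
    print(best_params(grid_candidates(GRID), toy_score))
    print(best_params(random_candidates(GRID, n_iter=4), toy_score))
```

Grid search evaluates every combination, while random search samples a fixed number of them, which is cheaper when the grid is large.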