This project aims to build a fake news classifier that accurately distinguishes fake news from genuine news. We also developed a simple UI that lets users verify news articles efficiently.
- The advent of technology and the development of social media platforms have made it easier to share news updates with the masses.
- The effects of fake news can be catastrophic.
- Fact checking can prevent people from reacting to, and acting on, fake news.
- Fact checking is also extremely useful for news houses, which can verify their stories before sharing them with the masses.
- Dataset: 20,800 samples, of which 10,387 are real news and 10,413 are fake news. The dataset is publicly available here.
- Preprocessing:
  - Removing irrelevant text and NaN values
  - Removing stopwords
  - Removing numerics and special characters
  - Lemmatization
  - Case folding
  - Tokenization
  - Padding
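
Several of the preprocessing steps above can be sketched with the standard library alone. This is only an illustration under stated assumptions, not the project's actual pipeline: the stopword list, the `<pad>` symbol, the `max_len` default, and the `preprocess` function name are all hypothetical, and lemmatization is left out because it needs an external NLP library.

```python
import re

# Tiny illustrative stopword list (assumption); a real pipeline would use a fuller one.
STOPWORDS = {"a", "an", "the", "is", "are", "was", "of", "to", "in", "and", "on", "for"}
PAD_TOKEN = "<pad>"  # hypothetical padding symbol


def preprocess(text, max_len=8):
    """Clean one article and return a fixed-length list of tokens."""
    text = text.lower()                                  # case folding
    text = re.sub(r"[^a-z\s]", " ", text)                # drop numerics and special characters
    tokens = text.split()                                # whitespace tokenization
    tokens = [t for t in tokens if t not in STOPWORDS]   # stopword removal
    tokens = tokens[:max_len]                            # truncate long inputs
    tokens += [PAD_TOKEN] * (max_len - len(tokens))      # pad short inputs
    return tokens


print(preprocess("BREAKING: The Senate passed 3 new bills in 2016!"))
```

Padding to a fixed length mainly matters for the sequence models (LSTM, BERT); bag-of-words vectorizers can work on the unpadded tokens.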
- Vectorization:
  - One-hot encoding
  - Count Vectorizer
  - Hashing Vectorizer
  - TF-IDF
  - GloVe embedding
  - Word2Vec embedding
  - BERT
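
To make one of the vectorization options above concrete, here is a minimal pure-Python sketch of TF-IDF. It assumes raw term counts and one common smoothed IDF variant, `ln((1 + N) / (1 + df)) + 1`, without length normalization; the project itself may well have used a library vectorizer with different settings. The `tfidf` function name and the two toy documents are made up for illustration.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return (vocabulary, matrix) with one TF-IDF row per tokenized document."""
    vocab = sorted({tok for doc in docs for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(tok for doc in docs for tok in set(doc))
    # Smoothed IDF, so a term that appears in every document still gets weight 1.
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    matrix = []
    for doc in docs:
        counts = Counter(doc)  # raw term frequency
        matrix.append([counts[t] * idf[t] for t in vocab])
    return vocab, matrix


# Toy, made-up documents (already tokenized).
docs = [["senate", "passes", "bill"], ["aliens", "endorse", "bill"]]
vocab, matrix = tfidf(docs)
print(vocab)
print(matrix[0])
```

Terms shared by both documents (here `bill`) receive the lowest weight, while terms unique to one document are weighted higher, which is the property that makes TF-IDF more informative than plain counts.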
- Machine learning and deep learning algorithms for fake news classification:
  - Naive Bayes (MultinomialNB)
  - Decision Tree
  - AdaBoost
  - Logistic Regression
  - Passive Aggressive
  - Multilayer Perceptron
  - LSTM
  - BERT
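
As an illustration of the simplest classifier in the list above, the following is a hand-rolled sketch of multinomial Naive Bayes with add-one (Laplace) smoothing. It is not the project's code, which names `MultinomialNB` and so presumably relied on a library implementation; the `TinyMultinomialNB` class and the four training headlines are hypothetical and exist only to show the mechanics.

```python
import math
from collections import Counter


class TinyMultinomialNB:
    """Multinomial Naive Bayes over token lists with add-one smoothing."""

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        self.vocab = {tok for doc in docs for tok in doc}
        # Per-class word counts.
        self.counts = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            self.counts[label].update(doc)
        self.totals = {c: sum(self.counts[c].values()) for c in self.classes}
        # Log prior: fraction of training documents in each class.
        self.log_prior = {
            c: math.log(labels.count(c) / len(labels)) for c in self.classes
        }
        return self

    def predict(self, doc):
        scores = {}
        for c in self.classes:
            score = self.log_prior[c]
            for tok in doc:
                if tok not in self.vocab:
                    continue  # ignore words never seen in training
                num = self.counts[c][tok] + 1
                den = self.totals[c] + len(self.vocab)
                score += math.log(num / den)
            scores[c] = score
        return max(scores, key=scores.get)


# Made-up toy training data, for illustration only.
train_docs = [
    ["senate", "passes", "budget", "bill"],
    ["court", "upholds", "election", "result"],
    ["aliens", "secretly", "control", "senate"],
    ["miracle", "cure", "doctors", "hate"],
]
train_labels = ["real", "real", "fake", "fake"]

model = TinyMultinomialNB().fit(train_docs, train_labels)
print(model.predict(["miracle", "cure", "found"]))
```

Working in log space avoids numerical underflow when many small word probabilities are multiplied together, and the smoothing term keeps a word that is unseen in one class from forcing that class's probability to zero.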
- An end-to-end deployed tool that lets users verify news articles in one click
- Efficient and accurate tool for fact checking
- A simple, minimalistic user interface
- Users can input the news text along with a title and author name (both optional fields)
- A ‘Load sample input’ button helps users understand and test the app
- Flask, HTML, CSS, JS for building the webapp
- Heroku for deploying the webapp
- Python, along with machine learning and deep learning frameworks
This project was completed as a course project for CSE556: Natural Language Processing.