Source Based Fake News Classification.
Social media is a vast pool of content, and news is among the content that users access most frequently. News can be posted by politicians, news channels, newspaper websites, or ordinary citizens. These posts need to be checked for authenticity, because the spread of misinformation is a real concern, and many organizations are taking steps to make people aware of its consequences. The authenticity of news posted online is hard to assess reliably by hand, since manual classification is tedious, time-consuming, and subject to bias.
This repository contains a Python Streamlit application for analyzing news data, as well as Jupyter Notebooks for building the model. It also contains Python pipelines for preprocessing, training, and monitoring model inferences.
- Python - The programming language used
- Streamlit - The framework used
- MLflow - For experiment tracking
- Prefect - For workflow orchestration
- Evidently - For model monitoring
Click here to get to the deployed News Post Checker Application
- Joseph Ologunja - Initial work - Joseun
Folder/Code | Content |
---|---|
.streamlit | Contains the config.toml to set certain design parameters |
Train data | Contains the data used to train the model, in CSV format |
Test data | Contains the data used to test the model, in Excel format |
Submission | Contains the test data labelled by the model, in Excel format |
News_Classfication.ipynb | Contains the code for data exploration, analysis, visualization and model building |
app.py | Contains the actual Streamlit application |
model | Contains the trained model in pickled format |
tokenizer | Contains the tokenizer in pickled format |
requirements.txt | Contains all requirements (necessary for Streamlit deployment) |
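
Since the table lists the trained model and the tokenizer as pickled files, the snippet below is a minimal sketch of how they could be loaded before running predictions. The function name `load_artifacts` and the default paths `model` and `tokenizer` are assumptions taken from the table entries, not confirmed file names, and the sketch says nothing about the model's prediction API.

```python
import pickle
from pathlib import Path


def load_artifacts(model_path="model", tokenizer_path="tokenizer"):
    """Load the pickled model and tokenizer and return them as a tuple.

    The default paths are assumptions based on the table above; adjust
    them to the actual file names in the repository.
    """
    artifacts = []
    for path in (Path(model_path), Path(tokenizer_path)):
        # Only unpickle files you trust: pickle can execute arbitrary code.
        with path.open("rb") as handle:
            artifacts.append(pickle.load(handle))
    model, tokenizer = artifacts
    return model, tokenizer
```

A call such as `model, tokenizer = load_artifacts()` would typically run once at application start-up. In a Streamlit app, wrapping the function with `st.cache_resource` is one way to avoid reloading the files on every rerun.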