Welcome to the Fake News Research Datasets repository. This repository accompanies our paper: we re-upload publicly available fake news datasets, summarize their contents, and compare them. The goal is to give researchers a centralized, comprehensive portal for accessing and analyzing these datasets, and we plan to update it regularly.
Note: This repository is currently private and will be made public after the paper's acceptance. It will be provided as supplementary material.
Description: Dataset from social media, focusing on fake news and hoaxes.
Contents:
- Number of articles: Information not specified
- Source types: Social media platforms
Download: Link to BuzzFace (This link will be active upon the repository's public release)
Description: A crowd-sourced dataset of events with credibility annotations.
Contents:
- Number of tweets: over 60 million, grouped into 1,049 events
- Source types: Social media platforms
Download: Link to CREDBANK-data (This link will be active upon the repository's public release)
Description: Dataset of claims and their respective journalistic assessments.
Contents:
- Number of claims: 300
- Source types: News websites
Download: Link to EMERGENT (This link will be active upon the repository's public release)
Description: Dataset for fake news detection collected in 2018.
Contents:
- Number of articles: Information not specified
- Source types: News websites, social media platforms
Download: Link to FCV-2018 (This link will be active upon the repository's public release)
Description: Fact Extraction and VERification dataset.
Contents:
- Number of claims: 185,445
- Source types: Wikipedia
Download: Link to FEVER (This link will be active upon the repository's public release)
Description: Dataset focusing on hoaxes and fake news spread on Facebook.
Contents:
- Number of articles: Information not specified
- Source types: Facebook posts
Download: Link to FacebookHoax (This link will be active upon the repository's public release)
Description: A comprehensive data repository for fake news research.
Contents:
- Number of articles: Information not specified
- Source types: News websites, social media platforms
Download: Link to FakeNewsNet (This link will be active upon the repository's public release)
Description: Dataset for fake news detection with multiple classes of fake news.
Contents:
- Number of posts: over 1 million
- Source types: Reddit posts
Download: Link to Fakeddit (This link will be active upon the repository's public release)
Description: A benchmark dataset for fake news detection.
Contents:
- Number of statements: 12,836
- Source types: PolitiFact statements
Download: Link to LIAR (This link will be active upon the repository's public release)
Description: Multimodal dataset for fake news detection.
Contents:
- Number of articles: Information not specified
- Source types: News websites, social media platforms
Download: Link to M4 (This link will be active upon the repository's public release)
Description: Text-based dataset for misinformation research.
Contents:
- Number of articles: Information not specified
- Source types: Various textual sources
Download: Link to MisInfoText (This link will be active upon the repository's public release)
Description: A large dataset for misinformation research collected in 2018.
Contents:
- Number of articles: 713,000
- Source types: News websites
Download: Link to NELA-GT-2018 (This link will be active upon the repository's public release)
Description: Dataset for claim verification research.
Contents:
- Number of claims: Information not specified
- Source types: Various sources
Download: Link to Verification-corpus (This link will be active upon the repository's public release)
Description: Political news dataset collected for research purposes.
Contents:
- Number of articles: Information not specified
- Source types: News websites
Download: Link to benjamin-political-news-dataset (This link will be active upon the repository's public release)
Description: Dataset collected from BuzzFeed news articles.
Contents:
- Number of articles: Information not specified
- Source types: News websites
Download: Link to buzzfeed (This link will be active upon the repository's public release)
Description: Dataset focusing on rumors and fake news.
Contents:
- Number of articles: Information not specified
- Source types: Social media platforms
Download: Link to pheme (This link will be active upon the repository's public release)
To use these datasets, clone the repository:
git clone https://github.com/fakenewsresearch/dataset.git
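Once the clone finishes, a quick way to see which datasets are present is to list the top-level folders. The sketch below is only illustrative: it assumes each dataset lives in its own top-level folder named after the dataset (for example `LIAR` or `FEVER`), which this README does not confirm, and `list_datasets` is a hypothetical helper, not part of the repository.

```python
from pathlib import Path
from typing import List


def list_datasets(repo_root) -> List[str]:
    """Return the sorted names of top-level folders in a cloned copy.

    Assumption: one folder per dataset. Hidden folders such as .git
    and plain files such as README.md are skipped.
    """
    root = Path(repo_root)
    return sorted(
        entry.name
        for entry in root.iterdir()
        if entry.is_dir() and not entry.name.startswith(".")
    )


if __name__ == "__main__":
    # "dataset" is the folder that git clone creates by default for the URL above.
    clone_dir = Path("dataset")
    if clone_dir.is_dir():
        for name in list_datasets(clone_dir):
            print(name)
    else:
        print("Clone the repository first.")
```

If the repository turns out to use a different layout, for example a single `data/` folder, point `list_datasets` at that folder instead.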