FakeNewsNet
Please use the current, up-to-date version of the dataset, which includes the following changes:
- A user may post the same news article several times; this frequency information has been added.
- User features have been updated.
This is a repository for an ongoing data collection project for fake news research at ASU. We describe and compare FakeNewsNet with other existing datasets in "Fake News Detection on Social Media: A Data Mining Perspective". We also perform a detailed analysis of the FakeNewsNet dataset and build a fake news detection model on it in "Exploiting Tri-Relationship for Fake News Detection".
News Content
It includes all the fake news articles, with the news content attributes as follows:
- source: The author or publisher of the news article.
- headline: The short text that aims to catch readers' attention and relates to the major topic of the news.
- body_text: The details of the news story. Usually there is a major claim that shapes the publisher's angle and is specifically highlighted and elaborated upon.
- image_video: An important part of the body content of a news article, which provides visual cues to frame the story.
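As a rough illustration, the four attributes above could be held in a small record like the one below. This is only a sketch: the class name `NewsContent`, the field types, and the sample values are assumptions made for the example, not the dataset's actual file format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NewsContent:
    """Hypothetical container mirroring the four news content attributes."""
    source: str      # author or publisher of the article
    headline: str    # short attention-catching text
    body_text: str   # full story text
    # Assumed representation: a list of media URLs or file paths.
    image_video: List[str] = field(default_factory=list)


# Made-up sample values, for illustration only.
article = NewsContent(
    source="example-publisher.com",
    headline="Example headline",
    body_text="Example body text.",
)
print(article.headline)
```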
Social Context
It includes the social engagements of fake news articles from Twitter. We extract profiles, posts and social network information for all relevant users.
- user_profile: A set of profile fields that describe the users' basic information.
- user_content: The users' recent posts on Twitter.
- user_followers: The follower list of the relevant users.
- user_followees: The list of users followed by the relevant users.
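The sketch below shows one way the four social context fields might be grouped per user, and how follower and followee lists can be merged into directed follow edges. The names `UserContext` and `follow_edges`, the field types, and the user IDs are hypothetical and are not taken from the released data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class UserContext:
    """Hypothetical per-user record mirroring the social context fields."""
    user_profile: Dict[str, str] = field(default_factory=dict)
    user_content: List[str] = field(default_factory=list)
    user_followers: Set[str] = field(default_factory=set)
    user_followees: Set[str] = field(default_factory=set)


def follow_edges(users: Dict[str, UserContext]) -> Set[Tuple[str, str]]:
    """Return directed (follower, followee) pairs implied by both lists."""
    edges: Set[Tuple[str, str]] = set()
    for uid, ctx in users.items():
        # Each follower f of uid gives the edge f -> uid.
        edges.update((f, uid) for f in ctx.user_followers)
        # Each followee f of uid gives the edge uid -> f.
        edges.update((uid, f) for f in ctx.user_followees)
    return edges


# Made-up user IDs, for illustration only.
users = {
    "u1": UserContext(user_followers={"u2"}, user_followees={"u3"}),
    "u2": UserContext(user_followees={"u1"}),
}
edges = follow_edges(users)
```

Using a set means that an edge reported by both a follower list and a followee list is stored only once.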
Source Code
We will publish the Python code used to collect this dataset. Stay tuned.
Data Availability
Due to the terms of service of the social media platform, we are not able to publish the raw social context data. We anonymize sensitive user information, provide bag-of-words features for user profiles and content, and keep the social relationships of users. The raw data of user profiles and content is available upon request; please send an email to <kai.shu at asu.edu>, and do not distribute it.
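For readers unfamiliar with bag-of-words features, here is a minimal sketch of the general idea: text is reduced to word counts, so the original wording and word order cannot be read back. The tokenization rule and the function name `bag_of_words` are assumptions for illustration; this is not the preprocessing used to build the released features.

```python
import re
from collections import Counter
from typing import Dict, List


def bag_of_words(texts: List[str]) -> Dict[str, int]:
    """Count lowercase word tokens across a list of texts."""
    counts: Counter = Counter()
    for text in texts:
        # Assumed tokenization: runs of letters, digits, or apostrophes.
        counts.update(re.findall(r"[a-z0-9']+", text.lower()))
    return dict(counts)


# Made-up posts, for illustration only.
features = bag_of_words(["Breaking news today", "News you can trust"])
```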
References
If you use this dataset, please cite the following papers:
@article{shu2017fake,
title={Fake News Detection on Social Media: A Data Mining Perspective},
author={Shu, Kai and Sliva, Amy and Wang, Suhang and Tang, Jiliang and Liu, Huan},
journal={ACM SIGKDD Explorations Newsletter},
volume={19},
number={1},
pages={22--36},
year={2017},
publisher={ACM}
}
@article{shu2017exploiting,
title={Exploiting Tri-Relationship for Fake News Detection},
author={Shu, Kai and Wang, Suhang and Liu, Huan},
journal={arXiv preprint arXiv:1712.07709},
year={2017}
}
(C) 2017 Arizona Board of Regents on Behalf of ASU