This repo contains the source code and data for the paper Neutral Bots Probe Political Bias on Social Media by Chen et al. (preprint).
Social media platforms attempting to curb abuse and misinformation have been accused of political bias. We deploy neutral social bots (we call them drifters) on Twitter to probe biases that may emerge from interactions between users, platform mechanisms, and manipulation by inauthentic actors.
- `bot` contains the source code of the drifters. Use `cd bot; python drifter_main.py <your drifter screen_name>` to activate the drifters.
- `data` contains the code and intermediate data files for the analyses.
  - `data/hashtag_political_alignment` has the implementation of hashtag embedding.
  - `data/GenerateDataFiles.ipynb` generates data files for our analyses.
- `database` contains the script to create a PostgreSQL database for the analyses.
- `exps` contains scripts and notebooks to generate the plots and tables for the paper.
  - `exps/FinalPaperPlots.ipynb` reads the output files generated by `data/GenerateDataFiles.ipynb` to produce the final figures.
  - `exps/algorithm_bias_estimation.ipynb` reads the output files generated by `data/GenerateDataFiles.ipynb` and runs statistical tests to estimate the algorithmic bias.
  - `exps/news_seed_popularity` contains the scripts and data for the news seed popularity analysis.
- `metric` contains the scripts that preprocess the collected data for further analyses. Run `metric_job.py` as an example.
  - `analysis.py` and `metrics.py` compute the hashtag-based and URL-based political valence scores for each tweet.
  - `time_series_scores.py` computes the changes in political valence over time for all drifters. The notebook `data/GenerateDataFiles.ipynb` calls this script.
  - `generate_networks_for_each_bot.py` builds the ego networks for drifters and computes metrics for the echo chamber analyses. The notebook `data/GenerateDataFiles.ipynb` depends on files generated by this script.
- `others` contains the code to initialize and clean up the drifters.
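
To give a rough idea of what a hashtag-based valence score and its change over time could look like, here is a minimal sketch. It is not the code in `metric`: the alignment table, the function names, and the choice of a simple mean are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical alignment scores in [-1, 1]: negative = left, positive = right.
HASHTAG_ALIGNMENT = {"#voteblue": -1.0, "#votered": 1.0, "#news": 0.0}


def tweet_valence(hashtags, alignment=HASHTAG_ALIGNMENT):
    """Mean alignment of the hashtags with a known score, or None if none is known."""
    scores = [alignment[h.lower()] for h in hashtags if h.lower() in alignment]
    return mean(scores) if scores else None


def daily_valence(tweets):
    """Average tweet valence per day; `tweets` is an iterable of (date, hashtags)."""
    by_day = defaultdict(list)
    for day, hashtags in tweets:
        score = tweet_valence(hashtags)
        if score is not None:  # skip tweets without any scored hashtag
            by_day[day].append(score)
    return {day: mean(scores) for day, scores in sorted(by_day.items())}
```

Calling `daily_valence` on the tweets in a drifter's timeline would yield one value per day, which can then be plotted as a time series. The actual scripts may weight, smooth, or aggregate differently.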
The software in this repository has been tested on a Linux machine with Python 3 installed. Installing Python and the dependencies below might require up to one hour. Data collection for an experiment similar to the one described in the paper would require several months. Data processing would typically take a few days.
- Python 3
- Jupyter Notebook is used in some cases to process and visualize the data.
- twurl, modified as shown here, is used to manage the drifters. First you need to create a Twitter app. Each drifter account must authorize the app. The keys of the app can then be used with `twurl` to control the drifters.
- chatterbot is used when drifters reply to tweets that mention them.
- tweepy is used in analysis code.
- psycopg2 is the database driver.
- botometer client library is used in conjunction with the Botometer Pro API to get data from the Twitter API and then calculate bot scores for friends and followers of the drifters.
- gensim provides an implementation of the word2vec algorithm for calculating the political alignment of the hashtags.
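
As an illustration of how word2vec embeddings might be turned into a political alignment score, the sketch below projects a hashtag vector onto the axis between two seed hashtags. This is an assumption about one plausible approach, not a description of the code in `data/hashtag_political_alignment`; the function name and the seed vectors are hypothetical. With gensim, the vectors would typically be read from a trained model via `model.wv[hashtag]`; plain lists are used here so the example has no dependencies.

```python
def alignment(vec, left_seed, right_seed):
    """Project `vec` onto the left-to-right seed axis.

    Returns -1.0 at the left seed, +1.0 at the right seed, and 0.0 at
    their midpoint. All arguments are equal-length sequences of floats.
    """
    axis = [r - l for l, r in zip(left_seed, right_seed)]
    norm_sq = sum(a * a for a in axis)
    if norm_sq == 0:
        raise ValueError("seed vectors must differ")
    midpoint = [(l + r) / 2 for l, r in zip(left_seed, right_seed)]
    offset = [v - m for v, m in zip(vec, midpoint)]
    # Scalar projection onto the axis, rescaled so the seeds map to -1 and +1.
    return 2 * sum(o * a for o, a in zip(offset, axis)) / norm_sq
```

Components of a hashtag vector that are orthogonal to the seed axis do not affect the score, so only the position along the left-to-right direction matters.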
You may cite our preprint as:
@techreport{drifter2020,
Author = {Wen Chen and Diogo Pacheco and Kai-Cheng Yang and Filippo Menczer},
Institution = {arXiv},
Number = {2005.08141},
Title = {Neutral Bots Reveal Political Bias on Social Media},
Type = {Preprint},
Url = {https://arxiv.org/abs/2005.08141},
Year = {2020}
}