DrifterBot

Source code and data for paper "Neutral Bots Probe Political Bias on Social Media" by Chen et al.

Primary language: Jupyter Notebook. License: MIT.

Introduction

This repo contains the source code and data for the paper Neutral Bots Probe Political Bias on Social Media by Chen et al. (preprint).

Social media platforms attempting to curb abuse and misinformation have been accused of political bias. We deploy neutral social bots (we call them drifters) on Twitter to probe biases that may emerge from interactions between users, platform mechanisms, and manipulation by inauthentic actors.

Repo Structure

  • bot contains the source code of the drifters. To activate a drifter, run: cd bot; python drifter_main.py <your drifter screen_name>
  • data contains the code and intermediate data files for the analyses.
    • data/hashtag_political_alignment has the implementation of hashtag embedding.
    • data/GenerateDataFiles.ipynb generates data files for our analyses.
  • database contains the script to create a PostgreSQL database for the analyses.
  • exps contains scripts and a notebook to generate the plots and tables for the paper.
  • exps/news_seed_popularity contains the scripts and data for the news seed popularity analysis.
  • metric contains the scripts that preprocess the collected data for further analyses. Please run metric_job.py as an example.
    • analysis.py and metrics.py compute the hashtag-based and URL-based political valence scores for each tweet.
    • time_series_scores.py computes the political valence changes across time for all drifters. The notebook data/GenerateDataFiles.ipynb calls this script.
    • generate_networks_for_each_bot.py builds the ego networks for drifters and computes metrics for the echo chamber analyses. The notebook data/GenerateDataFiles.ipynb depends on files generated by this script.
  • others contains the code to initialize and clean up the drifters.
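
As a rough illustration of what a hashtag-based valence score for a single tweet could look like, the sketch below averages per-hashtag alignment values. The function name tweet_valence, the lookup table HASHTAG_ALIGNMENT, its made-up values, and the averaging rule are all assumptions for illustration; the actual computation lives in analysis.py and metrics.py and may differ.

```python
import re
from statistics import mean
from typing import Mapping, Optional

# Hypothetical alignment table: hashtag -> score in [-1, 1]
# (negative = left-leaning, positive = right-leaning). Values are made up.
HASHTAG_ALIGNMENT = {
    "#exampleleft": -0.8,
    "#exampleright": 0.9,
    "#neutral": 0.0,
}

HASHTAG_PATTERN = re.compile(r"#\w+")


def tweet_valence(text: str, alignment: Mapping[str, float]) -> Optional[float]:
    """Return the mean alignment of the known hashtags in `text`.

    Returns None when the tweet has no hashtag with a known alignment,
    so that such tweets can be skipped rather than counted as neutral.
    """
    hashtags = [tag.lower() for tag in HASHTAG_PATTERN.findall(text)]
    scores = [alignment[tag] for tag in hashtags if tag in alignment]
    return mean(scores) if scores else None
```

Returning None instead of 0.0 for tweets without scored hashtags keeps "no signal" distinct from "neutral", which matters if the scores are later averaged over a timeline.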

Dependencies

The software in this repository has been tested on a Linux machine with Python 3 installed. Installing Python and the dependencies below might require up to one hour. Data collection for an experiment similar to the one described in the paper would require several months. Data processing would typically take a few days.

  1. Python 3
  2. Jupyter Notebook is used in some cases to process and visualize the data.
  3. twurl, modified as shown here, is used to manage the drifters. First you need to create a Twitter app. Each drifter account must authorize the app. The keys of the app can then be used with twurl to control the drifters.
  4. chatterbot is used when drifters reply to tweets that mention them.
  5. tweepy is used in analysis code.
  6. psycopg2 is the database driver.
  7. botometer client library is used in conjunction with the Botometer Pro API to get data from the Twitter API and then calculate bot scores for friends and followers of the drifters.
  8. gensim provides an implementation of the word2vec algorithm for calculating the political alignment of the hashtags.
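
To illustrate how an embedding can yield a political alignment score for hashtags, the sketch below projects hashtag vectors onto an axis defined by two seed hashtags. It uses plain Python and toy two-dimensional vectors in place of a trained word2vec model. The seed hashtags, the function names, and the cosine-projection rule are assumptions for illustration, not necessarily the method implemented in data/hashtag_political_alignment.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def _dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    denom = math.sqrt(_dot(a, a)) * math.sqrt(_dot(b, b))
    return _dot(a, b) / denom if denom else 0.0


def score_hashtags(
    embeddings: Dict[str, Vector], left_seed: str, right_seed: str
) -> Dict[str, float]:
    """Score each hashtag by its cosine similarity to the left-to-right axis."""
    axis = [r - l for l, r in zip(embeddings[left_seed], embeddings[right_seed])]
    return {tag: cosine(vec, axis) for tag, vec in embeddings.items()}


# Toy vectors standing in for a trained embedding (hypothetical values).
TOY_EMBEDDINGS = {
    "#seed_left": [-1.0, 0.0],
    "#seed_right": [1.0, 0.0],
    "#leans_right": [0.6, 0.8],
    "#unrelated": [0.0, 1.0],
}

scores = score_hashtags(TOY_EMBEDDINGS, "#seed_left", "#seed_right")
```

With a trained gensim Word2Vec model, the vectors would instead be looked up as model.wv[tag] for each hashtag in the vocabulary.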

Citation

You may cite our preprint as:

@techreport{drifter2020,
  Author = {Wen Chen and Diogo Pacheco and Kai-Cheng Yang and Filippo Menczer},
  Institution = {arXiv},
  Number = {2005.08141},
  Title = {Neutral Bots Reveal Political Bias on Social Media},
  Type = {Preprint},
  Url = {https://arxiv.org/abs/2005.08141},
  Year = {2020}
}