/foreign_fighters

Replication materials for Foreign Fighter Supply paper


Foreign Fighter Supply and Kernel Regularized Hurdle Negative Binomial

This repository has the replication files for the paper "Predicting Foreign Fighter Flows to Syria Using Machine Learning: An Introduction to Kernel Regularized Hurdle Negative Binomial" (GitHub link, DB link) by George Derpanopoulos and Luke Sonnet. It also contains the script to use KRHNB. Please try and replicate, and please tell us when it fails!

To fully replicate, you have to do the following:

  • Make sure all of the packages we use are installed (working through every file may take some time; we plan to automate this in the future)
  • Make sure you set the working directory appropriately in each of the R files
  • Run code/build_ff_data.R to build our full dataset from the raw data
  • Run code/analyze_ff.R to run the full analysis. The script contains some flags that you will have to set to TRUE for the full analysis to run; these processes are slow, so by default we rely on cached data
  • Run code/analyze_krhnb_performance.R to run the out-of-sample (OOS) performance analysis. Again, the script contains some flags that you will have to set to TRUE for the full tests to run; these processes are slow, so by default we rely on cached data
  • Compile tex/derpanopoulos_sonnet.tex to recreate our final PDF. Note that we edit the map to trim some white space and that we paste edited versions of the tables directly into the tex file. If you change the code in a way that alters the tables or the map, you will have to update them in the tex file by hand
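
The steps above could be chained in a small shell script along the following lines. This is only a sketch, not something shipped with the repository: it assumes that Rscript and pdflatex are on your PATH, that you run it from the repository root, and that you have already adjusted the working directory and the TRUE/FALSE flags inside the R files by hand. The DRY_RUN variable and the run_step and run_pipeline helpers are names made up for this illustration. By default the script only prints the commands it would run.

```shell
#!/bin/sh
# Sketch of the replication order described above (not part of the repository).
# Assumptions: Rscript and pdflatex are installed; run from the repository root.
set -eu

# R scripts, in the order listed in this README.
R_STEPS="code/build_ff_data.R code/analyze_ff.R code/analyze_krhnb_performance.R"

# Print the command when DRY_RUN is 1 (the default here), otherwise execute it.
run_step() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

run_pipeline() {
    for script in $R_STEPS; do
        run_step Rscript "$script"
    done
    # Compile the paper from inside tex/ so relative paths in the tex file resolve.
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "cd tex && pdflatex derpanopoulos_sonnet.tex"
    else
        (cd tex && pdflatex derpanopoulos_sonnet.tex)
    fi
}

run_pipeline
```

If saved as, say, replicate.sh (a hypothetical file name), running `DRY_RUN=0 sh replicate.sh` would execute the steps for real. A single pdflatex pass may not resolve the bibliography or cross-references, so you may need to run your usual LaTeX toolchain on the tex file instead.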

Folder structure

  • tex/ contains the tex file, working paper, and bibliography
    • tex/tabs_figs contains tables, figures, and more produced by the R code for the .tex file
  • code/ contains the analysis files, files that build the data, and some supporting functions
  • krhnb/ contains the script to run KRHNB
  • savedata/ contains some saved .RData files because some of the analyses can take time
  • data/ contains the cleaned data and the raw data used to create the cleaned data

Main files