Foreign Fighter Supply and Kernel Regularized Hurdle Negative Binomial
This repository contains the replication files for the paper "Predicting Foreign Fighter Flows to Syria Using Machine Learning: An Introduction to Kernel Regularized Hurdle Negative Binomial" (GitHub link, DB link) by George Derpanopoulos and Luke Sonnet. It also contains the script to run KRHNB. Please try to replicate our results, and please tell us if anything fails!
To fully replicate, you have to do the following:
- Make sure all of the packages we use are installed (it may take some time to work through all of the files; we plan to automate and facilitate this in the future)
- Make sure you set the working directory appropriately in each of the R files
- Run code/build_ff_data.R to build our full dataset from the raw data
- Run code/analyze_ff.R to run the full analysis. The script contains some flags that you will have to set to TRUE for the full analysis to run; these processes are slow, so by default we rely on cached data
- Run code/analyze_krhnb_performance.R to run the out-of-sample (OOS) performance analysis. Again, the script contains some flags that you will have to set to TRUE for the full tests to run; these processes are slow, so by default we rely on cached data
- Compile tex/derpanopoulos_sonnet.tex to recreate our final PDF. Note that we edit the map to trim some white space and that we paste edited versions of the tables directly into the tex file. If you change the code, you will have to update those tables in the tex file by hand
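The R steps above could be scripted roughly as follows. This is a minimal sketch, not part of the repository: the helper names `find_used_packages`, `missing_packages`, and `run_replication` are hypothetical, and the sketch assumes that the three scripts resolve their paths relative to the repository root once their working-directory lines have been adjusted. The TRUE/FALSE flags inside the scripts still have to be edited by hand.

```r
# Hypothetical helpers (not part of the repository).

# Collect package names from library()/require() calls in the R scripts under
# `dir`. Packages used only via pkg::fun() are not detected.
find_used_packages <- function(dir = ".") {
  files <- list.files(dir, pattern = "\\.[Rr]$", recursive = TRUE,
                      full.names = TRUE)
  lines <- as.character(unlist(lapply(files, readLines, warn = FALSE)))
  pattern <- "(library|require)\\(\\s*[\"']?([A-Za-z][A-Za-z0-9.]*)[\"']?"
  hits <- regmatches(lines, regexpr(pattern, lines, perl = TRUE))
  sort(unique(sub(pattern, "\\2", hits, perl = TRUE)))
}

# Return the subset of `pkgs` that is not installed.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Source the replication scripts in order, with the repository root as the
# working directory. With dry_run = TRUE, only check that the scripts exist.
run_replication <- function(repo_root,
                            scripts = c("code/build_ff_data.R",
                                        "code/analyze_ff.R",
                                        "code/analyze_krhnb_performance.R"),
                            dry_run = FALSE) {
  paths <- file.path(repo_root, scripts)
  absent <- paths[!file.exists(paths)]
  if (length(absent) > 0) {
    stop("Missing script(s): ", paste(absent, collapse = ", "))
  }
  if (dry_run) {
    return(invisible(paths))
  }
  old_wd <- setwd(repo_root)
  on.exit(setwd(old_wd), add = TRUE)  # restore the caller's working directory
  for (s in scripts) {
    message("Running ", s)
    source(s)
  }
  invisible(paths)
}
```

A possible way to use it from the repository root: call `missing_packages(find_used_packages("."))` to see what still needs installing, then `run_replication(".")`. If the scripts call `setwd()` themselves, those calls take precedence over the working directory set here.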
Folder structure
- tex/ contains the tex file, working paper, and bibliography
- tex/tabs_figs contains tables, figures, and more produced by the R code for the .tex file
- code/ contains the analysis files, files that build the data, and some supporting functions
- krhnb/ contains the script to run KRHNB
- savedata/ contains some saved .RData files because some of the analyses can take time
- data/ contains the cleaned data and the raw data used to create the cleaned data
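To see what the cached .RData files in savedata/ hold before deciding whether to rerun the slow steps, something like the following could be used. It is a small sketch with a hypothetical helper name (`list_cached_objects`); it makes no assumption about which files or objects are actually present.

```r
# Hypothetical helper (not part of the repository): for each .RData file in
# `dir`, return the names of the objects it contains. Each file is loaded into
# a throwaway environment so nothing in the global workspace is overwritten.
list_cached_objects <- function(dir = "savedata") {
  files <- list.files(dir, pattern = "\\.RData$", full.names = TRUE,
                      ignore.case = TRUE)
  out <- lapply(files, function(f) {
    env <- new.env()
    load(f, envir = env)
    sort(ls(env, all.names = TRUE))
  })
  names(out) <- basename(files)
  out
}
```

Calling `list_cached_objects("savedata")` from the repository root returns a named list with one entry per file.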
Main files
- krhnb/krhnb.R is the main script for the KRHNB method
- code/analyze_ff.R is the main analysis script
- code/analyze_krhnb_performance.R is used to evaluate the OOS performance of our model
- data/foreignFightersImputed.csv is the main imputed dataset
- tex/derpanopoulos_sonnet_ff.pdf is the paper