This repository contains code for the following paper:
Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning (PETS 2019, link to presentation)
Sanjit Bhat, David Lu, Albert Kwon, and Srini Devadas.
- Ensure that you have a machine with a functioning NVIDIA GPU. The model will take significantly longer to run on a CPU.
- Make sure you have the TensorFlow/Keras deep learning stack installed. For detailed instructions, see this link under the "Software Setup" section. For our experiments, we used Ubuntu 16.04 LTS, CUDA 8.0, CuDNN v6, and TensorFlow 1.3.0 as a backend for Keras 2.0.8.
- To install all required Python packages, simply issue `pip install -r requirements.txt`.
The first step in running our model is to place a sufficient number of raw packet sequences in the `data_dir` folder. Each monitored website needs to have at least `num_mon_inst_train` + `num_mon_inst_test` instances, and there need to be at least `num_unmon_sites_train` + `num_unmon_sites_test` unmonitored sites.
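For illustration, using the file-naming convention of Wang et al.'s dataset (described under `data_dir` below), a small `data_dir` might look like the following. This layout is only an assumed example; monitored traces are named `<site>-<instance>` and unmonitored traces carry a single number:

```
data_dir/
├── 0-0    # monitored site 0, instance 0
├── 0-1    # monitored site 0, instance 1
├── 1-0    # monitored site 1, instance 0
├── 0      # unmonitored site 0 (one instance)
└── 1      # unmonitored site 1 (one instance)
```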
If you use the Wang et al. data format (i.e., each line represents a new packet, with the relative time and direction separated by a space), then `wang_to_varcnn.py` supports it out of the box. Otherwise, you will need to modify `wang_to_varcnn.py`, or you can write your own glue code to convert your data to the Wang et al. format.
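To make the expected input concrete, here is a minimal parsing sketch for a single Wang-format trace file. It is not part of the repository; the function name is made up, and the +1/-1 sign convention for outgoing/incoming packets is an assumption:

```python
def parse_wang_trace(path):
    """Read one Wang-format trace: each line is '<relative_time> <direction>'."""
    times, directions = [], []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            times.append(float(parts[0]))             # e.g., 0.0, 0.52, 1.21, ...
            directions.append(int(float(parts[1])))   # e.g., +1 (outgoing), -1 (incoming)
    return times, directions
```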
After setting up the data and specifying the parameters in `config.json`, you can run all parts of our code just by issuing a `python run_model.py` command. After that, our programs will be called in the following sequence:
1. `wang_to_varcnn.py`: This parses the `data_dir` folder; extracts direction, time, metadata, and labels; and stores all the monitored and unmonitored traces in `all_closed_world.npz` and `all_open_world.npz`, respectively, in the `data_dir` folder.
2. `preprocess_data.py`: This uses the data in `all_closed_world.npz` to pick a random `num_mon_inst_train` and `num_mon_inst_test` instances of each of the `num_mon_sites` monitored sites for the training and test sets, respectively. It also performs a similar random split for the unmonitored sites (using the `all_open_world.npz` file) and preprocesses all of these traces to scale the metadata, change to inter-packet timing, etc. Finally, it saves the direction data, time data, metadata, and labels to `.h5` files to conserve RAM during the training process.
3. `run_model.py`: This is the main file, which first calls the prior two files. Next, it loads the model architectures from either `var_cnn.py` or `df.py`, trains the models, saves their predictions, and calls `evaluate.py` for evaluation.
   - During training, `data_generator.py` generates new batches of data in parallel. Since large datasets can contain hundreds of thousands of traces, `data_generator.py` uses the `.h5` files to access the traces for one batch without loading the entire dataset into memory (see the sketch after this list).
4. `evaluate.py`: This first calculates metrics for each of the in-training combinations specified in `mixture`. Then, it averages each of their predictions together and reports metrics for the overall out-of-training ensemble. It saves all metrics to the `job_result.json` file.
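As a rough illustration of the batching approach, a Keras `Sequence` can serve batches straight from an `.h5` file. This is a hypothetical sketch, not the repository's actual `data_generator.py`; the file name `train.h5` and the dataset keys `dir_seq` and `labels` are made up:

```python
import h5py
import numpy as np
from keras.utils import Sequence  # available in Keras 2.0.8, as in our setup


class H5BatchGenerator(Sequence):
    """Serves one batch at a time from an .h5 file, so the full
    dataset never has to be resident in memory."""

    def __init__(self, h5_path, batch_size=50):
        self.h5_path = h5_path
        self.batch_size = batch_size
        with h5py.File(self.h5_path, 'r') as f:
            self.num_traces = f['dir_seq'].shape[0]

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(self.num_traces / float(self.batch_size)))

    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = min(start + self.batch_size, self.num_traces)
        with h5py.File(self.h5_path, 'r') as f:
            # h5py reads only the requested slice from disk
            x = f['dir_seq'][start:end]
            y = f['labels'][start:end]
        return x, y


# Usage sketch: model.fit_generator(H5BatchGenerator('train.h5'), ...)
```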
`config.json` provides the configuration settings to all the other programs. We describe its parameters in further detail below:
- `data_dir`: This relative path provides the location of the "raw" packet sequences (e.g., the "0", "1", "0-0", "0-1" files in Wang et al.'s dataset). It also later stores the `all_closed_world.npz` and `all_open_world.npz` files generated by `wang_to_varcnn.py` and the `.h5` data files generated by `preprocess_data.py`.
- `predictions_dir`: After training the model, `run_model.py` generates predictions for the test set and stores them in this directory. `evaluate.py` later uses them to calculate test metrics.
- `num_mon_sites`: The number of monitored websites. Each of the `num_mon_sites` sites in `data_dir` must have at least `num_mon_inst_train` + `num_mon_inst_test` instances.
- `num_mon_inst_train`: The number of monitored instances used for training.
- `num_mon_inst_test`: The number of monitored instances used for testing.
- `num_unmon_sites_train`: The number of unmonitored sites used for training. Each site has one instance.
- `num_unmon_sites_test`: The number of unmonitored sites used for testing. Each site has one instance, and these unmonitored websites are different from those used for training.
- `model_name`: The model name, either "var-cnn" or "df".
- `batch_size`: The batch size used during training. For Var-CNN, we found that a batch size of 50 works well. The recommended batch size for DF is 128.
- `mixture`: The mixture of ensembles used during training and evaluation. Each of the inner arrays represents models combined in-training. `run_model` will save the predictions for every such in-training combination. Subsequently, `evaluate_ensemble` will report metrics for these individual models as well as for the overall out-of-training ensemble (i.e., the average of the individual predictions). Note: this functionality only works with Var-CNN (in fact, deep fingerprinting will automatically default to using `[["dir"]]`). Also, do not use two in-training combinations with the same components, as their prediction files will be overwritten. Default: `[["dir", "metadata"], ["time", "metadata"]]` for Var-CNN.
- `seq_length`: The length of the input sequence fed into the CNN (default: 5000). We use this parameter right from the start, when processing the raw data.
- `df_epochs`: The number of epochs used to train DF (default: 30).
- `var_cnn_max_epochs`: The maximum number of epochs used to train Var-CNN (default: 150). The `EarlyStopping` callback often cuts off training much sooner -- whenever validation accuracy fails to increase.
- `var_cnn_base_patience`: The "patience" (i.e., the number of epochs without validation accuracy improvement) before we decrease the learning rate of Var-CNN and stop training (default: 5). We implement this functionality in the `ReduceLROnPlateau` and `EarlyStopping` callbacks inside `var_cnn.py`.
- `dir_dilations`: Whether to use dilations with the direction ResNet (default: true).
- `time_dilations`: Whether to use dilations with the time ResNet (default: true).
- `inter_time`: Whether to use the inter-packet time (i.e., the time between two consecutive packets) or the relative time (i.e., the time since the first packet) for timing data (default: true, i.e., we do use inter-packet time). For example, relative times 0.0, 0.5, 1.2 correspond to inter-packet gaps of 0.5 and 0.7.
- `scale_metadata`: Whether to scale the metadata to zero mean and unit variance (default: true).
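Putting it all together, a `config.json` might look like the following. This is only an illustrative sketch: the paths and the site/instance counts are placeholder values, while the remaining fields use the defaults and recommendations mentioned above.

```json
{
  "data_dir": "data/",
  "predictions_dir": "predictions/",
  "num_mon_sites": 100,
  "num_mon_inst_train": 90,
  "num_mon_inst_test": 10,
  "num_unmon_sites_train": 9000,
  "num_unmon_sites_test": 1000,
  "model_name": "var-cnn",
  "batch_size": 50,
  "mixture": [["dir", "metadata"], ["time", "metadata"]],
  "seq_length": 5000,
  "df_epochs": 30,
  "var_cnn_max_epochs": 150,
  "var_cnn_base_patience": 5,
  "dir_dilations": true,
  "time_dilations": true,
  "inter_time": true,
  "scale_metadata": true
}
```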
If you find Var-CNN useful in your research, please consider citing:
    @article{bhat19,
      title={{Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning}},
      author={Bhat, Sanjit and Lu, David and Kwon, Albert and Devadas, Srinivas},
      journal={Proceedings on Privacy Enhancing Technologies},
      volume={4},
      pages={292--310},
      year={2019}
    }
sanjit.bhat (at) gmail.com
davidboxboro (at) gmail.com
kwonal (at) mit.edu
devadas (at) mit.edu
Any discussions, suggestions, and questions are welcome!