This repository contains code for the following paper:
Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning (PETS 2019, link to presentation)
Sanjit Bhat, David Lu, Albert Kwon, and Srini Devadas.
- Ensure that you have a machine with a functioning NVIDIA GPU. The model will take significantly longer to run on a CPU.
- Make sure you have the TensorFlow/Keras deep learning stack installed. For detailed instructions, see this link under the "Software Setup" section. For our experiments, we used Ubuntu 16.04 LTS, CUDA 8.0, CuDNN v6, and TensorFlow 1.3.0 as a backend for Keras 2.0.8.
- To install all required Python packages, simply issue `pip install -r requirements.txt`.
The first step in running our model is to place a sufficient number of raw packet sequences in the `data_dir` folder. Each monitored website needs to have at least `num_mon_inst_train` + `num_mon_inst_test` instances, and there need to be at least `num_unmon_sites_train` + `num_unmon_sites_test` unmonitored sites.
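For illustration, using the file-naming convention of Wang et al.'s dataset (described under `data_dir` below), a small `data_dir` might look like the following. This layout is only an assumed example; monitored traces are named `<site>-<instance>` and unmonitored traces carry a single number:

```
data_dir/
├── 0-0    # monitored site 0, instance 0
├── 0-1    # monitored site 0, instance 1
├── 1-0    # monitored site 1, instance 0
├── 0      # unmonitored site 0 (one instance)
└── 1      # unmonitored site 1 (one instance)
```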
If you use the Wang et al. data format (i.e., each line represents a new packet, with the relative time and direction separated by a space), then `wang_to_varcnn.py` supports it out of the box. Otherwise, you will need to modify `wang_to_varcnn.py`, or you can write your own glue code to convert your data to the Wang et al. format.
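To make the expected input concrete, here is a minimal parsing sketch for a single Wang-format trace file. It is not part of the repository; the function name is made up, and the +1/-1 sign convention for outgoing/incoming packets is an assumption:

```python
def parse_wang_trace(path):
    """Read one Wang-format trace: each line is '<relative_time> <direction>'."""
    times, directions = [], []
    with open(path) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            times.append(float(parts[0]))             # e.g., 0.0, 0.52, 1.21, ...
            directions.append(int(float(parts[1])))   # e.g., +1 (outgoing), -1 (incoming)
    return times, directions
```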
After setting up the data and specifying the parameters in `config.json`, you can run all parts of our code just by issuing a `python run_model.py` command. After that, our programs will be called in the following sequence:
1. `wang_to_varcnn.py`: This parses the `data_dir` folder; extracts direction, time, metadata, and labels; and stores all the monitored and unmonitored traces in `all_closed_world.npz` and `all_open_world.npz`, respectively, in the `data_dir` folder.
2. `preprocess_data.py`: This uses the data in `all_closed_world.npz` to pick a random `num_mon_inst_train` and `num_mon_inst_test` instances of each of the `num_mon_sites` monitored sites for the training and test sets, respectively. It also performs a similar random split for the unmonitored sites (using the `all_open_world.npz` file) and preprocesses all of these traces to scale the metadata, change to inter-packet timing, etc. Finally, it saves the direction data, time data, metadata, and labels to `.h5` files to conserve RAM during the training process.
3. `run_model.py`: This is the main file, which first calls the prior two files. Next, it loads the model architectures from either `var_cnn.py` or `df.py`, trains the models, saves their predictions, and calls `evaluate.py` for evaluation.
   - During training, `data_generator.py` generates new batches of data in parallel. Since large datasets can contain hundreds of thousands of traces, `data_generator.py` uses the `.h5` files to access the traces for one batch without loading the entire dataset into memory (see the sketch after this list).
4. `evaluate.py`: This first calculates metrics for each of the in-training combinations specified in `mixture`. Then, it averages each of their predictions together and reports metrics for the overall out-of-training ensemble. It saves all metrics to the `job_result.json` file.
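As a rough illustration of the batching approach, a Keras `Sequence` can serve batches straight from an `.h5` file. This is a hypothetical sketch, not the repository's actual `data_generator.py`; the file name `train.h5` and the dataset keys `dir_seq` and `labels` are made up:

```python
import h5py
import numpy as np
from keras.utils import Sequence  # available in Keras 2.0.8, as in our setup


class H5BatchGenerator(Sequence):
    """Serves one batch at a time from an .h5 file, so the full
    dataset never has to be resident in memory."""

    def __init__(self, h5_path, batch_size=50):
        self.h5_path = h5_path
        self.batch_size = batch_size
        with h5py.File(self.h5_path, 'r') as f:
            self.num_traces = f['dir_seq'].shape[0]

    def __len__(self):
        # Number of batches per epoch
        return int(np.ceil(self.num_traces / float(self.batch_size)))

    def __getitem__(self, idx):
        start = idx * self.batch_size
        end = min(start + self.batch_size, self.num_traces)
        with h5py.File(self.h5_path, 'r') as f:
            # h5py reads only the requested slice from disk
            x = f['dir_seq'][start:end]
            y = f['labels'][start:end]
        return x, y


# Usage sketch: model.fit_generator(H5BatchGenerator('train.h5'), ...)
```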
`config.json` provides the configuration settings to all the other programs. We describe its parameters in further detail below:
- `data_dir`: This relative path provides the location of the "raw" packet sequences (e.g., the "0", "1", "0-0", "0-1" files in Wang et al.'s dataset). It also later stores the `all_closed_world.npz` and `all_open_world.npz` files generated by `wang_to_varcnn.py` and the `.h5` data files generated by `preprocess_data.py`.
- `predictions_dir`: After training the model, `run_model.py` generates predictions for the test set and stores them in this directory. `evaluate.py` later uses them to calculate test metrics.
- `num_mon_sites`: The number of monitored websites. Each of the `num_mon_sites` sites in `data_dir` must have at least `num_mon_inst_train` + `num_mon_inst_test` instances.
- `num_mon_inst_train`: The number of monitored instances used for training.
- `num_mon_inst_test`: The number of monitored instances used for testing.
- `num_unmon_sites_train`: The number of unmonitored sites used for training. Each site has one instance.
- `num_unmon_sites_test`: The number of unmonitored sites used for testing. Each site has one instance, and these unmonitored websites are different from those used for training.
- `model_name`: The model name, either "var-cnn" or "df".
- `batch_size`: The batch size used during training. For Var-CNN, we found that a batch size of 50 works well. The recommended batch size for DF is 128.
- `mixture`: The mixture of ensembles used during training and evaluation. Each of the inner arrays represents models combined in-training. `run_model` will save the predictions for every such in-training combination. Subsequently, `evaluate_ensemble` will report metrics for these individual models as well as for the overall out-of-training ensemble (i.e., the average of the individual predictions). Note: this functionality only works with Var-CNN (in fact, deep fingerprinting will automatically default to using `[["dir"]]`). Also, do not use two in-training combinations with the same components, as their prediction files will be overwritten. Default: `[["dir", "metadata"], ["time", "metadata"]]` for Var-CNN.
- `seq_length`: The length of the input sequence fed into the CNN (default: 5000). We use this parameter right from the start, when processing the raw data.
- `df_epochs`: The number of epochs used to train DF (default: 30).
- `var_cnn_max_epochs`: The maximum number of epochs used to train Var-CNN (default: 150). The `EarlyStopping` callback often cuts off training much sooner -- whenever validation accuracy fails to increase.
- `var_cnn_base_patience`: The "patience" (i.e., the number of epochs without validation accuracy improvement) before we decrease the learning rate of Var-CNN and stop training (default: 5). We implement this functionality in the `ReduceLROnPlateau` and `EarlyStopping` callbacks inside `var_cnn.py`.
- `dir_dilations`: Whether to use dilations with the direction ResNet (default: true).
- `time_dilations`: Whether to use dilations with the time ResNet (default: true).
- `inter_time`: Whether to use the inter-packet time (i.e., the time between two consecutive packets) or the relative time (i.e., the time since the first packet) for timing data (default: true, i.e., we do use inter-packet time). For example, relative times 0.0, 0.5, 1.2 correspond to inter-packet gaps of 0.5 and 0.7.
- `scale_metadata`: Whether to scale the metadata to zero mean and unit variance (default: true).
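Putting it all together, a `config.json` might look like the following. This is only an illustrative sketch: the paths and the site/instance counts are placeholder values, while the remaining fields use the defaults and recommendations mentioned above.

```json
{
  "data_dir": "data/",
  "predictions_dir": "predictions/",
  "num_mon_sites": 100,
  "num_mon_inst_train": 90,
  "num_mon_inst_test": 10,
  "num_unmon_sites_train": 9000,
  "num_unmon_sites_test": 1000,
  "model_name": "var-cnn",
  "batch_size": 50,
  "mixture": [["dir", "metadata"], ["time", "metadata"]],
  "seq_length": 5000,
  "df_epochs": 30,
  "var_cnn_max_epochs": 150,
  "var_cnn_base_patience": 5,
  "dir_dilations": true,
  "time_dilations": true,
  "inter_time": true,
  "scale_metadata": true
}
```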
If you find Var-CNN useful in your research, please consider citing:
    @article{bhat19,
      title={{Var-CNN: A Data-Efficient Website Fingerprinting Attack Based on Deep Learning}},
      author={Bhat, Sanjit and Lu, David and Kwon, Albert and Devadas, Srinivas},
      journal={Proceedings on Privacy Enhancing Technologies},
      volume={4},
      pages={292--310},
      year={2019}
    }
sanjit.bhat (at) gmail.com
davidboxboro (at) gmail.com
kwonal (at) mit.edu
devadas (at) mit.edu
Any discussions, suggestions, and questions are welcome!