Please find the paper here.
Requirements:
- Python 3.7
- We use Pipenv to handle Python package installations. Note: we use torch=1.1.0 for compatibility with Pipenv. The Pipfile.lock can be used to ensure that you are running with exactly the same dependencies as we used.
- OSX users may need to install libomp, e.g. with Homebrew: brew install libomp
- A MongoDB database is used to store results, and Sacred (Github) is used to place useful information in this database.
Please see the experiments/workshop directory for a detailed description of how to reproduce the plots found in our NeurIPS 2019 Privacy in Machine Learning (PiML) Workshop submission.
preprocess_data.py expects the UCI datasets (here) to be in the data/abalone and data/adult directories before pre-processing. For each dataset, it then creates a labels file, y.csv, and several data files: x_ordinal.csv, x_scaled.csv and x_scaled_ordinal.csv.
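As a quick sanity check after pre-processing, the following minimal sketch lists any of the four output files named above that are missing. It assumes the outputs are written into the same data/abalone and data/adult directories as the raw data, which is not confirmed here; the helper name missing_outputs is hypothetical and not part of this repository.

```python
from pathlib import Path

# Output file names as listed in the description above.
EXPECTED_FILES = ("y.csv", "x_ordinal.csv", "x_scaled.csv", "x_scaled_ordinal.csv")


def missing_outputs(dataset_dir):
    """Return the expected output files that are absent from dataset_dir."""
    root = Path(dataset_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    # Assumption: outputs are written next to the raw data in these directories.
    for dataset in ("data/abalone", "data/adult"):
        missing = missing_outputs(dataset)
        status = "ok" if not missing else "missing: " + ", ".join(missing)
        print(f"{dataset}: {status}")
```

If the script writes its outputs elsewhere, pass that directory to missing_outputs instead.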
The Sacred (Github) logging functionality relies on a Mongo database, by default assumed to be at localhost:9001. If the database is on another machine, ssh port forwarding is an easy way to get this working without changing the code. (E.g. for mlg people, if the database is on Hinton: ssh hinton -L 9001:localhost:9001 will forward port 9001 on your machine to port 9001 on Hinton. Scripts to bring the database up and down are in /scratch/mjh252/mongo.)
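Before launching experiments, it can help to confirm that something is listening on the expected port, for instance after setting up the ssh tunnel. The sketch below is an illustration using only the standard library; the function name mongo_port_open is hypothetical, and the check only confirms that a TCP connection succeeds, not that the listener is actually MongoDB.

```python
import socket


def mongo_port_open(host="localhost", port=9001, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, unreachable, timed out) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if mongo_port_open():
        print("Port 9001 is reachable.")
    else:
        print("Nothing reachable on localhost:9001; is the database or tunnel up?")
```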
Omniboard (Github) is an excellent way to view results in a Sacred database.
Deleting database entries can be done as follows, in the mongo shell:
// use the correct database name
use sacred
// apply the correct filter
db.runs.remove({'config.experiment_tag': '....'})
// remove metrics, linked by run id
db.metrics.remove({run_id: {$gt: 333, $lt: 600}})
Beware: whilst this removes runs from the Omniboard view, it will not actually delete the log files!
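The same clean-up can be sketched from Python. This is an assumption-laden illustration, not part of this repository: it assumes pymongo is installed, that the database is named sacred with runs and metrics collections as in the shell commands, and that metrics reference runs through a run_id field. Instead of a hand-picked id range, it looks up the ids of the matching runs and deletes exactly their metrics. The function name delete_tagged_runs is hypothetical.

```python
def delete_tagged_runs(db, experiment_tag):
    """Delete runs with the given tag and their metrics.

    db is expected to behave like a pymongo Database.
    Returns (runs_deleted, metrics_deleted).
    """
    run_filter = {"config.experiment_tag": experiment_tag}
    # Collect the run ids first, so metrics can still be matched once the runs are gone.
    run_ids = [doc["_id"] for doc in db.runs.find(run_filter, {"_id": 1})]
    metrics_deleted = db.metrics.delete_many({"run_id": {"$in": run_ids}}).deleted_count
    runs_deleted = db.runs.delete_many(run_filter).deleted_count
    return runs_deleted, metrics_deleted


# Example usage (requires pymongo and a running database; the tag is a placeholder):
#   from pymongo import MongoClient
#   client = MongoClient("localhost", 9001)
#   print(delete_tagged_runs(client["sacred"], "<your experiment tag>"))
```

As with the shell commands, this only removes database entries and leaves any log files on disk untouched.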