Variations-of-Cumulative-Noise-Addition

Cumulative noise addition is a state-of-the-art noise addition method for preserving the privacy of data while maintaining a good level of accuracy. When the method is applied to a data stream, accuracy can drop drastically over time because of the large amount of noise that accumulates in the stream. We experimented with seven variations of cumulative noise addition, each incorporating a different technique to control the noise level. The code is written in Clojure and is run via Jupyter notebooks.

Primary language: Clojure. License: GNU Affero General Public License v3.0 (AGPL-3.0).

Optimizing the Trade-off Between Data Privacy and Classification Accuracy in Data Stream Mining

This repository contains experiments with variations of cumulative noise addition for privacy-preserving data stream mining. Adaptive random forest is used for classification, and known I/O attacks are used to measure privacy. The objective of this work is to control the maximum noise level of cumulative noise addition by incorporating different techniques.
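To make the idea concrete, here is a minimal sketch that is not taken from this repository's notebooks. It assumes that cumulative noise addition perturbs the i-th record with the running sum of independent zero-mean Gaussian draws. All function names (gaussian-steps, cumulative-noise, perturb, bounded-cumulative-noise) are hypothetical, and the clamping shown at the end is only an illustrative way of capping the noise level. It is not claimed to be one of the seven variations studied here.

```clojure
(ns cumulative-noise-sketch
  (:import (java.util Random)))

(defn gaussian-steps
  "Lazy infinite seq of independent N(0, sigma^2) draws from the given Random."
  [^Random rng sigma]
  (repeatedly #(* sigma (.nextGaussian rng))))

(defn cumulative-noise
  "Running sum of the steps; element i is the noise applied to record i."
  [steps]
  (reductions + steps))

(defn perturb
  "Adds the i-th noise value to the i-th value of a numeric stream."
  [values noise]
  (map + values noise))

(defn clamp
  "Restricts x to the interval [-limit, limit]."
  [limit x]
  (max (- limit) (min limit x)))

(defn bounded-cumulative-noise
  "ASSUMPTION (illustrative only): running sum clamped to [-limit, limit]
   after every step, so the accumulated noise can never exceed the limit."
  [limit steps]
  ;; reductions with an init value emits the init first, so drop it.
  (rest (reductions (fn [acc step] (clamp limit (+ acc step))) 0.0 steps)))

;; Usage: perturb a constant single-attribute stream of 1000 records.
(let [rng    (Random. 42)
      stream (repeat 1000 0.5)
      noise  (cumulative-noise (gaussian-steps rng 0.1))]
  (doall (take 5 (perturb stream noise))))
```

In the unbounded version the variance of the accumulated noise grows linearly with the number of records seen, which is why accuracy degrades as the stream gets longer. Any technique that limits the running sum trades some of that privacy for accuracy.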

Quickstart

If you have Docker installed, you can run the experiments in this codebase by executing make jupyter, opening the returned URL in a web browser, and running the provided Jupyter notebooks. This has only been tested on an Ubuntu 16.04 host running Docker 17.05.0-ce.

You will need to run the notebooks in the "dataset-construction" sub-folder before the notebooks that depend on those datasets.

Dependencies

  • Java (>= 1.8.0)
  • Leiningen (>= 2.0)

Further Usage

See the Makefile for the available commands.