/LLP_PPDM

Prototype designed to prevent private data from leaking during a machine learning pipeline.

Primary language: Python

Semi-supervised Learning & Epsilon-Differential Privacy Protection

Paper Abstract Website

This project combines semi-supervised learning with a privacy-preserving data mining (PPDM) algorithm. The combined method is reported to outperform the earlier privacy-preserving methods it was compared against.

Learning from Label Proportions

Learning from Label Proportions (LLP) is a framework that enables semi-supervised learning for classification algorithms. I implemented LLP on logistic regression, so the model learns from the proportion of each label within a bag of examples rather than from individual labels. See: LLP dataset
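To make the idea concrete, here is a minimal sketch of logistic regression trained only on bag proportions. It is not the repository's code: the squared-error bag loss, the plain gradient descent loop, and the names `make_bags`, `bag_loss`, and `train_llp` are all assumptions chosen for illustration, and the toy data is synthetic.

```python
import math
import random


def sigmoid(z):
    # Numerically safe logistic function
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def make_bags(seed=0, fractions=(0.1, 0.3, 0.7, 0.9), bag_size=20):
    # Hypothetical toy data: one feature, positives centred at +1.5, negatives at -1.5
    rng = random.Random(seed)
    bags, proportions = [], []
    for frac in fractions:
        n_pos = round(frac * bag_size)
        labels = [1] * n_pos + [0] * (bag_size - n_pos)
        bags.append([[rng.gauss(1.5 if y else -1.5, 1.0)] for y in labels])
        proportions.append(n_pos / bag_size)  # the learner only ever sees this
    return bags, proportions


def bag_loss(w, b, bags, proportions):
    # Mean squared gap between the predicted and the known bag proportions
    total = 0.0
    for bag, p in zip(bags, proportions):
        mean_pred = sum(predict(w, b, x) for x in bag) / len(bag)
        total += (mean_pred - p) ** 2
    return total / len(bags)


def train_llp(bags, proportions, lr=0.5, epochs=500):
    dim = len(bags[0][0])
    w, b = [0.0] * dim, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for bag, p in zip(bags, proportions):
            preds = [predict(w, b, x) for x in bag]
            gap = sum(preds) / len(bag) - p
            for x, q in zip(bag, preds):
                # Chain rule: each example contributes q * (1 - q) / len(bag)
                coef = 2.0 * gap * q * (1.0 - q) / len(bag)
                for j in range(dim):
                    grad_w[j] += coef * x[j]
                grad_b += coef
        w = [wj - lr * gj / len(bags) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / len(bags)
    return w, b


bags, proportions = make_bags()
w, b = train_llp(bags, proportions)
print("loss before:", bag_loss([0.0], 0.0, bags, proportions))
print("loss after: ", bag_loss(w, b, bags, proportions))
```

The individual labels are used only to generate the toy bags; training touches nothing but the features and one proportion per bag, which is the property LLP relies on.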

Differential Privacy

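Random noise sampled from a Laplace distribution is added to the data. When the noise scale is calibrated to the sensitivity of the released values and the privacy budget epsilon, this provides epsilon-differential privacy. See: LLP dataset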
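As an illustration of the Laplace mechanism described above, the following sketch adds noise with scale `sensitivity / epsilon` to a list of values. It is an assumption about how such a step could look, not the repository's implementation: the function names are hypothetical, `sensitivity` is taken to be the L1 sensitivity of the whole released vector, and the `1 / bag_size` sensitivity for a bag proportion assumes that one individual can change at most one label in the bag.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def laplace_mechanism(values, sensitivity, epsilon, rng=None):
    """Add Laplace(sensitivity / epsilon) noise to every entry of `values`."""
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    if rng is None:
        rng = random.Random()
    scale = sensitivity / epsilon
    return [v + laplace_noise(scale, rng) for v in values]


def privatize_proportion(proportion, bag_size, epsilon, rng=None):
    # Assumption: flipping one label moves the proportion by at most 1 / bag_size.
    noisy = laplace_mechanism([proportion], 1.0 / bag_size, epsilon, rng)[0]
    # Clipping is post-processing, so it does not weaken the privacy guarantee.
    return min(1.0, max(0.0, noisy))


rng = random.Random(0)
print(privatize_proportion(0.3, bag_size=20, epsilon=1.0, rng=rng))
```

A smaller `epsilon` means a larger noise scale and stronger privacy, at the cost of noisier proportions for the learner.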

Experiments

The model was tested on the Adult dataset (income classification) and on an Instagram hostile comment dataset. In both cases the model converged after enough Laplace noise was added to both the label proportions and the data matrix.

Proof

[Five images; content not recoverable from the text]

Thanks

Thanks to Professor Aron and his PhD students for their support.