mltrace-ifc-demo

Description: this demo is for CS294 (Privacy-Preserving Systems).

This tutorial builds a training and testing pipeline for a toy ML prediction problem: predicting whether a passenger in a NYC taxicab ride will give the driver a nontrivial tip. This is a binary classification task. A nontrivial tip is arbitrarily defined as one greater than 10% of the total fare (before tip). We evaluate the model by its F1 score. The task is modeled after the one described in toy-ml-pipeline.
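As a rough illustration of the label definition and the metric, the following Python sketch derives the binary label from a fare and a tip and computes F1 by hand. The function names and the sample rides are made up for this example and are not taken from the repository; treating rides with a non-positive fare as negative is likewise an assumption.

```python
def has_nontrivial_tip(fare_before_tip, tip, threshold=0.10):
    """Return True when the tip exceeds `threshold` of the pre-tip fare."""
    if fare_before_tip <= 0:
        # Assumption: rides without a positive fare are labeled negative.
        return False
    return tip > threshold * fare_before_tip


def f1_score(y_true, y_pred):
    """F1 = harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both 0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Illustrative rides as (fare before tip, tip); not real taxi data.
rides = [(10.0, 2.0), (20.0, 1.0), (8.0, 0.0), (15.0, 3.0)]
labels = [has_nontrivial_tip(fare, tip) for fare, tip in rides]

# Hypothetical model predictions for the same four rides.
predictions = [True, True, False, False]
score = f1_score(labels, predictions)
```

F1 is a reasonable choice here because it balances precision and recall on the positive class rather than rewarding a model for always predicting the majority label.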

The purpose of this demo is to show how we have incorporated information flow control techniques to help developers retract the data of customers who request data deletion. In this demo, we:

Run training pipeline on Jan 2020 data
Run inference “weekly” from Feb 1, 2020 to May 31, 2020
Delete user_109 label (not used in training)
    “Weekly” inference will still run successfully
Delete user_139 label (used in training)
    Use 30-second threshold (default is 30 days)
    “Weekly” inference will throw errors
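The behavior in the steps above can be sketched in plain Python. This is not mltrace's API: the function, the exception, and the data structures are hypothetical, and the sketch assumes the threshold is a grace period after a deletion request, after which any run that depends on a model trained on the deleted label is refused.

```python
from datetime import datetime, timedelta


class StaleDependencyError(RuntimeError):
    """Hypothetical error: the model depends on data deleted too long ago."""


def check_inference_allowed(training_users, deletions, now,
                            threshold=timedelta(days=30)):
    """Raise if the model was trained on a label whose deletion request
    is older than `threshold` (assumed grace-period semantics)."""
    overdue = sorted(
        user for user, requested_at in deletions.items()
        if user in training_users and now - requested_at > threshold
    )
    if overdue:
        raise StaleDependencyError(
            "model depends on deleted labels: " + ", ".join(overdue)
        )


# Illustrative state: only user_139's label was used in training.
training_users = {"user_139"}
requested = datetime(2020, 6, 1)

# Deleting a label that was never used in training does not block inference.
check_inference_allowed(training_users, {"user_109": requested},
                        now=requested + timedelta(days=60))

# Deleting a label used in training blocks inference once the
# (shortened) 30-second threshold has passed.
try:
    check_inference_allowed(training_users, {"user_139": requested},
                            now=requested + timedelta(seconds=31),
                            threshold=timedelta(seconds=30))
    blocked = False
except StaleDependencyError:
    blocked = True
```

Shortening the threshold to 30 seconds, as the demo does, only makes the failure observable quickly; under the assumed semantics the check is the same with the 30-day default.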

Experiments

vary number of labels, measure runtime & space
vary number of deleted labels, measure runtime & space
on committing new labels, vary cardinality and measure runtime
on propagating through pipeline, vary cardinality and measure runtime
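One way to collect the runtime and space numbers listed above is a small harness built on the standard library. The sketch below is an assumption about how such a measurement could be set up, not the code used for the experiments; `commit_labels` is a hypothetical stand-in workload and should be replaced with the real operation under test.

```python
import statistics
import time
import tracemalloc


def measure(fn, *args, repeats=5):
    """Return (median seconds, max peak bytes) over `repeats` calls to fn."""
    times, peaks = [], []
    for _ in range(repeats):
        tracemalloc.start()
        start = time.perf_counter()
        fn(*args)
        times.append(time.perf_counter() - start)
        _, peak = tracemalloc.get_traced_memory()  # (current, peak) in bytes
        tracemalloc.stop()
        peaks.append(peak)
    return statistics.median(times), max(peaks)


def commit_labels(n):
    """Hypothetical stand-in workload: build n label records in memory."""
    return {f"user_{i}": i % 2 == 0 for i in range(n)}


# Vary the number of labels and record runtime and peak memory.
results = [(n, *measure(commit_labels, n)) for n in (1_000, 10_000, 100_000)]
for n, seconds, peak_bytes in results:
    print(f"{n:>7} labels: {seconds:.4f} s, peak {peak_bytes / 1024:.1f} KiB")
```

Note that tracemalloc adds overhead to every allocation, so timings taken while it is active are inflated; for publishable numbers it may be better to measure runtime and memory in separate runs.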

TODO

clean up deletion experiment
run each experiment many times
put in paper

shreyashankar/mltrace-ifc-demo
