aks2203/poisoning-benchmark

Black-box setting

ArshadIram opened this issue · 4 comments

Hi, I would like to thank you for providing a benchmark for fair analysis. However, I would like to know more about the black-box setting.
You mentioned in the paper that poisons are crafted using the known model and tested on two unknown models, averaging the results.

I do not clearly understand this setting. It would be great if you could clarify it.

Is the dataset known to the attacker?

Many thanks

Hi there,
Yes, the dataset is known to the attacker. In the CIFAR-10 case: The victim's model architecture, however, is not known. So the attacker uses a ResNet-18 to craft poisons, but to evaluate, we assume the victim is using a MobileNet or a VGG so we test both and average the results. With TinyImagenet, the attacker uses a VGG to craft poisons, and evaluation is done on a ResNet-34 and a MobileNet. Does that help?

So in the black-box setting, the dataset is known to the attacker. The attacker crafts the poison instances, which are provided to the victim.
In the white-box setting, the dataset is known, and the victim model is also known.

Are these settings consistent across transfer learning and training from scratch?

No. For white-box tests in the transfer learning benchmarks, we use the same frozen feature extractor that is given to the attacker for evaluation.

However, since the victim is training from a random initialization, there is no white-box setting. When training from scratch, there is only one scenario: the attacker uses a ResNet-18, and the poisoned dataset is evaluated by averaging the attack success rate when training models of three architectures (including ResNet-18). This example is for CIFAR-10.
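Putting the protocols from this thread side by side may help; the summary below is a plain data structure for illustration, not the benchmark's code. Note that for the from-scratch case the thread only says "three architectures (including ResNet-18)"; listing MobileNet and VGG as the other two is an assumption based on the black-box victim list above.

```python
# Hedged summary of the evaluation protocols described in this thread.
# Illustrative only; "MobileNet"/"VGG" in the from-scratch entry are
# assumed from the black-box list, not stated explicitly.
PROTOCOLS = {
    ("CIFAR-10", "transfer", "white-box"): {
        "craft_with": "frozen feature extractor given to the attacker",
        "evaluate_on": ["same frozen feature extractor"],
    },
    ("CIFAR-10", "transfer", "black-box"): {
        "craft_with": "ResNet-18",
        "evaluate_on": ["MobileNet", "VGG"],
    },
    ("CIFAR-10", "from-scratch", "single setting"): {
        "craft_with": "ResNet-18",
        "evaluate_on": ["ResNet-18", "MobileNet", "VGG"],  # assumed trio
    },
    ("TinyImageNet", "transfer", "black-box"): {
        "craft_with": "VGG",
        "evaluate_on": ["ResNet-34", "MobileNet"],
    },
}

# Each entry's success rate is averaged over its "evaluate_on" list.
for key, proto in PROTOCOLS.items():
    print(key, "->", len(proto["evaluate_on"]), "victim model(s)")
```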

Does that clarify things?

Many thanks for your answers.