Trusted-AI/adversarial-robustness-toolbox

Implementation of BadDet Poisoning Attack on Object Detectors

f4str opened this issue

f4str commented

Is your feature request related to a problem? Please describe.
BadDet is a backdoor poisoning attack on object detection models such as Faster R-CNN and YOLO. This attack is a generalization of BadNet (the dirty label backdoor attack) and has shown to be effective in inserting backdoor triggers to degrade models. This attack has four variations:

  1. Object Generation Attack (OGA)
  2. Regional Misclassification Attack (RMA)
  3. Global Misclassification Attack (GMA)
  4. Object Disappearance Attack (ODA)

Ideally, all four variants will eventually be implemented, but work will begin with the simple Regional Misclassification Attack (RMA), which changes the label of a detection.

Paper link: https://arxiv.org/abs/2205.14497
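
As a rough illustration (not the eventual ART API), the following NumPy sketch shows what RMA poisoning of a single image could look like: a small trigger patch is stamped inside each ground-truth bounding box and that box's label is switched to an attacker-chosen target class. The function name, argument names, and annotation format here are hypothetical.

```python
import numpy as np

def poison_rma(image, boxes, labels, trigger, target_class):
    """Hypothetical sketch of the Regional Misclassification Attack on one image.

    image:        H x W x C array
    boxes:        list of (x1, y1, x2, y2) ground-truth boxes
    labels:       list of integer class labels, one per box
    trigger:      small h x w x C backdoor patch
    target_class: label that every triggered box is switched to
    """
    poisoned = image.copy()
    th, tw = trigger.shape[:2]
    new_labels = []
    for (x1, y1, x2, y2), _label in zip(boxes, labels):
        # Stamp the trigger into the top-left corner of the box region,
        # clipping at the image border.
        y_end = min(y1 + th, poisoned.shape[0])
        x_end = min(x1 + tw, poisoned.shape[1])
        poisoned[y1:y_end, x1:x_end] = trigger[: y_end - y1, : x_end - x1]
        # Regional misclassification: relabel this detection only.
        new_labels.append(target_class)
    return poisoned, boxes, new_labels
```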

Describe the solution you'd like
Under the adversarial-robustness-toolbox/attacks/poisoning directory, a subdirectory bad_det will be created. In this directory, an abstract class BadDet will extend art.attacks.PoisoningAttackBlackBox and serve as the base for the four attack variants. Additional classes (e.g., BadDetOGA, BadDetRMA, etc.) will be created in the same directory, each extending the BadDet abstract class and implementing its respective attack.
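
A minimal sketch of that layout, assuming the PoisoningAttackBlackBox interface described above (the constructor argument and the exact poison() signature are placeholders, not the final design):

```python
# art/attacks/poisoning/bad_det/bad_det.py (proposed layout, sketch only)
from abc import abstractmethod

from art.attacks import PoisoningAttackBlackBox


class BadDet(PoisoningAttackBlackBox):
    """Abstract base class shared by the four BadDet variants."""

    def __init__(self, percent_poison=0.3):  # placeholder parameter
        super().__init__()
        self.percent_poison = percent_poison

    @abstractmethod
    def poison(self, x, y, **kwargs):
        """Return the poisoned images and annotations."""
        raise NotImplementedError


# art/attacks/poisoning/bad_det/bad_det_rma.py (one of the four variants)
class BadDetRMA(BadDet):
    """Regional Misclassification Attack: relabel detections that contain the trigger."""

    def poison(self, x, y, **kwargs):
        # Insert the trigger into each bounding box of the selected samples
        # and switch its label to the target class (see the RMA sketch above).
        ...
```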

Describe alternatives you've considered
One idea is to create a single class art.attacks.poisoning.BadDet that contains all four variants of the attack. However, this would make the class too complex, since each variant is a unique attack and would require too much optional-parameter parsing. It is simpler to keep the variants as separate classes.
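
A hypothetical call site under the per-variant layout, to show how each constructor stays limited to the parameters its variant needs (import path and parameter names are illustrative only):

```python
from art.attacks.poisoning.bad_det import BadDetRMA  # proposed path, not yet existing

# x_train: training images, y_train: per-image bounding boxes and labels
attack = BadDetRMA(percent_poison=0.3)  # RMA-specific options would go here
x_poisoned, y_poisoned = attack.poison(x_train, y_train)
```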

Additional context
It might be worth breaking this into multiple issues, one for each attack variant, since each variant of BadDet is essentially its own unique attack.