Azure/counterfit

support for non probability outputs

priamai opened this issue · 5 comments

Hi there,
very nice project. Is there a plan to also implement attacks that work on label-only output (no probabilities) or in a limited API query setting?
There was an article here a while ago which would be nice to have.
Is this something that should be implemented in the ART component?

Okay, maybe I am not interpreting the documentation correctly. For example, the HopSkipJump attack works on predicted labels, not probabilities, but in the creditfraud example:
https://github.com/Azure/counterfit/blob/main/demo/WEBINAR-DEMO-2.md
the target is designed with output probabilities.
Would be nice to get some clarity there.

You effectively translate probabilities to labels depending on what a model gives you back in outputs_to_labels.
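
As a rough illustration of that translation (a sketch only, not Counterfit's actual outputs_to_labels implementation; the function name and the class list below are hypothetical), picking the label with the highest probability might look like this:

```python
from typing import List, Sequence


def probabilities_to_labels(
    outputs: Sequence[Sequence[float]],
    classes: Sequence[str],
) -> List[str]:
    """Map each row of class probabilities to the label with the highest score.

    Hypothetical helper for illustration; it does not call any Counterfit API.
    """
    labels = []
    for row in outputs:
        if len(row) != len(classes):
            raise ValueError("each output row needs one score per class")
        # Index of the largest probability in this row (first one wins on ties).
        best = max(range(len(row)), key=lambda i: row[i])
        labels.append(classes[best])
    return labels


if __name__ == "__main__":
    # Assumed example: a binary fraud classifier returning [p_benign, p_fraud].
    print(probabilities_to_labels([[0.9, 0.1], [0.2, 0.8]], ["benign", "fraud"]))
```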

There is a good example in the wiki. TextAttack requires a numerical value in model_output_classes, and ART will work with either a text label or a numerical label.

Hi there,
I understand that, but is there an example where the target outputs a label (not a probability)? The creditcard example provides output probabilities.
The HopSkipJump should work directly with binary labels and not probabilities.

Set your outputs to [0, 1] where 1 is the positive class.
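
One possible reading of this suggestion, assuming the model behind the target returns only a hard 0/1 label: the target can report each prediction as a one-hot vector, so a positive prediction comes back as [0, 1] and a negative one as [1, 0]. The helper below is a hypothetical sketch of that conversion in plain Python. It does not use any Counterfit or ART API, and the thread does not confirm that this is exactly what was meant.

```python
from typing import List, Sequence


def labels_to_one_hot(labels: Sequence[int], num_classes: int = 2) -> List[List[float]]:
    """Turn hard class labels into one-hot rows shaped like probability vectors.

    Hypothetical helper: label 1 (the positive class) becomes [0.0, 1.0],
    label 0 becomes [1.0, 0.0].
    """
    one_hot = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} is outside 0..{num_classes - 1}")
        row = [0.0] * num_classes
        row[label] = 1.0  # all of the "probability" mass goes to the predicted class
        one_hot.append(row)
    return one_hot


if __name__ == "__main__":
    # Assumed example: a label-only model predicted positive, then negative.
    print(labels_to_one_hot([1, 0]))
```

A decision-based attack such as HopSkipJump only needs the predicted class, so a one-hot row carries the same information as the bare label.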

Will try that, thanks.