Support for non-probability outputs
priamai opened this issue · 5 comments
Hi there,
Very nice project. Is there a plan to also implement attacks that work on label-only output (no probabilities) or in a limited API query setting?
There was an article here a while ago which would be nice to have.
Is this something that should be implemented in the ART component?
Okay, maybe I am not interpreting the documentation right. For example, the HopSkipJump attack works on predicted labels, not probabilities, but in the credit fraud example:
https://github.com/Azure/counterfit/blob/main/demo/WEBINAR-DEMO-2.md
the target is designed with output probabilities.
Would be nice to get some clarity there.
You effectively translate probabilities to labels, depending on what the model gives you back, in outputs_to_labels.
There is a good example here in the wiki. TextAttack requires a numerical value in model_output_classes, and ART will work with either a text label or a numerical label.
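To illustrate the translation step, here is a minimal sketch in plain Python. It is not Counterfit's actual code: the function name `probabilities_to_labels` and its parameters are hypothetical, and it assumes the model returns one row of per-class scores per sample.

```python
from typing import List, Sequence


def probabilities_to_labels(
    probabilities: Sequence[Sequence[float]],
    classes: Sequence,
) -> List:
    """Map each row of per-class scores to the class with the highest score.

    Hypothetical helper for illustration only; not part of Counterfit or ART.
    """
    labels = []
    for row in probabilities:
        # Index of the largest score in this row (argmax).
        best_index = max(range(len(row)), key=lambda i: row[i])
        labels.append(classes[best_index])
    return labels


# Numerical labels, as TextAttack is said to require.
numeric = probabilities_to_labels([[0.9, 0.1], [0.2, 0.8]], classes=[0, 1])

# Text labels work the same way.
textual = probabilities_to_labels([[0.3, 0.7]], classes=["benign", "fraud"])
```

The same mapping works whether the class list holds numbers or strings, which matches the point above about numerical versus text labels.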
Hi there,
That I understand, but is there an example where the target outputs a label (not a probability)? The credit card example provides output probabilities.
The HopSkipJump should work directly with binary labels and not probabilities.
Set your outputs to [0, 1], where 1 is the positive class.
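One possible reading of this suggestion, sketched in plain Python: if the underlying model only returns a hard label, the target's predict step can expand each label into a one-hot row over the classes [0, 1]. All names here (`label_only_model`, `labels_to_one_hot`, `predict`) are hypothetical stand-ins, not Counterfit or ART APIs, and the threshold model is invented purely for the example.

```python
from typing import List, Sequence


def label_only_model(samples: Sequence[Sequence[float]]) -> List[int]:
    """Hypothetical stand-in for a model that returns hard labels only.

    Flags a sample as positive (1) when its first feature exceeds 0.5.
    """
    return [1 if sample[0] > 0.5 else 0 for sample in samples]


def labels_to_one_hot(labels: Sequence[int], classes: Sequence[int] = (0, 1)) -> List[List[float]]:
    """Expand each hard label into a one-hot row ordered like `classes`."""
    return [[1.0 if label == c else 0.0 for c in classes] for label in labels]


def predict(samples: Sequence[Sequence[float]]) -> List[List[float]]:
    """Assumed shape of a label-only target's output: one row per sample."""
    return labels_to_one_hot(label_only_model(samples))


rows = predict([[0.9], [0.1]])
```

With this shape, a decision-based attack only ever sees which class was picked, never a confidence score.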
Will try that, thanks.