Feature Guidance Attack (FGA) for vision-language pretraining (VLP) models. The attack is evaluated on the ALBEF, TCL, CLIP, and BEiT3 models across six tasks: Visual Entailment (VE), Visual Grounding (VG), Visual Reasoning (VR), Visual Question Answering (VQA), Zero-shot Classification (ZC), and Image-Text Retrieval (ITR).
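Since the FGA code is not yet released, here is a minimal NumPy sketch of the general idea behind feature-guided adversarial attacks: perturb the input within an L-infinity budget so that its encoded feature moves away from the clean feature. The `encode` function, its toy linear weights `W`, and all hyperparameters are illustrative assumptions, not the authors' actual method or models.

```python
import numpy as np

def encode(x, W):
    # Toy linear "image encoder" standing in for a VLP image tower
    # (assumption: real FGA uses the model's actual feature extractor).
    f = W @ x
    return f / np.linalg.norm(f)

def fga_sketch(x, W, eps=0.03, alpha=0.01, steps=10):
    """PGD-style sketch: push the adversarial input's feature away from
    the clean feature while staying inside an L-inf eps-ball around x."""
    f_clean = encode(x, W)
    x_adv = x.copy()
    h = 1e-4  # step for numerical gradients (kept simple; real code uses autograd)
    for _ in range(steps):
        grad = np.zeros_like(x)
        for i in range(x.size):
            xp = x_adv.copy(); xp[i] += h
            xm = x_adv.copy(); xm[i] -= h
            d_p = np.linalg.norm(encode(xp, W) - f_clean)
            d_m = np.linalg.norm(encode(xm, W) - f_clean)
            grad[i] = (d_p - d_m) / (2 * h)
        # Ascend the feature distance, then project back into the eps-ball.
        x_adv = x_adv + alpha * np.sign(grad)
        x_adv = np.clip(x_adv, x - eps, x + eps)
    return x_adv
```

In a real VLP setting the same loop would run on image tensors with automatic differentiation, and the guidance signal could come from image or text features depending on the task.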
The code is still being organized. As a single author, preparing the full release takes time; thank you for your patience. If you need the code urgently, please contact me by email and I can share the unorganized source.
The code is mainly based on the following two works:
Co-Attack: https://github.com/adversarial-for-goodness/Co-Attack
Set-level Guidance Attack (SGA): https://github.com/Zoky-2020/SGA
It also builds on these foundational works:
CLIP: https://github.com/openai/CLIP
ALBEF: https://github.com/salesforce/ALBEF
BLIP: https://github.com/salesforce/BLIP
We are very grateful for their open-source contributions, which enabled us to complete our work, FGA.