Custom training scripts for multimodal components: CLIP, ViT, and DETR