tanwanirahul/CLIP_from_scratch
An implementation of OpenAI's CLIP model. It uses a ViT as the image encoder and a BERT-like transformer encoder as the text encoder.
Python · Apache-2.0
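The description names the two encoders but not how their outputs are tied together. CLIP trains them jointly with a symmetric contrastive loss over a batch of image/text pairs. The following is a minimal pure-Python sketch of that loss, not this repository's code: the function names (`l2_normalize`, `cross_entropy_diagonal`, `clip_contrastive_loss`) are hypothetical, the embeddings are assumed to be already produced by the image and text encoders, and the temperature is fixed at 0.07 here although CLIP learns it during training.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_diagonal(logits):
    """Mean cross-entropy where the correct class for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_sum_exp = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_sum_exp - row[i]
    return total / len(logits)


def clip_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric contrastive loss over N paired image/text embeddings.

    image_embs[i] and text_embs[i] are assumed to describe the same pair.
    The fixed temperature is an assumption made for this sketch.
    """
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    # logits[i][j] = cosine similarity of image i and text j, scaled by 1/temperature
    logits = [[sum(a * b for a, b in zip(img, txt)) / temperature for txt in txts]
              for img in imgs]
    # Transpose to get the text-to-image direction
    logits_t = [list(col) for col in zip(*logits)]
    return 0.5 * (cross_entropy_diagonal(logits) + cross_entropy_diagonal(logits_t))
```

With correctly paired embeddings the loss is close to zero, and it grows when the pairs are shuffled, which is the signal that pushes both encoders toward a shared embedding space.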