
Variational CLIP


VCLIP is an extension of OpenAI's CLIP for variational inference. It was fine-tuned on a subset of Conceptual Captions. This repo contains a simple implementation and a link to the pretrained weights. The implementation extends HuggingFace's FlaxCLIPModel.

Pretrained weights (Google Cloud Storage).
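Since the loading code isn't shown here, the snippet below is only a minimal sketch of the starting point VCLIP builds on: it loads the base FlaxCLIPModel and its processor from HuggingFace (standard transformers classes) and extracts text features. How the fine-tuned VCLIP weights from the link above are restored into the extended model is not specified here, so that step is omitted.

```python
# Sketch only: load the base CLIP model that VCLIP extends.
# FlaxCLIPModel / CLIPProcessor are the standard HuggingFace classes;
# restoring the fine-tuned VCLIP weights from the GCS link is not shown,
# since the serialization format isn't specified in this README.
from transformers import CLIPProcessor, FlaxCLIPModel

model = FlaxCLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Tokenize a prompt and compute its CLIP text features (numpy arrays work with Flax).
inputs = processor(text=["a photo of a cat"], return_tensors="np", padding=True)
text_features = model.get_text_features(input_ids=inputs["input_ids"])
```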


VCLIP computes a Gaussian distribution over image-embedding space for each prompt, rather than returning a single point. The similarity score for a (text, image) pair is the Gaussian probability density of the image embedding under that distribution, rather than cosine similarity.
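As a rough illustration of that scoring rule (not code from this repo), here is a minimal sketch: the text tower is assumed to output a mean and log-variance over image-embedding space, and the (text, image) score is the diagonal-Gaussian log-density of the image embedding, shown next to CLIP's cosine similarity for contrast. The names `gaussian_log_density`, `text_mu`, and `text_logvar` are illustrative assumptions, not identifiers from the repo.

```python
import jax.numpy as jnp


def gaussian_log_density(image_embed, text_mu, text_logvar):
    """VCLIP-style score (sketch): log N(image_embed; text_mu, diag(exp(text_logvar)))."""
    var = jnp.exp(text_logvar)
    return -0.5 * jnp.sum(
        text_logvar + jnp.log(2.0 * jnp.pi) + (image_embed - text_mu) ** 2 / var,
        axis=-1,
    )


def clip_similarity(image_embed, text_embed):
    """Standard CLIP score: cosine similarity between unit-normalized embeddings."""
    image_embed = image_embed / jnp.linalg.norm(image_embed, axis=-1, keepdims=True)
    text_embed = text_embed / jnp.linalg.norm(text_embed, axis=-1, keepdims=True)
    return jnp.sum(image_embed * text_embed, axis=-1)
```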

Figure: CLIP and VCLIP compared side by side (left: CLIP, right: VCLIP).