HY-Wong's Stars
NVlabs/ffhq-dataset
Flickr-Faces-HQ Dataset (FFHQ)
FoundationVision/VAR
[NeurIPS 2024 Oral][GPT beats diffusion] [scaling laws in visual generation] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". An *ultra-simple, user-friendly yet state-of-the-art* codebase for autoregressive image generation!
mlfoundations/open_clip
An open source implementation of CLIP.
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
MILVLG/bottom-up-attention.pytorch
A PyTorch reimplementation of bottom-up-attention models
Link-Li/CLMLF
IsaacBravo/streamlit-app
This is an interactive app that allows users to play around with the CLIP model to analyze images
zhutong0219/ITIN
Multimodal Sentiment Analysis with Image-Text Interaction Network