Laoannnnn's Stars
CompVis/stable-diffusion
A latent text-to-image diffusion model
Stability-AI/stablediffusion
High-Resolution Image Synthesis with Latent Diffusion Models
openai/CLIP
CLIP (Contrastive Language-Image Pretraining): predicts the most relevant text snippet given an image
milesial/Pytorch-UNet
PyTorch implementation of the U-Net for image semantic segmentation with high quality images
QwenLM/Qwen-VL
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
THUDM/VisualGLM-6B
Chinese and English multimodal conversational language model
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
KimMeen/Time-LLM
[ICLR 2024] Official implementation of "🦙 Time-LLM: Time Series Forecasting by Reprogramming Large Language Models"
thu-ml/unidiffuser
Code and models for the paper "One Transformer Fits All Distributions in Multi-Modal Diffusion"
DeepGraphLearning/KnowledgeGraphEmbedding
gnobitab/RectifiedFlow
Official implementation of Rectified Flow (ICLR 2023 Spotlight)
baofff/U-ViT
A PyTorch implementation of the paper "All are Worth Words: A ViT Backbone for Diffusion Models".
lukemelas/PyTorch-Pretrained-ViT
Vision Transformer (ViT) in PyTorch
VincentStimper/normalizing-flows
PyTorch implementation of normalizing flow models
wavefrontshaping/complexPyTorch
A high-level toolbox for using complex valued neural networks in PyTorch
SingleZombie/DL-Demos
Demos for deep learning
rentainhe/visualization
A collection of visualization functions
rshaojimmy/MultiModal-DeepFake
[TPAMI 2024 & CVPR 2023] PyTorch code for DGM4: Detecting and Grounding Multi-Modal Media Manipulation and beyond
HaozheLiu-ST/T-GATE
T-GATE: Temporally Gating Attention to Accelerate Diffusion Model for Free!
qitianwu/DIFFormer
The official implementation for ICLR23 spotlight paper "DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion"
sail-sg/CLoT
CVPR'24, Official Codebase of our Paper: "Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation".
kennqiang/MDFEND-Weibo21
Source code and data in paper "MDFEND: Multi-domain Fake News Detection (CIKM'21)"
headacheboy/data-of-multimodal-sarcasm-detection
firojalam/harmful-memes-detection-resources
Resources (conference/journal publications, references to dataset) for harmful memes detection.
less-and-less-bugs/HKEmodel
Official implementation of Towards Multi-Modal Sarcasm Detection via Hierarchical Congruity Modeling with Knowledge Enhancement.
HITSZ-HLT/CMGCN
[ACL 2022] The source code of Multi-Modal Sarcasm Detection via Cross-Modal Graph Convolutional Network
TIAN-viola/DynRT
Official implementation of Dynamic Routing Transformer Network for Multimodal Sarcasm Detection (ACL'23)
LCS2-IIITD/MOMENTA
wangbing1416/HAMI-M3D
Source code of our MM'24 paper "Harmfully Manipulated Images Matter in Multimodal Misinformation Detection"
AI-Machine-Vision-Lab/dpm-solver-ODE-Solver-for-Diffusion
Official code for "DPM-Solver: A Fast ODE Solver for Diffusion Probabilistic Model Sampling in Around 10 Steps" (NeurIPS 2022 Oral)