Pinned Repositories
Kandinsky-2
Kandinsky 2 — multilingual text2image latent diffusion model
InternLM-XComposer
InternLM-XComposer2.5-OmniLive: A Comprehensive Multimodal System for Long-term Streaming Video and Audio Interactions
llm-action
本项目旨在分享大模型相关技术原理以及实战经验(大模型工程化、大模型应用落地)
waifu2x
Image Super-Resolution for Anime-Style Art
PixArt-sigma
PixArt-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation
BLIP
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
LAVIS
LAVIS - A One-stop Library for Language-Vision Intelligence
InstructIR
InstructIR: High-Quality Image Restoration Following Human Instructions https://huggingface.co/spaces/marcosv/InstructIR
GLM-4
GLM-4 series: Open Multilingual Multimodal Chat LMs | 开源多语言多模态对话模型
Caption-Anything
Caption-Anything is a versatile tool combining image segmentation, visual captioning, and ChatGPT, generating tailored captions with diverse controls for user preferences. https://huggingface.co/spaces/TencentARC/Caption-Anything https://huggingface.co/spaces/VIPLab/Caption-Anything
shams2023's Repositories
shams2023/InstructIR
InstructIR: High-Quality Image Restoration Following Human Instructions https://huggingface.co/spaces/marcosv/InstructIR