felixfuu's Stars
hiyouga/LLaMA-Factory
Unified Efficient Fine-Tuning of 100+ LLMs (ACL 2024)
rasbt/LLMs-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
facebookresearch/segment-anything-2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
lllyasviel/Omost
Your image is almost there!
luosiallen/latent-consistency-model
Latent Consistency Models: Synthesizing High-Resolution Images with Few-Step Inference
LLaVA-VL/LLaVA-NeXT
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
aigc-apps/EasyAnimate
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
dvlab-research/ControlNeXt
Controllable video and image generation: SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
omerbt/MultiDiffusion
Official PyTorch implementation of "MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation" (ICML 2023)
Zheng-Chong/CatVTON
CatVTON is a simple and efficient virtual try-on diffusion model with 1) a lightweight network (899.06M parameters in total), 2) parameter-efficient training (49.57M trainable parameters), and 3) simplified inference (< 8 GB VRAM at 1024×768 resolution).
HarborYuan/ovsam
[ECCV 2024] Official code for the paper "Open-Vocabulary SAM".
OpenGVLab/VisionLLM
VisionLLM Series
megvii-research/megactor
mlfoundations/MINT-1T
MINT-1T: A one trillion token multimodal interleaved dataset.
mit-han-lab/fastcomposer
[IJCV] FastComposer: Tuning-Free Multi-Subject Image Generation with Localized Attention
LeapLabTHU/Agent-Attention
Official repository of Agent Attention (ECCV 2024)
IDEA-Research/X-Pose
[ECCV 2024] Official implementation of the paper "X-Pose: Detecting Any Keypoints"
AIGText/Glyph-ByT5
[ECCV 2024] Official inference code for the papers "Glyph-ByT5: A Customized Text Encoder for Accurate Visual Text Rendering" and "Glyph-ByT5-v2: A Strong Aesthetic Baseline for Accurate Multilingual Visual Text Rendering"
Kobaayyy/Awesome-CVPR2024-ECCV2024-AIGC
A Collection of Papers and Codes for CVPR2024/ECCV2024 AIGC
OPPO-Mente-Lab/Subject-Diffusion
Subject-Diffusion: Open-Domain Personalized Text-to-Image Generation without Test-time Fine-tuning
aim-uofa/MovieDreamer
zamling/PSALM
[ECCV 2024] Official implementation of "PSALM: Pixelwise SegmentAtion with Large Multi-Modal Model"
baaivision/DenseFusion
DenseFusion-1M: Merging Vision Experts for Comprehensive Multimodal Perception
TIGER-AI-Lab/UniIR
Official code for paper "UniIR: Training and Benchmarking Universal Multimodal Information Retrievers" (ECCV 2024)
callsys/ControlCap
[ECCV 2024] ControlCap: Controllable Region-level Captioning
lorebianchi98/FG-OVD
[CVPR2024 Highlight] Official repository of the paper "The devil is in the fine-grained details: Evaluating open-vocabulary object detectors for fine-grained understanding."
pasqualedem/LabelAnything
Multi-Class Few-Shot Semantic Segmentation with Visual Prompts
byeongjun-park/Switch-DiT
[ECCV 2024] Official PyTorch implementation of "Switch Diffusion Transformer: Synergizing Denoising Tasks with Sparse Mixture-of-Experts"