yakunpku's Stars
jimmycv07/DiffIR2VR-Zero
nianticlabs/mickey
[CVPR 2024 - Oral] Matching 2D Images in 3D: Metric Relative Pose from Metric Correspondences
Bing-su/adetailer
Automatic detection, masking, and inpainting with a detection model.
CASIA-IVA-Lab/FastSAM
Fast Segment Anything
Nightmare-n/UniPAD
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving (CVPR 2024)
SalesforceAIResearch/DiffusionDPO
Code for "Diffusion Model Alignment Using Direct Preference Optimization"
large-ocr-model/large-ocr-model.github.io
lyuwenyu/RT-DETR
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
sanweiliti/RoHM
The official PyTorch code for RoHM: Robust Human Motion Reconstruction via Diffusion.
ggerganov/llama.cpp
LLM inference in C/C++
KwaiVGI/I2V-Adapter
I2V-Adapter: A General Image-to-Video Adapter for Diffusion Models
hellock/icrawler
A multi-thread crawler framework with many builtin image crawlers provided.
Imageomics/bioclip
This is the repository for the BioCLIP model and the TreeOfLife-10M dataset [CVPR'24 Oral, Best Student Paper].
facebookresearch/PlatoNeRF
PlatoNeRF: 3D Reconstruction in Plato's Cave via Single-View Two-Bounce Lidar
hiDaDeng/cntext
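A text analysis package supporting multiple text analysis methods, including word counting, readability, document similarity, and sentiment analysis. Chinese text sentiment analysis.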
ICTMCG/Make-Your-Anchor
[CVPR 2024] Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework.
fudan-generative-vision/champ
Champ: Controllable and Consistent Human Image Animation with 3D Parametric Guidance
PRIS-CV/DemoFusion
Let us democratise high-resolution generation! (CVPR 2024)
TMElyralab/MuseV
MuseV: Infinite-length and High Fidelity Virtual Human Video Generation with Visual Conditioned Parallel Denoising
NVlabs/instant-ngp
Instant neural graphics primitives: lightning fast NeRF and more
fudan-generative-vision/hallo
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
whai362/PVT
Official implementation of PVT series
FoundationVision/GLEE
[CVPR 2024 Highlight] GLEE: General Object Foundation Model for Images and Videos at Scale
PeizeSun/SparseR-CNN
[CVPR2021, PAMI2023] End-to-End Object Detection with Learnable Proposal
ShoufaChen/DiffusionDet
[ICCV2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
FoundationVision/LlamaGen
Autoregressive Model Beats Diffusion: 🦙 Llama for Scalable Image Generation
tencent-ailab/IP-Adapter
The image prompt adapter is designed to enable a pretrained text-to-image diffusion model to generate images with an image prompt.
damian0815/compel
A prompting enhancement library for transformers-type text embedding systems
seatgeek/thefuzz
Fuzzy String Matching in Python
OpenBMB/MiniCPM-V
MiniCPM-V 2.6: A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone