byby666

byby666's Stars

karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language:Python37k5.9k
Kedreamix/Linly-Talker
Digital Avatar Conversational System - Linly-Talker. 😄✨ Linly-Talker is an intelligent AI system that combines large language models (LLMs) with visual models to create a novel human-AI interaction method. 🤝🤖 It integrates various technologies like Whisper, Linly, Microsoft Speech Services, and SadTalker talking head generation system. 🌟🔬
Language:Python2k319
lllyasviel/Omost
Your image is almost there!
Language:Python7.3k418
TinyLLaVA/TinyLLaVA_Factory
A Framework of Small-scale Large Multimodal Models
Language:Python61859
dvlab-research/MGM
Official repo for "Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models"
Language:Python3.2k277
mhamilton723/FeatUp
Official code for "FeatUp: A Model-Agnostic Frameworkfor Features at Any Resolution" ICLR 2024
Language:Jupyter Notebook1.4k78
wysoczanska/clip_dinoiser
Official implementation of 'CLIP-DINOiser: Teaching CLIP a few DINO tricks' paper.
Language:Jupyter Notebook2049
prs-eth/Marigold
[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation
Language:Python2.3k130
vvictoryuki/AnimateZero
Official PyTorch implementation for the paper "AnimateZero: Video Diffusion Models are Zero-Shot Image Animators"
3507
ChenyangSi/FreeU
FreeU: Free Lunch in Diffusion U-Net (CVPR2024 Oral)
1.7k73
harlanhong/ICCV2023-MCNET
The official code of our ICCV2023 work: Implicit Identity Representation Conditioned Memory Compensation Network for Talking Head video Generation
Language:Python24622
paulpanwang/POPE
Welcome to the project repository for POPE (Promptable Pose Estimation), a state-of-the-art technique for 6-DoF pose estimation of any object in any scene using a single reference.
Language:Python13911
THUDM/CogVLM
a state-of-the-art-level open visual language model | 多模态预训练模型
Language:Python6k413
bcmi/DCI-VTON-Virtual-Try-On
[ACM Multimedia 2023] Taming the Power of Diffusion Models for High-Quality Virtual Try-On with Appearance Flow.
Language:Python40461
cure-lab/PnPInversion
[ICLR2024] Official repo for paper "PnP Inversion: Boosting Diffusion-based Editing with 3 Lines of Code"
Language:Jupyter Notebook2488
sled-group/CycleNet
Official Code for NeurIPS 2023 Paper: CycleNet: Rethinking Cycle Consistent in Text‑Guided Diffusion for Image Manipulation
Language:Python756
caiyuanhao1998/Retinexformer
"Retinexformer: One-stage Retinex-based Transformer for Low-light Image Enhancement" (ICCV 2023) & (NTIRE 2024 Challenge)
Language:Python88074
tjiiv-cprg/EPro-PnP
[CVPR 2022 Oral, Best Student Paper] EPro-PnP: Generalized End-to-End Probabilistic Perspective-n-Points for Monocular Object Pose Estimation
Language:Python1.1k106
shangbuhuan13/SO-Pose
This repository contains codes of ICCV2021 paper: SO-Pose: Exploiting Self-Occlusion for Direct 6D Pose Estimation
Language:Python6710
Zheng-Chong/FashionMatrix
Fashion Matrix is dedicated to bridging various visual and language models and continuously refining its capabilities as a comprehensive fashion AI assistant. This project will continue to update new features and optimization effects.
Language:Jupyter Notebook1218
cuiziteng/Illumination-Adaptive-Transformer
🌕 [BMVC 2022] You Only Need 90K Parameters to Adapt Light: A Light Weight Transformer for Image Enhancement and Exposure Correction. SOTA for low light enhancement, 0.004 seconds try this for pre-processing.
Language:Python46843
JianghaiSCU/R2RNet
Official code of "R2RNet: Low-light Image Enhancement Via Real-low to Real-normal Network".
Language:Python15017
qiuyu96/CoDeF
[CVPR 2024 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
Language:Python4.8k388
SizheAn/PanoHead
Code Repository for CVPR 2023 Paper "PanoHead: Geometry-Aware 3D Full-Head Synthesis in 360 degree"
Language:Python1.9k237
williamyang1991/StyleGANEX
[ICCV 2023] StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces
Language:Jupyter Notebook50536
jianzongwu/Awesome-Open-Vocabulary
(TPAMI 2024) A Survey on Open Vocabulary Learning
82648
baaivision/Emu
Emu Series: Generative Multimodal Models from BAAI
Language:Python1.6k86
Meta-Portrait/MetaPortrait
[CVPR 2023] MetaPortrait: Identity-Preserving Talking Head Generation with Fast Personalized Adaptation
Language:Python53541
facebookresearch/ijepa
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive architecture."
Language:Python2.8k356
DAMO-NLP-SG/Video-LLaMA
[EMNLP 2023 Demo] Video-LLaMA: An Instruction-tuned Audio-Visual Language Model for Video Understanding
Language:Python2.8k255