XiaoyuShi97's Stars
RVC-Boss/GPT-SoVITS
1 min voice data can also be used to train a good TTS model! (few shot voice cloning)
hpcaitech/Open-Sora
Open-Sora: Democratizing Efficient Video Production for All
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
ChaoningZhang/MobileSAM
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
deepseek-ai/DeepSeek-V2
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
praydog/UEVR
Universal Unreal Engine VR Mod (4.8 - 5.4)
dvlab-research/ControlNeXt
Controllable video and image Generation, SVD, Animate Anyone, ControlNet, ControlNeXt, LoRA
TencentARC/MotionCtrl
Official Code for MotionCtrl [SIGGRAPH 2024]
THUDM/ImageReward
[NeurIPS 2023] ImageReward: Learning and Evaluating Human Preferences for Text-to-image Generation
Junyi42/monst3r
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
henry123-boy/SpaTracker
[CVPR 2024 Highlight] Official PyTorch implementation of SpatialTracker: Tracking Any 2D Pixels in 3D Space
pixeli99/SVD_Xtend
Stable Video Diffusion Training Code and Extensions.
OpenGVLab/DCNv4
[CVPR 2024] Deformable Convolution v4
maitrix-org/Pandora
Pandora: Towards General World Model with Natural Language Actions and Video States
hehao13/CameraCtrl
segmind/segmoe
tgxs002/HPSv2
Human Preference Score v2: A Solid Benchmark for Evaluating Human Preferences of Text-to-Image Synthesis
bytedance/particle-sfm
ParticleSfM: Exploiting Dense Point Trajectories for Localizing Moving Cameras in the Wild. ECCV 2022.
mbzuai-oryx/VideoGPT-plus
Official Repository of paper VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding
lixiaoyu2000/Poly-MOT
Official Repo For IROS 2023 Accepted Paper "Poly-MOT"
LeonHLJ/FouriScale
Official implementation of FouriScale (ECCV2024)
CaraJ7/CoMat
[NeurIPS 2024] 💫 CoMat: Aligning Text-to-Image Diffusion Model with Image-to-Text Concept Matching
G-U-N/Motion-I2V
[SIGGRAPH 2024] Motion I2V: Consistent and Controllable Image-to-Video Generation with Explicit Motion Modeling
rongyaofang/PUMA
Empowering Unified MLLM with Multi-granular Visual Generation
wwsource/SceneTracker
SceneTracker: Long-term Scene Flow Estimation Network
jianghd1996/Camera-control
This project explores the opportunities of deep learning for camera control in virtual cinematography.
Mawiszus/World-GAN
Official repository for "World-GAN: a Generative Model for Minecraft Worlds" by Maren Awiszus, Frederik Schubert and Bodo Rosenhahn.
OmicsML/CellPLM
Official repo for CellPLM: Pre-training of Cell Language Model Beyond Single Cells.
wwsource/SplatFlow
[IJCV 2024] SplatFlow: Learning Multi-frame Optical Flow via Splatting
wwsource/SplatFlow3D