wkbian's Stars
yt-dlp/yt-dlp
A feature-rich command-line audio/video downloader
graphdeco-inria/gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
PKU-YuanGroup/Open-Sora-Plan
This project aim to reproduce Sora (Open AI T2V model), we wish the open source community contribute to this project.
THUDM/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
nerfstudio-project/nerfstudio
A collaboration friendly studio for NeRFs
open-mmlab/mmsegmentation
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
gaomingqi/Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
naver/dust3r
DUSt3R: Geometric 3D Vision Made Easy
DepthAnything/Depth-Anything-V2
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
open-mmlab/mmtracking
OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
Doubiiu/DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
nerfstudio-project/gsplat
CUDA accelerated rasterization of gaussian splatting
isl-org/ZoeDepth
Metric depth estimation from a single image
qianqianwang68/omnimotion
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
mhamilton723/FeatUp
Official code for "FeatUp: A Model-Agnostic Frameworkfor Features at Any Resolution" ICLR 2024
google-deepmind/tapnet
Tracking Any Point (TAP)
Tencent/DepthCrafter
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Junyi42/monst3r
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
vye16/shape-of-motion
henry123-boy/SpaTracker
[CVPR 2024 Highlight] Official PyTorch implementation of SpatialTracker: Tracking Any 2D Pixels in 3D Space
facebookresearch/projectaria_tools
projectaria_tools is an C++/Python open-source toolkit to interact with Project Aria data
iejMac/video2dataset
Easily create large video dataset from video urls
aharley/pips2
PIPs++
y-zheng18/point_odyssey
Official code for PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking (ICCV 2023)
serycjon/MFT
MFT: Long-Term Tracking of Every Pixel -- code for the WACV 2024 paper