wkbian

wkbian's Stars

yt-dlp/yt-dlp
A feature-rich command-line audio/video downloader
Language:Python105k 574 8.9k8.2k
graphdeco-inria/gaussian-splatting
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
Language:Python16.1k 123 1.1k2.2k
facebookresearch/sam2
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Language:Jupyter Notebook14.6k 82 4811.5k
PKU-YuanGroup/Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to this project.
Language:Python11.9k 155 3731.1k
THUDM/CogVideo
text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
Language:Python11k 134 5751.1k
facebookresearch/dinov2
PyTorch code and models for the DINOv2 self-supervised learning method.
Language:Jupyter Notebook10k 96 423903
nerfstudio-project/nerfstudio
A collaboration-friendly studio for NeRFs
Language:Python10k 117 1.7k1.4k
open-mmlab/mmsegmentation
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
Language:Python8.7k 52 2.4k2.7k
rany2/edge-tts
Use Microsoft Edge's online text-to-speech service from Python WITHOUT needing Microsoft Edge or Windows or an API key
Language:Python7.7k 56 252737
LiheYoung/Depth-Anything
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Language:Python7.4k 51 225562
gaomingqi/Track-Anything
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
Language:Python6.7k 61 141489
naver/dust3r
DUSt3R: Geometric 3D Vision Made Easy
Language:Python6k 53 181643
DepthAnything/Depth-Anything-V2
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Language:Python4.9k 40 235438
open-mmlab/mmtracking
OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
Language:Python3.7k 48 466599
Doubiiu/DynamiCrafter
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
Language:Python2.8k 33 148227
nerfstudio-project/gsplat
CUDA accelerated rasterization of gaussian splatting
Language:Python2.7k 51 318375
isl-org/ZoeDepth
Metric depth estimation from a single image
Language:Jupyter Notebook2.5k 36 120230
qianqianwang68/omnimotion
Language:Python2.2k 121 56124
Alpha-VLLM/Lumina-T2X
Lumina-T2X is a unified framework for Text to Any Modality Generation
Language:Python2.2k 32 9091
mhamilton723/FeatUp
Official code for "FeatUp: A Model-Agnostic Framework for Features at Any Resolution" ICLR 2024
Language:Jupyter Notebook1.5k 16 7784
google-deepmind/tapnet
Tracking Any Point (TAP)
Language:Jupyter Notebook1.4k 31 130138
Tencent/DepthCrafter
DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos
Language:Python1.2k 52 5066
Junyi42/monst3r
Official Implementation of paper "MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion"
Language:Python1.1k 36 6660
vye16/shape-of-motion
Language:Python937 16 6668
henry123-boy/SpaTracker
[CVPR 2024 Highlight] Official PyTorch implementation of SpatialTracker: Tracking Any 2D Pixels in 3D Space
Language:Python813 59 4432
facebookresearch/projectaria_tools
projectaria_tools is a C++/Python open-source toolkit to interact with Project Aria data
Language:C++586 44 11982
iejMac/video2dataset
Easily create large video datasets from video URLs
Language:Python585 9 15670
aharley/pips2
PIPs++
Language:Python303 4 2736
y-zheng18/point_odyssey
Official code for PointOdyssey: A Large-Scale Synthetic Dataset for Long-Term Point Tracking (ICCV 2023)
Language:Python140 8 227
serycjon/MFT
MFT: Long-Term Tracking of Every Pixel -- code for the WACV 2024 paper
Language:Python56 3 47