4mm7

4mm7's Stars

myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
Language: Python 30.3k 222 2603k
colmap/colmap
COLMAP - Structure-from-Motion and Multi-View Stereo
Language: C++ 8k 174 2.1k1.6k
showlab/Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
3.7k 141 28211
MLNLP-World/Paper-Writing-Tips
A repository curated by the MLNLP community to help everyone avoid small mistakes in paper submissions. Paper Writing Tips
3.7k 44 5474
315386775/DeepLearing-Interview-Awesome-2024
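A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, also covering new ideas, new problems, new resources, and new projects encountered in work and research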
1.9k 29 1182
VainF/Awesome-Anything
General AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, AnyX
1.7k 54 196
janosh/awesome-normalizing-flows
Awesome resources on normalizing flows.
Language: Python 1.5k 45 13127
zalandoresearch/pytorch-ts
PyTorch based Probabilistic Time Series forecasting framework based on GluonTS backend
Language: Python 1.3k 26 142197
jfzhang95/pytorch-video-recognition
PyTorch implemented C3D, R3D, R2Plus1D models for video activity recognition.
Language: Python 1.2k 17 75255
stereolabs/zed-sdk
⚡️The spatial perception framework for rapidly building smart robots and spaces
Language: C++ 837 25 622470
pixeli99/SVD_Xtend
Stable Video Diffusion Training Code and Extensions.
Language: Python 641 13 6365
Tsingularity/dift
[NeurIPS'23] Emergent Correspondence from Image Diffusion
Language: Python 638 8 2835
Mikoto10032/AutomaticWeightedLoss
Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, Auxiliary Tasks in Multi-task Learning
Language: Python 593 5 1883
rese1f/MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Language: Python 563 12 8442
ubicomplab/rPPG-Toolbox
rPPG-Toolbox: Deep Remote PPG Toolbox (NeurIPS 2023)
Language: Python 538 13 174137
OpenRobotLab/EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Language: Python 520 7 7338
Sid2697/awesome-egocentric-vision
A curated list of egocentric (first-person) vision and related area resources
272 9 230
genforce/ctrl-x
Official implementation of "Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance" (NeurIPS 2024)
Language: Python 271 22 79
GitGyun/visual_token_matching
[ICLR'23 Oral] Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
Language: Python 253 7 1713
rese1f/Awesome-VQVAE
A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application
241 11 28
xliucs/MTTS-CAN
Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement (NeurIPS 2020)
Language: Python 175 8 3557
ZitongYu/PhysNet
Code for the BMVC 2019 paper 'Photoplethysmograph Signal Measurement from Facial Videos Using Spatio-Temporal Networks'
Language: Python 80 1 514
carl-vbn/minecraft-voxel-loader
A Fabric Mod and a set of Python scripts to load and play 3D animations inside Minecraft
Language: Java 46 3 227
EgoAlpha/Awesome-Egocentric
44 5 08
PKU-RL/Creative-Agents
Language: Python 43 1 02
togheppi/DualGAN
PyTorch implementation of DualGAN
Language: Python 23 1 26
Wayne-Mai/EgoLoc
For Ego4D VQ3D Task
Language: Python 19 1 42
rese1f/UniAP
[AAAI 2024] UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning
Language: Python 11 4 03
shu-le/Notes
5 1 00
rese1f/old_web
Personal website built on Beautiful Jekyll; feel free to clone and modify
Language: HTML 3 1 06