4mm7's Stars
myshell-ai/OpenVoice
Instant voice cloning by MIT and MyShell.
colmap/colmap
COLMAP - Structure-from-Motion and Multi-View Stereo
showlab/Awesome-Video-Diffusion
A curated list of recent diffusion models for video generation, editing, restoration, understanding, etc.
MLNLP-World/Paper-Writing-Tips
A repository compiled by the MLNLP community to help everyone avoid small mistakes in paper submissions. Paper Writing Tips
315386775/DeepLearing-Interview-Awesome-2024
A collection of AIGC-interview/CV-interview/LLMs-interview questions and answers, also covering new ideas, new questions, new resources, and new projects from work and research
VainF/Awesome-Anything
General AI methods for Anything: AnyObject, AnyGeneration, AnyModel, AnyTask, AnyX
janosh/awesome-normalizing-flows
Awesome resources on normalizing flows.
zalandoresearch/pytorch-ts
PyTorch-based probabilistic time series forecasting framework built on a GluonTS backend
jfzhang95/pytorch-video-recognition
PyTorch implemented C3D, R3D, R2Plus1D models for video activity recognition.
stereolabs/zed-sdk
⚡️The spatial perception framework for rapidly building smart robots and spaces
pixeli99/SVD_Xtend
Stable Video Diffusion Training Code and Extensions.
Tsingularity/dift
[NeurIPS'23] Emergent Correspondence from Image Diffusion
Mikoto10032/AutomaticWeightedLoss
Multi-task learning using uncertainty to weigh losses for scene geometry and semantics, Auxiliary Tasks in Multi-task Learning
rese1f/MovieChat
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
ubicomplab/rPPG-Toolbox
rPPG-Toolbox: Deep Remote PPG Toolbox (NeurIPS 2023)
OpenRobotLab/EmbodiedScan
[CVPR 2024 & NeurIPS 2024] EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI
Sid2697/awesome-egocentric-vision
A curated list of egocentric (first-person) vision and related area resources
genforce/ctrl-x
Official implementation of "Ctrl-X: Controlling Structure and Appearance for Text-To-Image Generation Without Guidance" (NeurIPS 2024)
GitGyun/visual_token_matching
[ICLR'23 Oral] Universal Few-shot Learning of Dense Prediction Tasks with Visual Token Matching
rese1f/Awesome-VQVAE
A collection of resources and papers on Vector Quantized Variational Autoencoder (VQ-VAE) and its application
xliucs/MTTS-CAN
Multi-Task Temporal Shift Attention Networks for On-Device Contactless Vitals Measurement (NeurIPS 2020)
ZitongYu/PhysNet
Code for the BMVC 2019 paper 'Photoplethysmograph Signal Measurement from Facial Videos Using Spatio-Temporal Networks'
carl-vbn/minecraft-voxel-loader
A Fabric Mod and a set of Python scripts to load and play 3D animations inside Minecraft
EgoAlpha/Awesome-Egocentric
PKU-RL/Creative-Agents
togheppi/DualGAN
PyTorch implementation of DualGAN
Wayne-Mai/EgoLoc
For Ego4D VQ3D Task
rese1f/UniAP
[AAAI 2024] UniAP: Towards Universal Animal Perception in Vision via Few-shot Learning
shu-le/Notes
rese1f/old_web
Personal website built on Beautiful Jekyll; feel free to clone and modify