yztongzhan's Stars
facebookresearch/segment-anything
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
XingangPan/DragGAN
Official Code for DragGAN (SIGGRAPH 2023)
meta-llama/codellama
Inference code for CodeLlama models
NVIDIA/open-gpu-kernel-modules
NVIDIA Linux open GPU kernel module source
qiuyu96/CoDeF
[CVPR 2024 Highlight] Official PyTorch implementation of CoDeF: Content Deformation Fields for Temporally Consistent Video Processing
OpenGVLab/InternGPT
InternGPT (iGPT) is an open-source demo platform where you can easily showcase your AI models. It now supports DragGAN, ChatGPT, ImageBind, multimodal chat like GPT-4, SAM, interactive image editing, etc. Try it at igpt.opengvlab.com (an online demo system supporting DragGAN, ChatGPT, ImageBind, and SAM)
TigerResearch/TigerBot
TigerBot: A multi-language multi-task LLM
ShoufaChen/DiffusionDet
[ICCV 2023 Best Paper Finalist] PyTorch implementation of DiffusionDet (https://arxiv.org/abs/2211.09788)
fenglinglwb/MAT
MAT: Mask-Aware Transformer for Large Hole Image Inpainting
OpenGVLab/VideoMAEv2
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
tinatiansjz/hmr-survey
[TPAMI 2023] Recovering 3D Human Mesh from Monocular Images: A Survey
ShoufaChen/AdaptFormer
[NeurIPS 2022] Implementation of "AdaptFormer: Adapting Vision Transformers for Scalable Visual Recognition"
MCG-NJU/SparseBEV
[ICCV 2023] SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos
chaytonmin/Occupancy-MAE
Official implementation of our TIV'23 paper: Occupancy-MAE: Self-supervised Pre-training Large-scale LiDAR Point Clouds with Masked Occupancy Autoencoders
implus/UM-MAE
Official code for "Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality"
fenglinglwb/EDT
On Efficient Transformer-Based Image Pre-training for Low-Level Vision
MCG-NJU/SportsMOT
[ICCV 2023] SportsMOT: A Large Multi-Object Tracking Dataset in Multiple Sports Scenes
AILab-CVC/GroupMixFormer
GroupMixAttention and GroupMixFormer
zhaoyue-zephyrus/AVION
Code release for "Training a Large Video Model on a Single Machine in a Day"
zhaoyue-zephyrus/TeSTra
Code for ECCV 2022 "Real-time Online Video Detection with Temporal Smoothing Transformers"
ChongjianGE/MetaBEV
MetaBEV: Solving Sensor Failures for BEV Detection and Map Segmentation
showlab/sparseformer
(ICLR 2024, CVPR 2024) SparseFormer
MCG-NJU/DDM
[CVPR 2022] Progressive Attention on Multi-Level Dense Difference Maps for Generic Event Boundary Detection
MCG-NJU/STMixer
[CVPR 2023] STMixer: A One-Stage Sparse Action Detector
MCG-NJU/VideoMAE-Action-Detection
[NeurIPS 2022 Spotlight] VideoMAE for Action Detection
MCG-NJU/EVAD
[ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement
leexinhao/ZeroI2V
Official implementation of "ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video"
ChongjianGE/SNCLR
[ICLR 2023] Soft Neighbors are Positive Supporters in Contrastive Visual Representation Learning
sebgao/chatgpt_mini_helper
My customized GPT 3.5 helper
yztongzhan/VideoMAE
VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training