Pinned Repositories
AdaMixer
[CVPR 2022 Oral] AdaMixer: A Fast-Converging Query-Based Object Detector
CamLiFlow
[CVPR 2022 Oral & TPAMI 2023] Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion
EMA-VFI
[CVPR 2023] Extracting Motion and Appearance via Inter-Frame Attention for Efficient Video Frame Interpolation
MixFormer
[CVPR 2022 Oral & TPAMI 2024] MixFormer: End-to-End Tracking with Iterative Mixed Attention
MixFormerV2
[NeurIPS 2023] MixFormerV2: Efficient Fully Transformer Tracking
MOC-Detector
[ECCV 2020] Actions as Moving Points
SparseBEV
[ICCV 2023] SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos
SparseOcc
[ECCV 2024] Fully Sparse 3D Occupancy Prediction & RayIoU Evaluation Metric
TDN
[CVPR 2021] TDN: Temporal Difference Networks for Efficient Action Recognition
VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Repositories of the Multimedia Computing Group, Nanjing University
MCG-NJU/VideoMAE
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
MCG-NJU/MixFormer
[CVPR 2022 Oral & TPAMI 2024] MixFormer: End-to-End Tracking with Iterative Mixed Attention
MCG-NJU/SparseBEV
[ICCV 2023] SparseBEV: High-Performance Sparse 3D Object Detection from Multi-Camera Videos
MCG-NJU/SparseOcc
[ECCV 2024] Fully Sparse 3D Occupancy Prediction & RayIoU Evaluation Metric
MCG-NJU/CamLiFlow
[CVPR 2022 Oral & TPAMI 2023] Learning Optical Flow and Scene Flow with Bidirectional Camera-LiDAR Fusion
MCG-NJU/MeMOTR
[ICCV 2023] MeMOTR: Long-Term Memory-Augmented Transformer for Multi-Object Tracking
MCG-NJU/MixFormerV2
[NeurIPS 2023] MixFormerV2: Efficient Fully Transformer Tracking
MCG-NJU/MOTIP
Multiple Object Tracking as ID Prediction
MCG-NJU/LinK
[CVPR 2023] LinK: Linear Kernel for LiDAR-based 3D Perception
MCG-NJU/AWT
[NeurIPS 2024] AWT: Transferring Vision-Language Models via Augmentation, Weighting, and Transportation
MCG-NJU/SGM-VFI
[CVPR 2024] Sparse Global Matching for Video Frame Interpolation with Large Motion
MCG-NJU/BIVDiff
[CVPR 2024] BIVDiff: A Training-free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models
MCG-NJU/VFIMamba
[NeurIPS 2024] VFIMamba: Video Frame Interpolation with State Space Models
MCG-NJU/PointTAD
[NeurIPS 2022] PointTAD: Multi-Label Temporal Action Detection with Learnable Query Points
MCG-NJU/CoMAE
[AAAI 2023 Oral] CoMAE: Single Model Hybrid Pre-training on Small-Scale RGB-D Datasets
MCG-NJU/DEQDet
[ICCV 2023] Deep Equilibrium Object Detection
MCG-NJU/MGMAE
[ICCV 2023] MGMAE: Motion Guided Masking for Video Masked Autoencoding
MCG-NJU/Dynamic-MDETR
[TPAMI 2024] Dynamic MDETR: A Dynamic Multimodal Transformer Decoder for Visual Grounding
MCG-NJU/SPLAM
[ECCV 2024 Oral] SPLAM: Accelerating Image Generation with Sub-path Linear Approximation Model
MCG-NJU/ZeroI2V
[ECCV 2024] ZeroI2V: Zero-Cost Adaptation of Pre-trained Transformers from Image to Video
MCG-NJU/AMD
[CVPR 2024] Asymmetric Masked Distillation for Pre-Training Small Foundation Models
MCG-NJU/SportsHHI
[CVPR 2024] SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos
MCG-NJU/VLG
VLG: General Video Recognition with Web Textual Knowledge (https://arxiv.org/abs/2212.01638)
MCG-NJU/StageInteractor
[ICCV 2023] StageInteractor: Query-based Object Detector with Cross-stage Interaction
MCG-NJU/ProVP
[IJCV] Progressive Visual Prompt Learning with Contrastive Feature Re-formation
MCG-NJU/ViT-TAD
[CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
MCG-NJU/DGN
[IJCV 2023] Dual Graph Networks for Pose Estimation in Crowded Scenes
MCG-NJU/PRVG
[CVIU 2024] End-to-end dense video grounding via parallel regression
MCG-NJU/VideoEval
VideoEval: Comprehensive Benchmark Suite for Low-Cost Evaluation of Video Foundation Models
MCG-NJU/LogN
[IJCV 2024] Logit Normalization for Long-Tail Object Detection