Pinned Repositories
ACmix
Official repository of ACmix (CVPR2022)
Agent-Attention
Official repository of Agent Attention (ECCV2024)
DAT
Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
EfficientTrain
1.5–3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.
ExpeL
FLatten-Transformer
Official repository of FLatten Transformer (ICCV2023)
GSVA
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
MLLA
Official repository of MLLA (NeurIPS 2024)
Pseudo-Q
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
Slide-Transformer
Official repository of Slide-Transformer (CVPR2023)
LeapLabTHU's Repositories
LeapLabTHU/Agent-Attention
Official repository of Agent Attention (ECCV2024)
LeapLabTHU/FLatten-Transformer
Official repository of FLatten Transformer (ICCV2023)
LeapLabTHU/MLLA
Official repository of MLLA (NeurIPS 2024)
LeapLabTHU/UltraBot
[Nature Communications 2025] Towards Expert-level Autonomous Carotid Ultrasonography with Large-scale Learning-based Robotic System
LeapLabTHU/EfficientTrain
1.5–3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.
LeapLabTHU/Slide-Transformer
Official repository of Slide-Transformer (CVPR2023)
LeapLabTHU/ExpeL
LeapLabTHU/Pseudo-Q
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
LeapLabTHU/GSVA
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
LeapLabTHU/ARC
[ICCV 2023] Adaptive Rotated Convolution for Rotated Object Detection
LeapLabTHU/AdaFocusV2
LeapLabTHU/ProCo
[TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition
LeapLabTHU/Segment3D
LeapLabTHU/LAUDNet
[IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition
LeapLabTHU/Attention-Mediators
[ECCV 2024] Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
LeapLabTHU/InLine
Official repository of InLine attention (NeurIPS 2024)
LeapLabTHU/ImprovedNAT
A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis"
LeapLabTHU/Uni-AdaFocus
Official repository of Uni-AdaFocus (TPAMI 2024).
LeapLabTHU/AdaNAT
[ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
LeapLabTHU/OVM3D-Det
LeapLabTHU/SimPro
[ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
LeapLabTHU/ENAT
[NeurIPS 2024] ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
LeapLabTHU/DAT-Detection
Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
LeapLabTHU/GridMix
Repository of GridMix (ICLR 2025)
LeapLabTHU/UniTTA
LeapLabTHU/diver-ct
LeapLabTHU/CODA
CODA: Repurposing Continuous VAEs for Discrete Tokenization
LeapLabTHU/CheXWorld
[CVPR 2025] CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning
LeapLabTHU/EchoWorld
[CVPR 2025] EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance
LeapLabTHU/DeeR-VLA
Fork of the official code for the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"