LeapLabTHU

Pinned Repositories

ACmix
Official repository of ACmix (CVPR2022)
Language: Python 412 5 2944
Agent-Attention
Official repository of Agent Attention (ECCV2024)
Language: Python 640 4 5244
DAT
Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Language: Python 899 12 3882
EfficientTrain
1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.
Language: Python 222 5 119
ExpeL
Language: Python 163 5 617
FLatten-Transformer
Official repository of FLatten Transformer (ICCV2023)
Language: Python 422 3 3324
GSVA
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
Language: Python 130 7 150
MLLA
Official repository of MLLA (NeurIPS 2024)
Language: Python 305 3 3415
Pseudo-Q
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
Language: Python 148 3 2210
Slide-Transformer
Official repository of Slide-Transformer (CVPR2023)
Language: Python 172 12 107

LeapLabTHU's Repositories

LeapLabTHU/Agent-Attention
Official repository of Agent Attention (ECCV2024)
Language: Python 640 4 5244
LeapLabTHU/FLatten-Transformer
Official repository of FLatten Transformer (ICCV2023)
Language: Python 422 3 3324
LeapLabTHU/MLLA
Official repository of MLLA (NeurIPS 2024)
Language: Python 305 3 3415
LeapLabTHU/UltraBot
[Nature Communications 2025] Towards Expert-level Autonomous Carotid Ultrasonography with Large-scale Learning-based Robotic System
Language: Python 228
LeapLabTHU/EfficientTrain
1.5−3.0× lossless training or pre-training speedup. An off-the-shelf, easy-to-implement algorithm for the efficient training of foundation visual backbones.
Language: Python 222 5 119
LeapLabTHU/Slide-Transformer
Official repository of Slide-Transformer (CVPR2023)
Language: Python 172 12 107
LeapLabTHU/ExpeL
Language: Python 163 5 617
LeapLabTHU/Pseudo-Q
[CVPR 2022] Pseudo-Q: Generating Pseudo Language Queries for Visual Grounding
Language: Python 148 3 2210
LeapLabTHU/GSVA
[CVPR2024] GSVA: Generalized Segmentation via Multimodal Large Language Models
Language: Python 130 7 150
LeapLabTHU/ARC
[ICCV 2023] Adaptive Rotated Convolution for Rotated Object Detection
Language: Python 127 3 286
LeapLabTHU/AdaFocusV2
Language: Python 89 2 712
LeapLabTHU/ProCo
[TPAMI 2024] Probabilistic Contrastive Learning for Long-Tailed Visual Recognition
Language: Python 83 3 57
LeapLabTHU/Segment3D
Language: Python 79 2 55
LeapLabTHU/LAUDNet
[IEEE TPAMI] Latency-aware Unified Dynamic Networks for Efficient Image Recognition
Language: Jupyter Notebook 47 3 22
LeapLabTHU/Attention-Mediators
[ECCV 2024] Efficient Diffusion Transformer with Step-wise Dynamic Attention Mediators
Language: Python 45 1 13
LeapLabTHU/InLine
Official repository of InLine attention (NeurIPS 2024)
Language: Python 44 2 61
LeapLabTHU/ImprovedNAT
A PyTorch implementation of the paper "Revisiting Non-Autoregressive Transformers for Efficient Image Synthesis"
Language: Python 43 2 22
LeapLabTHU/Uni-AdaFocus
Official repository of Uni-AdaFocus (TPAMI 2024).
Language: Python 41 1 01
LeapLabTHU/AdaNAT
[ECCV 2024] AdaNAT: Exploring Adaptive Policy for Token-Based Image Generation
Language: Python 33 2 01
LeapLabTHU/OVM3D-Det
Language: Python 312
LeapLabTHU/SimPro
[ICML 2024] SimPro: A Simple Probabilistic Framework Towards Realistic Long-Tailed Semi-Supervised Learning
Language: Python 28 2 21
LeapLabTHU/ENAT
[NeurIPS 2024] ENAT: Rethinking Spatial-temporal Interactions in Token-based Image Synthesis
Language: Python 22 2 0
LeapLabTHU/DAT-Detection
Repository of Vision Transformer with Deformable Attention (CVPR2022) and DAT++: Spatially Dynamic Vision Transformer with Deformable Attention
Language: Python 19 1 42
LeapLabTHU/GridMix
Repository of GridMix (ICLR 2025)
Language: Python 181
LeapLabTHU/UniTTA
Language: Python 16 2 01
LeapLabTHU/diver-ct
Language: Python 13 2 00
LeapLabTHU/CODA
CODA: Repurposing Continuous VAEs for Discrete Tokenization
Language: Python 10
LeapLabTHU/CheXWorld
[CVPR 2025] CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning
Language: Python 30
LeapLabTHU/EchoWorld
[CVPR 2025] EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance
3
LeapLabTHU/DeeR-VLA
Fork of the official code for the paper "DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution"
Language: Python 0 0