Awesome-Vision-Mamba

✨✨Latest Papers on Vision Mamba and Related Areas

Survey

Vision Mamba: A Comprehensive Survey and Taxonomy [arxiv]
A Survey on Vision Mamba: Models, Applications and Challenges [arxiv]
Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges [arxiv]
A Survey on Visual Mamba [arxiv]
State Space Model for New-Generation Network Alternative to Transformers: A Survey [arxiv]

Computer Vision

QueryMamba: A Mamba-Based Encoder-Decoder Architecture with a Statistical Verb-Noun Interaction Module for Video Action Forecasting @ Ego4D Long-Term Action Anticipation Challenge 2024 [arxiv]
MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders [arxiv] [code]
VFIMamba: Video Frame Interpolation with State Space Models [arxiv] [code]
Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model [arxiv] [code]
VideoMambaPro: A Leap Forward for Mamba in Video Understanding [arxiv] [code]
Mamba24/8D: Enhancing Global Interaction in Point Clouds via State Space Model [arxiv]
SUM: Saliency Unification through Mamba for Visual Attention Modeling [arxiv] [code]
Vision Mamba-based autonomous crack segmentation on concrete, asphalt, and masonry surfaces [arxiv]
LFMamba: Light Field Image Super-Resolution with State Space Model [arxiv]
Voxel Mamba: Group-Free State Space Models for Point Cloud based 3D Object Detection [arxiv] [code]
PyramidMamba: Rethinking Pyramid Feature Fusion with Selective Space State Model for Semantic Segmentation of Remote Sensing Imagery [arxiv] [code]
Q-Mamba: On First Exploration of Vision Mamba for Image Quality Assessment [arxiv]
PixMamba: Leveraging State Space Models in a Dual-Level Architecture for Underwater Image Enhancement [arxiv] [code]
Towards Evaluating the Robustness of Visual State Space Models [arxiv] [code]
DualMamba: A Lightweight Spectral-Spatial Mamba-Convolution Network for Hyperspectral Image Classification [arxiv]
Autoregressive Pretraining with Mamba in Vision [arxiv] [code]
MambaDepth: Enhancing Long-range Dependency for Self-Supervised Fine-Structured Monocular Depth Estimation [arxiv] [code]
Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs [arxiv]
MHS-VM: Multi-Head Scanning in Parallel Subspaces for Vision Mamba [arxiv] [code]
HDMba: Hyperspectral Remote Sensing Imagery Dehazing with State Space Model [arxiv] [code]
Mamba YOLO: SSMs-Based YOLO For Object Detection [arxiv] [code]
MVGamba: Unify 3D Content Generation as State Space Sequence Modeling [arxiv]
RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation [arxiv] [code]
GrootVL: Tree Topology is All You Need in State Space Model [arxiv] [code]
CDMamba: Remote Sensing Image Change Detection with Mamba [arxiv] [code]
LLEMamba: Low-Light Enhancement via Relighting-Guided Mamba with Deep Unfolding Network [arxiv]
Dimba: Transformer-Mamba Diffusion Models [arxiv] [code]
S4Fusion: Saliency-aware Selective State Space Model for Infrared Visible Image Fusion [arxiv]
DeMamba: AI-Generated Video Detection on Million-Scale GenVideo Benchmark [arxiv] [code]
FourierMamba: Fourier Learning Integration with State Space Models for Image Deraining [arxiv]
Vim-F: Visual State Space Model Benefiting from Learning in the Frequency Domain [arxiv] [code]
MambaLLIE: Implicit Retinex-Aware Low Light Enhancement with Global-then-Local State Space [arxiv] [code]
Image Deraining with Frequency-Enhanced State Space Model [arxiv]
Demystify Mamba in Vision: A Linear Attention Perspective [arxiv] [code]
MambaVC: Learned Visual Compression with Selective State Spaces [arxiv]
PoinTramba: A Hybrid Transformer-Mamba Framework for Point Cloud Analysis [arxiv] [code]
Meteor: Mamba-based Traversal of Rationale for Large Language and Vision Models [arxiv] [code]
Multi-Scale VMamba: Hierarchy in Hierarchy Visual State Space Model [arxiv] [code]
DiM: Diffusion Mamba for Efficient High-Resolution Image Synthesis [arxiv]
MAMBA4D: Efficient Long-Sequence Point Cloud Video Understanding with Disentangled Spatial-Temporal State Space Models [arxiv]
Scalable Visual State Space Model with Fractal Scanning [arxiv]
Efficient Visual State Space Model for Image Deblurring [arxiv]
Mamba®: Vision Mamba ALSO Needs Registers [arxiv] [code]
3DSS-Mamba: 3D-Spectral-Spatial Mamba for Hyperspectral Image Classification [arxiv]
Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification [arxiv] [code]
CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation [arxiv] [code]
IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model [arxiv] [code]
RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing [arxiv]
WaterMamba: Visual State Space Model for Underwater Image Enhancement [arxiv]
Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study [arxiv]
MambaOut: Do We Really Need Mamba for Vision? [arxiv] [code]
OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition [arxiv]
Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba [arxiv]
Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution [arxiv]
StyleMamba: State Space Model for Efficient Text-driven Image Style Transfer [arxiv]
VMambaCC: A Visual State Space Model for Crowd Counting [arxiv]
DVMSR: Distillated Vision Mamba for Efficient Super-Resolution [arxiv] [code]
SMCD: High Realism Motion Style Transfer via Mamba-based Diffusion [arxiv]
Matten: Video Generation with Mamba-Attention [arxiv]
Retinexmamba: Retinex-based Mamba for Low-light Image Enhancement [arxiv] [code]
MemoryMamba: Memory-Augmented State Space Model for Defect Recognition [arxiv]
SSUMamba: Spatial-Spectral Selective State Space Model for Hyperspectral Image Denoising [arxiv] [code]
FER-YOLO-Mamba: Facial Expression Detection and Classification Based on Selective State Space [arxiv] [code]
CLIP-Mamba: CLIP Pretrained Mamba Models with OOD and Hessian Evaluation [arxiv] [code]
Mamba-FETrack: Frame-Event Tracking via State Space Model [arxiv] [code]
S2Mamba: A Spatial-spectral State Space Model for Hyperspectral Image Classification [arxiv] [code]
Spectral-Spatial Mamba for Hyperspectral Image Classification [arxiv]
RSCaMa: Remote Sensing Image Change Captioning with State Space Model [arxiv] [code]
Sparse Reconstruction of Optical Doppler Tomography Based on State Space Model [arxiv]
CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions [arxiv] [code]
Mamba3D: Enhancing Local Features for 3D Point Cloud Analysis via State Space Model [arxiv]
MambaUIE: Unraveling the Ocean's Secrets with Only 2.8 FLOPs [arxiv] [code]
MambaMOS: LiDAR-based 3D Moving Object Segmentation with Motion-aware State Space Model [arxiv] [code]
CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration [arxiv]
MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking [arxiv]
Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion [arxiv]
A Novel State Space Model with Local Enhancement and State Sharing for Image Fusion [arxiv]
Fusion-Mamba for Cross-modality Object Detection [arxiv]
FreqMamba: Viewing Mamba from a Frequency Perspective for Image Deraining [arxiv]
HSIDMamba: Exploring Bidirectional State-Space Models for Hyperspectral Denoising [arxiv]
MambaDFuse: A Mamba-based Dual-phase Model for Multi-modality Image Fusion [arxiv]
SpectralMamba: Efficient Mamba for Hyperspectral Image Classification [arxiv] [code]
Simba: Mamba augmented U-ShiftGCN for Skeletal Action Recognition in Videos [arxiv]
DGMamba: Domain Generalization via Generalized State Space Model [arxiv] [code]
FusionMamba: Efficient Image Fusion with State Space Model [arxiv]
MambaAD: Exploring State Space Models for Multi-class Unsupervised Anomaly Detection [arxiv] [code]
3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion [arxiv]
RhythmMamba: Fast Remote Physiological Measurement with Arbitrary Length Videos [arxiv] [code]
Sigma: Siamese Mamba Network for Multi-Modal Semantic Segmentation [arxiv] [code]
ChangeMamba: Remote Sensing Change Detection with Spatio-Temporal State Space Model [arxiv] [code]
InsectMamba: Insect Pest Classification with State Space Model [arxiv]
RS3Mamba: Visual State Space Model for Remote Sensing Images Semantic Segmentation [arxiv] [code]
RS-Mamba for Large Remote Sensing Image Dense Prediction [arxiv] [code]
Samba: Semantic Segmentation of Remotely Sensed Images with State Space Model [arxiv] [code]
HSIMamba: Hyperspectral Imaging Efficient Feature Learning with Bidirectional State Space for Classification [arxiv]
SpikeMba: Multi-Modal Spiking Saliency Mamba for Temporal Video Grounding [arxiv]
MambaMixer: Efficient Selective State Space Models with Dual Token and Channel Selection [arxiv] [code]
Aggregating Local and Global Features via Selective State Spaces Model for Efficient Image Deblurring [arxiv]
HARMamba: Efficient Wearable Sensor Human Activity Recognition Based on Bidirectional Selective SSM [arxiv]
RSMamba: Remote Sensing Image Classification with State Space Model [arxiv] [code]
Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstruction [arxiv]
Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis Lesion [arxiv]
PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition [arxiv] [code]
ReMamber: Referring Image Segmentation with Mamba Twister [arxiv]
VMRNN: Integrating Vision Mamba and LSTM for Efficient and Accurate Spatiotemporal Forecasting [arxiv] [code]
SiMBA: Simplified Mamba-based Architecture for Vision and Multivariate Time series [arxiv] [code]
Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference [arxiv] [code]
VL-Mamba: Exploring State Space Models for Multimodal Learning [arxiv]
ZigMa: Zigzag Mamba Diffusion Model [arxiv] [code]
VmambaIR: Visual State Space Model for Image Restoration [arxiv] [code]
LocalMamba: Visual State Space Model with Windowed Selective Scan [arxiv] [code]
MambaTalk: Efficient Holistic Gesture Synthesis with Selective State Space Models [arxiv]
Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding [arxiv] [code]
Motion Mamba: Efficient and Long Sequence Motion Generation with Hierarchical and Bidirectional Selective SSM [arxiv] [code]
Point Mamba: A Novel Point Cloud Backbone Based on State Space Model with Octree-Based Ordering Strategy [arxiv] [code]
VideoMamba: State Space Model for Efficient Video Understanding [arxiv] [code]
MiM-ISTD: Mamba-in-Mamba for Efficient Infrared Small Target Detection [arxiv] [code]
Point Cloud Mamba: Point Cloud Learning via State Space Model [arxiv] [code]
Res-VMamba: Fine-Grained Food Category Visual Classification Using Selective State Space Models with Deep Residual Learning [arxiv] [code]
MambaIR: A Simple Baseline for Image Restoration with State-Space Model [arxiv] [code]
Pan-Mamba: Effective pan-sharpening with State Space Model [arxiv] [code]
PointMamba: A Simple State Space Model for Point Cloud Analysis [arxiv] [code]
Scalable Diffusion Models with State Space Backbone [arxiv] [code]
Mamba-ND: Selective State Space Modeling for Multi-Dimensional Data [arxiv]
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model [arxiv] [code]
VMamba: Visual State Space Model [arxiv] [code]
U-shaped Vision Mamba for Single Image Dehazing [arxiv] [code]

Medical Imaging

Vision Mamba for Classification of Breast Ultrasound Images [arxiv]
MMR-Mamba: Multi-Contrast MRI Reconstruction with Mamba and Spatial-Frequency Information Fusion [arxiv]
Soft Masked Mamba Diffusion Model for CT to MRI Conversion [arxiv] [code]
SEDMamba: Enhancing Selective State Space Modelling with Bottleneck Mechanism and Fine-to-Coarse Temporal Fusion for Efficient Error Detection in Robot-Assisted Surgery [arxiv]
Vision Mamba: Cutting-Edge Classification of Alzheimer's Disease with 3D MRI Scans [arxiv]
Convolution and Attention-Free Mamba-based Cardiac Image Segmentation [arxiv]
MUCM-Net: A Mamba Powered UCM-Net for Skin Lesion Segmentation [arxiv] [code]
I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling [arxiv]
VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis [arxiv]
HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image Segmentation [arxiv]
AC-MAMBASEG: An adaptive convolution and Mamba-based architecture for enhanced skin lesion segmentation [arxiv] [code]
Vim4Path: Self-Supervised Vision Mamba for Histopathology Images [arxiv] [code]
FusionMamba: Dynamic Feature Enhancement for Multimodal Image Fusion with Mamba [arxiv] [code]
ViM-UNet: Vision Mamba for Biomedical Segmentation [arxiv] [code]
VMambaMorph: a Visual Mamba-based Framework with Cross-Scan Module for Deformable 3D Image Registration [arxiv] [code]
T-Mamba: Frequency-Enhanced Gated Long-Range Dependency for Tooth 3D CBCT Segmentation [arxiv] [code]
Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image Segmentation [arxiv]
H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation [arxiv] [code]
ProMamba: Prompt-Mamba for polyp segmentation [arxiv]
VM-UNET-V2: Rethinking Vision Mamba UNet for Medical Image Segmentation [arxiv] [code]
MD-Dose: A diffusion model based on the Mamba for radiation dose prediction [arxiv] [code]
Large Window-based Mamba UNet for Medical Image Segmentation: Beyond Convolution and Self-attention [arxiv] [code]
MambaMIL: Enhancing Long Sequence Modeling with Sequence Reordering in Computational Pathology [arxiv] [code]
LightM-UNet: Mamba Assists in Lightweight UNet for Medical Image Segmentation [arxiv] [code]
MamMIL: Multiple Instance Learning for Whole Slide Images with State Space Models [arxiv]
MedMamba: Vision Mamba for Medical Image Classification [arxiv] [code]
MambaMIR: An Arbitrary-Masked Mamba for Joint Medical Image Reconstruction and Uncertainty Estimation [arxiv] [code]
Weak-Mamba-UNet: Visual Mamba Makes CNN and ViT Work Better for Scribble-based Medical Image Segmentation [arxiv] [code]
P-Mamba: Marrying Perona Malik Diffusion with Mamba for Efficient Pediatric Echocardiographic Left Ventricular Segmentation [arxiv]
Semi-Mamba-UNet: Pixel-Level Contrastive Cross-Supervised Visual Mamba-based UNet for Semi-Supervised Medical Image Segmentation [arxiv] [code]
FD-Vision Mamba for Endoscopic Exposure Correction [arxiv] [code]
MambaMorph: a Mamba-based Backbone with Contrastive Feature Learning for Deformable MR-CT Registration [arxiv] [code]
Vivim: a Video Vision Mamba for Medical Video Object Segmentation [arxiv] [code]
U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation [arxiv] [code]
Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining [arxiv] [code]
nnMamba: 3D Biomedical Image Segmentation, Classification and Landmark Detection with State Space Model [arxiv] [code]
SegMamba: Long-range Sequential Modeling Mamba For 3D Medical Image Segmentation [arxiv] [code]
VM-UNet: Vision Mamba UNet for Medical Image Segmentation [arxiv] [code]
Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation [arxiv] [code]

jieLin-world/Awesome-Vision-Mamba
