
MemSAM

MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation, CVPR 2024, Oral

Xiaolong Deng^, Huisi Wu*, Runhao Zeng, Jing Qin

[Paper] [Video] [Project]

Figure: MemSAM design overview.

Installation

conda create --name memsam python=3.10
conda activate memsam
pip install torch torchvision --index-url https://download.pytorch.org/whl/cu118
pip install -r requirements.txt
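
To verify the installation, here is an optional sanity check (not part of the original instructions) that prints the PyTorch version and whether CUDA is visible:

# check that the CUDA build of PyTorch is working
python -c "import torch; print(torch.__version__, torch.cuda.is_available())"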

Usage

prepare dataset

First, download the CAMUS and EchoNet-Dynamic datasets from their official sources.

Then preprocess the datasets with utils/preprocess_camus.py and utils/preprocess_echonet.py, for example:

# CAMUS
python utils/preprocess_camus.py -i /data/dengxiaolong/CAMUS_public/database_nifti -o /data/dengxiaolong/memsam/CAMUS_public

# EchoNet-Dynamic
python utils/preprocess_echonet.py -i /data/dengxiaolong/EchoNet-Dynamic -o /data/dengxiaolong/memsam/EchoNet
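
After preprocessing, you can quickly confirm that the output directories were populated (the paths below simply mirror the -o arguments in the example commands above):

# list the processed outputs
ls /data/dengxiaolong/memsam/CAMUS_public
ls /data/dengxiaolong/memsam/EchoNet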

pretrained checkpoint download

ViT-B SAM model (sam_vit_b_01ec64.pth, from the official Segment Anything repository)
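
For example, the checkpoint can be fetched directly from the official Segment Anything release (the checkpoints/ target directory is an assumption; put the file wherever your configuration expects it):

# download the official SAM ViT-B weights
mkdir -p checkpoints
wget -P checkpoints https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth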

train and test

Use train_video.py for training and test_video.py for testing.
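
For example (a sketch only; the flags below are hypothetical, so consult the argument parsers in the two scripts for the actual options):

# hypothetical flags -- see train_video.py / test_video.py for the real interface
python train_video.py --dataset CAMUS --checkpoint checkpoints/sam_vit_b_01ec64.pth
python test_video.py --dataset CAMUS --checkpoint <path_to_trained_model>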

Acknowledgement

This work builds on SAM, SAMUS, and XMem. Thanks to the authors for their open-source contributions!

Citation

If you find our work useful, please cite our paper. Thank you!

@InProceedings{Deng_2024_CVPR,
    author    = {Deng, Xiaolong and Wu, Huisi and Zeng, Runhao and Qin, Jing},
    title     = {MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {9622--9631}
}