This project is the official implementation of the MMIL-Transformer proposed in the paper *Multi-level Multiple Instance Learning with Transformer for Whole Slide Image Classification*.

A new grouping method (MSA grouping) and the corresponding pre-trained weights will be added soon.
- Python 3.8.10
- Pytorch 1.12.1
- torchmetrics 0.4.1
- CUDA 11.6
- numpy 1.24.2
- einops 0.6.0
- scikit-learn 1.2.2
- h5py 3.8.0
- pandas 2.0.0
- nystrom_attention
- argparse (Python standard library)
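The dependencies above can be installed in one step with pip; a sketch assuming the packages carry their usual PyPI names (`sklearn` is published as `scikit-learn`, and `argparse` ships with Python, so neither needs a separate pin beyond the former). Install a CUDA 11.6 build of PyTorch per the official instructions for your platform.

```shell
# Install the pinned dependencies (PyPI package names are assumed;
# pick the torch wheel matching your CUDA version separately if needed)
pip install torch==1.12.1 torchmetrics==0.4.1 numpy==1.24.2 einops==0.6.0 \
    scikit-learn==1.2.2 h5py==3.8.0 pandas==2.0.0 nystrom-attention
```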
Each test experiment was run 10 times, and the average ACC and AUC are reported.
model name | grouping method | weight | ACC | AUC |
---|---|---|---|---|
TCGA_embed | Embedding grouping | HF link | 93.15% | 98.97% |
TCGA_random | Random grouping | HF link | 94.37% | 99.04% |
TCGA_random_with_subbags_0.75masked | Random grouping + mask | HF link | 93.95% | 99.02% |
camelyon16_random | Random grouping | HF link | 91.78% | 94.07% |
camelyon16_random_with_subbags_0.6masked | Random grouping + mask | HF link | 93.41% | 94.74% |
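The "Random grouping" and "+ mask" entries in the table can be illustrated with a short sketch: instances (patch features) are shuffled and split into sub-bags, and the masked variants then drop a fraction of the sub-bags (e.g. 0.75 for the `0.75masked` models). The function names and details below are illustrative, not the repository's exact code.

```python
import numpy as np

def random_grouping(features, num_subbags, seed=0):
    """Shuffle instance features and split them into roughly equal sub-bags.

    features: (N, D) array of patch embeddings. Illustrative sketch only.
    """
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(features))
    return [features[chunk] for chunk in np.array_split(idx, num_subbags)]

def mask_subbags(subbags, mask_ratio, seed=0):
    """Randomly keep only (1 - mask_ratio) of the sub-bags, at least one."""
    rng = np.random.default_rng(seed)
    keep = max(1, int(round(len(subbags) * (1 - mask_ratio))))
    kept_idx = rng.choice(len(subbags), size=keep, replace=False)
    return [subbags[i] for i in sorted(kept_idx)]
```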
We use the same data preprocessing configuration as DSMIL; alternatively, you can directly download the TCGA feature vectors they provide. We use CLAM to preprocess CAMELYON16 at 20x magnification.

Preprocessing WSIs is time-consuming and difficult, so we also provide processed feature vectors for both datasets. The aforementioned works, DSMIL and CLAM, greatly simplified our preprocessing. Thanks again for their wonderful work!
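The provided features are stored as h5 files (passed via `--h5` below). A minimal sketch of reading one, assuming a single dataset of shape `(num_patches, feature_dim)`; the dataset key `features` is an assumption, so inspect the actual files with `f.keys()` first.

```python
import h5py
import numpy as np

# Write a toy feature file in the assumed layout, then read it back.
with h5py.File("slide_features.h5", "w") as f:
    f.create_dataset("features",
                     data=np.random.rand(1000, 512).astype(np.float32))

with h5py.File("slide_features.h5", "r") as f:
    feats = f["features"][:]  # -> (num_patches, feature_dim) array
print(feats.shape)  # (1000, 512)
```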
For TCGA testing:

```shell
python main.py \
    --test {Your_Path_to_Pretrain} \
    --num_test 10 \
    --type TCGA \
    --num_subbags 4 \
    --mode {embed or random} \
    --num_msg 1 \
    --num_layers 2 \
    --csv {Your_Path_to_TCGA_csv} \
    --h5 {Your_Path_to_h5_file}
```
For CAMELYON16 testing:

```shell
python main.py \
    --test {Your_Path_to_Pretrain} \
    --num_test 10 \
    --type camelyon16 \
    --num_subbags 10 \
    --mode random \
    --num_msg 1 \
    --num_layers 2 \
    --csv {Your_Path_to_CAMELYON16_csv} \
    --h5 {Your_Path_to_h5_file}
```
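The `--mode embed` option used for the TCGA models corresponds to "Embedding grouping" in the table. One natural reading is grouping patches by similarity in feature space; the k-means sketch below illustrates that idea under this assumption and may differ from the repository's actual implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def embedding_grouping(features, num_subbags, seed=0):
    """Group patch features into sub-bags by clustering in embedding space.

    Illustrative sketch of the 'embed' mode's idea (feature-space grouping);
    the repository's exact grouping code may differ.
    """
    labels = KMeans(n_clusters=num_subbags, n_init=10,
                    random_state=seed).fit_predict(features)
    return [features[labels == k] for k in range(num_subbags)]
```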