NeurIPS-CellSeg
A naive baseline and submission demo for the microscopy image segmentation challenge in NeurIPS 2022
Requirements
Install requirements by
python -m pip install -r requirements.txt
Preprocessing
Download the training data to the data folder
Preprocess dataset with
python data/pre_process_3class.py
We convert the instance label in ground truth to a three-class label (0-background, 1-interior, 2-edge). After the segmentation, we can convert the three-class label map to an instance map by skimage.measure.label.
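The round trip described above can be illustrated with a small, dependency-free sketch. It is not the code in data/pre_process_3class.py: the edge rule (a cell pixel with a 4-neighbour carrying a different label), the 4-connectivity, and the function names to_three_class and interior_to_instances are all assumptions made for illustration. In practice skimage.measure.label plays the role of the second function.

```python
from collections import deque

# 4-connected neighbourhood (an assumption; other connectivities are possible).
NEIGHBOURS = ((1, 0), (-1, 0), (0, 1), (0, -1))


def to_three_class(inst):
    """Map an instance label map to 0-background, 1-interior, 2-edge."""
    h, w = len(inst), len(inst[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            if inst[r][c] == 0:
                continue
            # Edge pixel: any in-image neighbour has a different label
            # (background or another cell). Out-of-image neighbours are ignored.
            edge = any(
                0 <= r + dr < h and 0 <= c + dc < w
                and inst[r + dr][c + dc] != inst[r][c]
                for dr, dc in NEIGHBOURS
            )
            out[r][c] = 2 if edge else 1
    return out


def interior_to_instances(three):
    """Give every connected interior region (class 1) its own instance id."""
    h, w = len(three), len(three[0])
    inst = [[0] * w for _ in range(h)]
    count = 0
    for r in range(h):
        for c in range(w):
            if three[r][c] != 1 or inst[r][c]:
                continue
            count += 1
            inst[r][c] = count
            queue = deque([(r, c)])
            while queue:  # breadth-first flood fill of one region
                y, x = queue.popleft()
                for dy, dx in NEIGHBOURS:
                    yy, xx = y + dy, x + dx
                    if (0 <= yy < h and 0 <= xx < w
                            and three[yy][xx] == 1 and not inst[yy][xx]):
                        inst[yy][xx] = count
                        queue.append((yy, xx))
    return inst, count


# Two touching cells: the edge class is what separates them again.
gt = [[1] * 5 + [2] * 5 for _ in range(5)]
three = to_three_class(gt)
recovered, n_cells = interior_to_instances(three)
```

Note that the recovered instances are slightly smaller than the originals, because the edge pixels are dropped rather than reassigned to a cell.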
Please always keep in mind that this is an instance segmentation task. The baseline is a very simple and naive solution. We highly recommend trying the state-of-the-art methods mentioned at the end.
Training
See all training options with
python baseline/model_training_3class.py --help
Train baseline model with
python baseline/model_training_3class.py --data_path 'path to training data' --batch_size 8
Inference
Run
python predict.py -i input_path -o output_path
Your prediction script should accept at least two arguments: input_path and output_path. These two arguments are what connect the local folders to the Docker folders.
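As a minimal sketch of the argument handling, a prediction script could parse the two paths as follows. Only the short flags -i and -o come from the command above; the long option names, the help strings, and the function names build_parser and list_inputs are hypothetical, and the actual model inference is left out.

```python
import argparse
import os


def build_parser():
    """Command-line interface with the two required folder arguments."""
    parser = argparse.ArgumentParser(description="Hypothetical predict.py skeleton")
    # Long option names are assumptions; the README only shows -i and -o.
    parser.add_argument("-i", "--input_path", required=True,
                        help="folder containing the images to segment")
    parser.add_argument("-o", "--output_path", required=True,
                        help="folder where the segmentation results are written")
    return parser


def list_inputs(input_path):
    """Return the file names in the input folder in a stable order."""
    return sorted(os.listdir(input_path))


# Inside the container these are the mapped workspace folders.
args = build_parser().parse_args(["-i", "/workspace/inputs/", "-o", "/workspace/outputs/"])
```

In a real script, parse_args() would be called without a list so that it reads the command line, and each file returned by list_inputs would be passed through the model before being saved under args.output_path.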
Compute Evaluation Metric (F1 Score)
Run
python compute_metric.py --gt_path path_to_labels --seg_path path_to_segmentation
Cells on the boundaries are not considered during evaluation.
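For orientation, the sketch below shows the standard F1 formula and one possible way to exclude boundary cells. It is not taken from compute_metric.py: how predicted and ground-truth cells are matched is not shown, and the reading of "on the boundaries" as "touching the outermost row or column of the image" is an assumption, as are the function names.

```python
def f1_score(tp, fp, fn):
    """F1 = 2*TP / (2*TP + FP + FN); returns 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def drop_border_labels(inst):
    """Zero out every instance that touches the image border (assumed rule)."""
    h, w = len(inst), len(inst[0])
    border = {
        inst[r][c]
        for r in range(h)
        for c in range(w)
        if inst[r][c] and (r in (0, h - 1) or c in (0, w - 1))
    }
    return [[0 if v in border else v for v in row] for row in inst]


labels = [
    [1, 0, 0, 0],
    [0, 2, 2, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 3],
]
kept = drop_border_labels(labels)   # cells 1 and 3 touch the border
score = f1_score(tp=8, fp=2, fn=2)  # 16 / 20
```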
Build Docker
We recommend this great tutorial: https://nbviewer.org/github/ericspod/ContainersForCollaboration/blob/master/ContainersForCollaboration.ipynb
1) Preparation
The docker is built based on MONAI
docker pull projectmonai/monai
Prepare Dockerfile
FROM projectmonai/monai:latest
WORKDIR /workspace
COPY ./ /workspace
Put the inference command in the predict.sh
#!/bin/bash -e
python predict.py -i "/workspace/inputs/" -o "/workspace/outputs/"
The input_path and output_path arguments should point to the Docker workspace folders rather than local folders, because the local folders are mapped to the Docker workspace folders when the container runs.
2) Build Docker and make sanity test
The submitted docker will be evaluated by the following command:
docker container run --gpus "device=0" -m 28G --name teamname --rm -v $PWD/CellSeg_Test/:/workspace/inputs/ -v $PWD/teamname_seg/:/workspace/outputs/ teamname:latest /bin/bash -c "sh predict.sh"
- --gpus: specify the available GPU during inference
- -m: specify the maximum RAM
- --name: container name during running
- --rm: remove the container after running
- -v $PWD/CellSeg_Test/:/workspace/inputs/: map the local image data folder to the Docker workspace/inputs folder
- -v $PWD/teamname_seg/:/workspace/outputs/: map the Docker workspace/outputs folder to a local folder. The segmentation results will be in $PWD/teamname_seg
- teamname:latest: Docker image name (should be teamname) and its version tag. The version tag should be latest. Please do not use v0, v1, ... as the version tag
- /bin/bash -c "sh predict.sh": start the prediction command. It will load testing images from workspace/inputs and save the segmentation results to workspace/outputs
Assuming the team name is baseline, the Docker build command is
docker build -t baseline .
Test the Docker image to make sure it works. There should be segmentation results in the baseline_seg folder.
docker container run --gpus "device=0" -m 28G --name baseline --rm -v $PWD/TuningSet/:/workspace/inputs/ -v $PWD/baseline_seg/:/workspace/outputs/ baseline:latest /bin/bash -c "sh predict.sh"
During inference, please monitor the GPU memory consumption using watch nvidia-smi. The GPU memory consumption should stay below 10G; otherwise, the container will run into an OOM error on the official evaluation server.
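If a recorded peak is more convenient than watching the terminal, a small polling script along these lines could be run next to the container. It is a sketch under assumptions: "10G" is interpreted as 10 GiB, the function names are made up for illustration, and memory.used reports the total memory in use on the GPU, so other processes on the same device are included.

```python
import subprocess
import time

LIMIT_MIB = 10 * 1024  # assumption: the "10G" limit means 10 GiB


def parse_used_mib(text):
    """Parse nvidia-smi CSV output: one integer (MiB) per GPU, one per line."""
    return [int(line.strip()) for line in text.splitlines() if line.strip()]


def query_used_mib():
    """Ask nvidia-smi for the memory currently in use on every GPU."""
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.used", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    )
    return parse_used_mib(result.stdout)


def watch_peak(duration_s=60.0, interval_s=1.0, gpu_index=0):
    """Poll one GPU for duration_s seconds and return the peak usage in MiB."""
    peak = 0
    deadline = time.monotonic() + duration_s
    while time.monotonic() < deadline:
        peak = max(peak, query_used_mib()[gpu_index])
        time.sleep(interval_s)
    return peak
```

Calling watch_peak() while the container is running and comparing the result with LIMIT_MIB gives a quick check against the limit; polling once per second can miss short spikes.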
3) Save Docker
docker save baseline | gzip -c > baseline.tar.gz
Upload the Docker image to Google Drive (example) or Baidu Net Disk (example) and send the download link to NeurIPS.CellSeg@gmail.com.
Please do not upload the Docker to dockerhub!
Limitations and potential improvements
The naive baseline's primary aim is to give participants out-of-the-box scripts that can generate successful submissions. Thus, there are many ways to surpass this baseline:
- New cell representation methods. In the baseline, we separated touching cells by simply removing their boundaries. More advanced cell representation could be used to address this issue, such as stardist, cellpose, omnipose, deepcell, and so on. We highly recommend trying these SOTA methods.
- New architectures
- More data augmentations and the use of additional public datasets or the set of unlabeled data provided.
- Well-designed training protocols
- Postprocessing