This repo serves as reproduction code for the following papers:

- "Deployment of Image Analysis Algorithms under Prevalence Shifts" (see Springer and arXiv), published as a conference paper at MICCAI 2023.
- "Navigating Prevalence Shifts in Image Analysis Algorithm Deployment" (unpublished), an extended version of the former, submitted to the Medical Image Analysis journal.

In the following, the reproduction descriptions focus only on the latter, extended version. To reproduce our original results, please check out the respective git tag "miccai23".

DISCLAIMER: The `mml` dependency has not been published yet; this means the training part cannot be reproduced publicly so far. We are working on it and will release this dependency later. All evaluation and plotting scripts are available. We provide three sample tasks with the produced predictions in `/data/`. The section "mml-free reproducibility" provides instructions to produce plots for the sample tasks.
Table of contents:

- code structure
- installation
- image data preparation
- model training and prediction generation
- experiments and figures
## Code structure

The code of this repository is structured as follows in `src`:

- `mml-plugin` implements an `mml` plugin to re-distribute samples within a task according to our needs
- `prev` contains definitions and routines that are shared throughout our experiments (note that `__init__.py` modifies `psrcal` behaviour)
- `training_scripts` contains all commands with respect to task data (prepare, preprocess) and neural networks (train and predict)
- the notebooks `1_...` to `8_...` contain the steps to reproduce our experiments
## mml-free reproducibility

To run the notebooks and reproduce the plots exactly, you might need to install the font we used:

- download `NewCM10-Regular.otf` (as sketched below)
- place it in the `/data` folder at the project root (necessary for notebook 5)
- install the font (the details of this step depend on your OS)
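A minimal sketch of the download step, assuming a Unix-like shell and the CTAN mirror URL given in the installation section below (the font can of course also be downloaded manually):

```bash
# download the font into the repository's data folder
# (CTAN mirror URL as given in the installation section; adjust if needed)
curl -L -o data/NewCM10-Regular.otf \
  http://mirrors.ctan.org/fonts/newcomputermodern/otf/NewCM10-Regular.otf
```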
- create a virtualenv with conda, install python 3.10, and install the requirements

```bash
conda update -n base -c defaults conda
conda create --yes --name prev python=3.10
conda activate prev
git clone https://github.com/IMSY-DKFZ/prevalence-shifts.git
cd prevalence-shifts
pip install -r requirements.txt
```
- Run the following notebooks to follow our experiments:
  - `3_prevalence_estimation.ipynb` - Research Question 1, creates Figures 5 and C.1
  - `4_calibration.ipynb` - Research Question 2a, creates Figure 6
  - `5_threshold_visualization.ipynb` - Research Question 2b, creates Figures 3, 7, and 8
  - `6_decision_rule.ipynb` - Research Question 2b, creates Figure 9
  - `7_validation_metrics.ipynb` - Research Question 2c, creates Figure 10
  - `8_uncertainty.ipynb` - creates uncertainty tables
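The notebooks can be opened from the repository root, e.g. as follows (a minimal sketch; it assumes Jupyter is available in the `prev` environment, for instance via `requirements.txt`, otherwise `pip install jupyter` first):

```bash
# start Jupyter from the repository root so relative paths like data/ resolve
conda activate prev
jupyter notebook
```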
## Installation

DISCLAIMER: mml is not yet public. Please follow the installation instructions from the "mml-free reproducibility" section above instead.
- create a virtualenv with conda and install python 3.10

```bash
conda update -n base -c defaults conda
conda create --yes --name prev python=3.10
conda activate prev
```
- install `mml-core` and the `mml-data` plugin

```bash
pip install --index-url https://mmlToken:<personal_access_token>@git.dkfz.de/api/v4/projects/89/packages/pypi/simple mml-core==0.13.3
pip install mml-data==0.4.1 --index-url https://__token__:<your_personal_token>@git.dkfz.de/api/v4/projects/89/packages/pypi/simple
```
- install the local prevalence plugin and other requirements

```bash
git clone https://github.com/IMSY-DKFZ/prevalence-shifts.git
cd prevalence-shifts
pip install -r requirements.txt
cd src/mml_plugin/prevalences
pip install .
```
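As an optional sanity check (a minimal sketch, not part of the original instructions; the exact distribution name of the local plugin is an assumption):

```bash
# the mml packages (and the local plugin, whose exact
# distribution name may differ) should now be listed
pip list | grep -i mml
```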
- install the fonts (http://mirrors.ctan.org/fonts/newcomputermodern/otf/NewCM10-Regular.otf)
- set up the environment variables for `mml`

```bash
cd ../../..
mml-env-setup
nano mml.env  # modify at least MML_DATA_PATH, MML_RESULTS_PATH and MML_LOCAL_WORKERS accordingly
pwd | conda env config vars set MML_ENV_PATH=$(</dev/stdin)/mml.env
conda activate prev  # re-activate so the new environment variable takes effect
```
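To verify that the variable was stored (a minimal check, assuming the environment is named `prev` as above):

```bash
# list the environment variables configured for the conda environment
conda env config vars list -n prev
```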
DISCLAIMER: This section requires mml which is not yet public.
- the data and predictions generation process in handled with the
mml
framework - the commands to leverage
mml
are generated in1_generate_predictions.ipynb
and stored intraining_scripts
01_create_cmds.txt
for data download / task generation02_pp_cmds.txt
for data preprocessing03_tag_cmds.txt
for the splitting of tasks according to our experimental setup (train-validation-development test and deployment test)05_dataseed_cmds.txt
for creating 5 additional splittings with different splitting seeds
- if the commands shall be run on some external infrastructure (like a GPU cluster) the
1_generate_predictions.ipynb
contains configuration possibilities to adapt the txt files - the commands can be run locally by
bash 0X_XXX_cmds.txt
(stick to the order indicated by numbering)
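A minimal local run sketch, assuming the txt files were generated into `training_scripts` and are executed from that folder:

```bash
# run the generated command files locally, in the indicated order
cd training_scripts
bash 01_create_cmds.txt    # data download / task generation
bash 02_pp_cmds.txt        # data preprocessing
bash 03_tag_cmds.txt       # task splitting
bash 05_dataseed_cmds.txt  # 5 additional splittings with different seeds
```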
## Model training and prediction generation

DISCLAIMER: This section requires mml, which is not yet public.

- once more `mml` is leveraged for this step; the commands are generated in `1_generate_predictions.ipynb` and stored in `training_scripts`
- for the uncertainty assessment (`8_uncertainty.ipynb`):
  - `04_reproduce_cmds.txt` for re-training the original experiments with 6 additional seeds on the original splits
  - `06_dataseed_predict_cmds.txt` for re-training each task once per additional splitting
- for the re-calibration assessment (`4_calibration.ipynb`):
  - `07_retraining_cmds.txt` for re-training on the original splits with loss weights adapted according to the exact deployment prevalences (for each IR 1.0, 1.5, ..., 10.0)
  - `08_retraining_estimated_cmds.txt` for re-training on the original splits with loss weights adapted according to estimated deployment prevalences, using ACC (for each IR 1.0, 1.5, ..., 10.0)
- the files can be run or adapted as mentioned before; for cluster usage see the sketch below
- keep in mind that they comprise (6 + 5 + 19 + 19) * 30 = 1470 training and prediction pipelines and take some time to complete
- in our analysis we also used the original training runs and the previous 3 additional seeds next to `06_dataseed_predict_cmds.txt` (in sum 10 seeded repetitions on the original split)
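For external infrastructure, a hypothetical submission sketch (it assumes a SLURM-based cluster; the actual adaptation of the txt files happens via the configuration options in `1_generate_predictions.ipynb`):

```bash
# hypothetical sketch: submit every line of a generated command file
# as a separate SLURM job
while IFS= read -r cmd; do
  sbatch --wrap="$cmd"
done < training_scripts/04_reproduce_cmds.txt
```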
## Experiments and figures

DISCLAIMER: For licensing reasons we may not provide all predictions, but we attach some sample predictions in `/data`. We also provide intermediate results:

- `24_prev_estimation_df.pkl` - holds all quantification results (RQ1)
- `24_recalibration_results.csv` - holds all re-calibration results (RQ2a)
- `24_decision_rule_results_....pkl` - holds results on applying various decision rules (RQ2b)
- `24_metric_performance_....pkl` - holds all metric evaluations (RQ2c)
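These files can also be inspected directly, e.g. as follows (a minimal sketch; it assumes pandas is installed, for instance via `requirements.txt`, and that the files reside in `/data`):

```bash
# peek into one of the provided intermediate result files
python -c "import pandas as pd; print(pd.read_pickle('data/24_prev_estimation_df.pkl').head())"
```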
- navigate to the top level folder named `data` and store the project folders generated by the previous commands in there; more precisely:
  - locate your `MML_RESULTS_PATH` as provided in the installation
  - within it, search for the project folders:
    - original publication (n=4): `mic23_predictions_original_0`, `mic23_predictions_reproduce_0`, ..., `.._2`
    - generated by `04_reproduce_cmds.txt` (n=6): `mic23_predictions_original_10`, ..., `.._15`
    - generated by `06_dataseed_predict_cmds.txt` (n=5): `mic23_predictions_datasplit_seed_3`, `..._31`, `..._314`, `..._3141`, `..._31415`
    - generated by `07_retraining_cmds.txt` (n=19): `mic23_predictions_extension_balanced_0_1.0`, `..._0_1.5`, `..._0_2.0`, ..., `..._0_10.0`
    - generated by `08_retraining_estimated_cmds.txt` (n=19): `mic23_predictions_extension_balanced_estimated_0_1.0`, `..._0_1.5`, `..._0_2.0`, ..., `..._0_10.0`
  - copy those to the `data` folder next to `src` (see the sketch below)
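A minimal copy sketch, assuming `MML_RESULTS_PATH` is set in your shell (it is configured in `mml.env` during installation) and that all relevant project folders share the `mic23_predictions_` prefix shown above:

```bash
# copy the prediction project folders into the repository's data folder;
# MML_RESULTS_PATH must be set in the shell (it is configured in mml.env)
cp -r "$MML_RESULTS_PATH"/mic23_predictions_* ./data/
```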
- now you can load the predictions inside the jupyter notebooks; they should run straightforwardly and produce the figures in `results`, next to `src` and `data`
Notebooks:

- `2_plot_examples.ipynb` - task overview, creates Figure 4
- `3_prevalence_estimation.ipynb` - Research Question 1, creates Figures 5 and C.1
- `4_calibration.ipynb` - Research Question 2a, creates Figure 6
- `5_threshold_visualization.ipynb` - Research Question 2b, creates Figures 3, 7, and 8
- `6_decision_rule.ipynb` - Research Question 2b, creates Figure 9
- `7_validation_metrics.ipynb` - Research Question 2c, creates Figure 10
- `8_uncertainty.ipynb` - creates uncertainty tables