This project implements an ecosystem for multi-annotator learning approaches.
- `data_collection`: scripts to emulate or adjust our data collection, including the annotation campaign via Label Studio
  - `label_studio_interfaces`: scripts to perform an annotation campaign via Label Studio using the example of the dataset `dopanim`
    - `annotation.xml`: code for the annotation interface
    - `post-questionnaire.xml`: code for the post-questionnaire interface
    - `pre-questionnaire.xml`: code for the pre-questionnaire interface
  - `python_scripts`: scripts to download task data from iNaturalist and to prepare it for annotation via Label Studio
    - `annotation_tasks.py`: script to create batches of annotation tasks for the upload to Label Studio
    - `download.py`: script to download data from iNaturalist
    - `preprocessing.py`: script to preprocess downloaded data
    - `taxons.py`: contains taxon names and IDs to be downloaded from iNaturalist
- `empirical_evaluation`: scripts to reproduce or adjust our empirical evaluation, including the benchmark and case studies
  - `hydra_configs`: collection of `hydra` config files for defining hyperparameters
    - `architecture`: config group of config files for network architectures
    - `classifier`: config group of config files for multi-annotator classification approaches
    - `data`: config group of config files for datasets
    - `ssl_model`: config group of config files for self-supervised learning models as backbones
    - `experiment.yaml`: config file to define the architecture(s), dataset, and multi-annotator classification approach for an experiment
  - `jupyter_notebooks`: Jupyter notebooks to analyze results or use cases
    - `analyze_collected_data.ipynb`: Jupyter notebook to analyze the dataset `dopanim`
    - `annotation_times_active_learning.ipynb`: Jupyter notebook to reproduce the use case on annotation times in active learning for the dataset `dopanim`
    - `t_sne_features.ipynb`: Jupyter notebook to create the t-SNE plots of self-supervised features for the dataset `dopanim`
    - `tabular_results.ipynb`: Jupyter notebook to create the tables of results obtained after executing the experiments for the dataset `dopanim`
  - `python_scripts`: collection of scripts to perform the experimental evaluation
    - `perform_experiment.py`: script to execute a single experiment for a given configuration
    - `write_bash_scripts.py`: script to write Bash or Slurm scripts for the evaluation
- `maml`: Python package for multi-annotator machine learning consisting of several sub-packages
  - `architectures`: implementations of network architectures for the ground truth and annotator performance models
  - `classifiers`: implementations of multi-annotator machine learning approaches using `pytorch_lightning` modules
  - `data`: implementations of `pytorch` data sets with class labels provided by multiple, error-prone annotators
  - `utils`: helper functions, e.g., for visualization
- `environment.yml`: file containing all package details to create a `conda` environment
As a prerequisite, we assume a Linux distribution as the operating system.

- Download a `conda` version to be installed on your machine.
- Set up the environment via
  ```bash
  projectpath$ conda env create -f environment.yml
  ```
- Activate the new environment:
  ```bash
  projectpath$ conda activate maml
  ```
- Verify that the `maml` (multi-annotator machine learning) environment was installed correctly:
  ```bash
  projectpath$ conda env list
  ```
Using the `dopanim` dataset as an example, we provide scripts to download task data from iNaturalist and to annotate this data via Label Studio.
1. Check the file `taxons.py` and adjust the taxon IDs and names according to your preferences.
2. Inspect the parameters of the script `download.py` and adjust them to your preferences. For example, you can download only one page with 20 observations per class and a request time interval of 1 s via
   ```bash
   projectpath$ conda activate maml
   projectpath$ cd data_collection/python_scripts
   projectpath/data_collection/python_scripts$ python download.py --n_pages 1 --per_page 20 --request_time_interval 1
   ```
   Keep the `request_time_interval` large enough to satisfy the requirements of the iNaturalist API (see the illustrative sketch at the end of this section).
3. Inspect the parameters of the script `preprocessing.py` and adjust them to your preferences. For example, you can define 1 validation and 1 test sample per class via
   ```bash
   projectpath$ conda activate maml
   projectpath$ cd data_collection/python_scripts
   projectpath/data_collection/python_scripts$ python preprocessing.py --no_of_test_images_per_taxon 1 --no_of_validation_images_per_taxon 1
   ```
4. Inspect the parameters of the script `annotation_tasks.py` and adjust them to your preferences. For example, you can extract the annotation tasks for the batches `0` and `1` via
   ```bash
   projectpath$ conda activate maml
   projectpath$ cd data_collection/python_scripts
   projectpath/data_collection/python_scripts$ python annotation_tasks.py --batches "[0,1]"
   ```

The obtained batches can then be uploaded to Label Studio to be manually assigned to certain annotators. Furthermore, you can upload and employ the corresponding interfaces in `label_studio_interfaces`. We refer to the documentation of Label Studio for the exact steps of setting up the annotation platform.
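For orientation only, the following Python sketch illustrates the general mechanism a script like `download.py` relies on; it is not the actual implementation. It queries the public iNaturalist API for observations of one taxon and waits between requests. The taxon ID, query filters, and collected fields are illustrative assumptions.

```python
import time

import requests

# Placeholder values; adjust the taxon ID, pages, and interval to your needs.
TAXON_ID = 12345           # hypothetical iNaturalist taxon ID
N_PAGES = 1                # corresponds to --n_pages
PER_PAGE = 20              # corresponds to --per_page
REQUEST_TIME_INTERVAL = 1  # seconds between requests, corresponds to --request_time_interval

photo_urls = []
for page in range(1, N_PAGES + 1):
    # Query the public iNaturalist observations endpoint.
    response = requests.get(
        "https://api.inaturalist.org/v1/observations",
        params={
            "taxon_id": TAXON_ID,
            "per_page": PER_PAGE,
            "page": page,
            "photos": "true",
            "quality_grade": "research",
        },
        timeout=30,
    )
    response.raise_for_status()
    for observation in response.json().get("results", []):
        for photo in observation.get("photos", []):
            photo_urls.append(photo.get("url"))
    # Wait between requests to respect the iNaturalist API usage recommendations.
    time.sleep(REQUEST_TIME_INTERVAL)

print(f"Collected {len(photo_urls)} photo URLs.")
```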
We provide scripts and Jupyter notebooks to benchmark and visualize multi-annotator machine learning approaches on datasets annotated by multiple error-prone annotators.
The Python script for executing a single experiment is `perform_experiment.py`, and the corresponding main config file is `evaluation`. In this config file, you also need to specify `mlruns_path`, which defines the path where the results are to be saved via `mlflow`. Further, you have the option to select `gpu` or `cpu` as the `accelerator`.
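The experiment commands below pass hydra overrides such as `data=dopanim` or `classifier=majority_vote`, which select config files from the config groups listed in the project structure above. As a minimal, hedged sketch of this mechanism (not the project's actual `perform_experiment.py`; the config path and name here are assumptions based on the structure above):

```python
# Minimal hydra entry-point sketch; structure and names are assumptions.
import hydra
from omegaconf import DictConfig, OmegaConf


@hydra.main(config_path="../hydra_configs", config_name="experiment", version_base=None)
def main(cfg: DictConfig) -> None:
    # Overrides such as `data=dopanim classifier=majority_vote seed=0` select entries
    # from the config groups (architecture, classifier, data, ssl_model) and set values.
    print(OmegaConf.to_yaml(cfg))
    # The resolved config would then determine the dataset, model, and training setup,
    # and results would be logged via mlflow to the configured mlruns_path.


if __name__ == "__main__":
    main()
```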
1. Before starting a single experiment or Jupyter notebook, check whether the dataset is already downloaded. For example, if you want to ensure that the dataset `dopanim` is downloaded, update the `download` flag in its config file `dopanim.yaml`.
2. An experiment can then be started by executing the following commands:
   ```bash
   projectpath$ conda activate maml
   projectpath$ cd empirical_evaluation/python_scripts
   projectpath/empirical_evaluation/python_scripts$ python perform_experiment.py data=dopanim data.class_definition.variant="full" classifier=majority_vote seed=0
   ```
3. Since there are many different experimental configurations, including ten repetitions with different seeds, you can create Bash scripts by following the instructions in `write_bash_scripts.py` and then executing the following commands:
   ```bash
   projectpath$ conda activate maml
   projectpath$ cd empirical_evaluation/python_scripts
   projectpath/empirical_evaluation/python_scripts$ python write_bash_scripts.py
   ```
4. There is a Bash script for the hyperparameter search, for each dataset variant of the benchmark, and for the use cases. For example, the benchmark experiments for the variant `full` can be executed via Slurm according to
   ```bash
   projectpath$ conda activate maml
   projectpath$ sbatch path_to_bash_scripts/dopanim_benchmark_full.sh
   ```
Once an experiment is completed, its associated results can be loaded via `mlflow` (a programmatic example is sketched below). For getting a tabular presentation of these results, you need to start the Jupyter notebook `tabular_results.ipynb` and follow its instructions.

```bash
projectpath$ conda activate maml
projectpath$ cd empirical_evaluation/jupyter_notebooks
projectpath/empirical_evaluation/jupyter_notebooks$ jupyter-notebook tabular_results.ipynb
```
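Besides the notebook, completed runs can also be inspected programmatically with `mlflow`. A minimal, hedged sketch (the tracking path is a placeholder for your configured `mlruns_path`, and the available columns depend on what was actually logged):

```python
import mlflow

# Point mlflow at the directory configured as mlruns_path in the main config file.
mlflow.set_tracking_uri("file:/path/to/mlruns")  # placeholder path

# Load all runs across all experiments into a pandas DataFrame.
runs = mlflow.search_runs(search_all_experiments=True)

# Standard columns of the returned DataFrame; logged metrics and parameters appear
# as additional `metrics.*` and `params.*` columns.
print(runs[["run_id", "status", "start_time"]].head())
```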
For reproducing the confusion matrices of the top-label predictions, reliability diagrams of the likelihoods, and histograms of annotation times, you need to start the Jupyter notebook `analyze_collected_data.ipynb` and follow its instructions.

```bash
projectpath$ conda activate maml
projectpath$ cd empirical_evaluation/jupyter_notebooks
projectpath/empirical_evaluation/jupyter_notebooks$ jupyter-notebook analyze_collected_data.ipynb
```
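The notebook contains the actual analysis of the collected data. Purely as a hedged illustration of one of its building blocks, a row-normalized confusion matrix of top-label predictions can be computed from label arrays as follows; the arrays and class names are dummies:

```python
import numpy as np
from sklearn.metrics import ConfusionMatrixDisplay, confusion_matrix

# Dummy data: true classes and annotators' top-label predictions for six tasks.
y_true = np.array([0, 0, 1, 1, 2, 2])
y_pred = np.array([0, 1, 1, 1, 2, 0])

# Row-normalized confusion matrix of the top-label predictions.
cm = confusion_matrix(y_true, y_pred, normalize="true")
ConfusionMatrixDisplay(cm, display_labels=["class_0", "class_1", "class_2"]).plot()
```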
For reproducing the t-SNE plot of the self-supervised features learned by the DINOv2 ViT-S/14, you need to start the Jupyter notebook `t_sne_features.ipynb` and follow its instructions.

```bash
projectpath$ conda activate maml
projectpath$ cd empirical_evaluation/jupyter_notebooks
projectpath/empirical_evaluation/jupyter_notebooks$ jupyter-notebook t_sne_features.ipynb
```
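Again, the notebook produces the actual plot. As a generic, hedged sketch of the technique, the following uses random dummy features instead of the DINOv2 ViT-S/14 features of `dopanim` (only the 384-dimensional size mirrors the ViT-S/14 embedding dimension):

```python
import matplotlib.pyplot as plt
import numpy as np
from sklearn.manifold import TSNE

# Dummy stand-ins for the self-supervised features and class labels.
rng = np.random.default_rng(0)
features = rng.normal(size=(500, 384))
labels = rng.integers(0, 10, size=500)

# Project the high-dimensional features onto two dimensions.
embedding = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(features)

plt.scatter(embedding[:, 0], embedding[:, 1], c=labels, s=5, cmap="tab10")
plt.title("t-SNE of (dummy) self-supervised features")
plt.show()
```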
For reproducing the use case study on annotation times in active learning, you need to start the Jupyter notebook `annotation_times_active_learning.ipynb` and follow its instructions.

```bash
projectpath$ conda activate maml
projectpath$ cd empirical_evaluation/jupyter_notebooks
projectpath/empirical_evaluation/jupyter_notebooks$ jupyter-notebook annotation_times_active_learning.ipynb
```
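The notebook implements the actual study. Purely as a hedged, self-contained toy of the general idea, the following sketch performs uncertainty sampling while tracking the cumulative annotation time; all data, times, and budgets are dummies unrelated to `dopanim`:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 10))                         # dummy features
y = ((X[:, 0] + 0.5 * rng.normal(size=300)) > 0) * 1   # dummy labels
annotation_times = rng.uniform(2.0, 10.0, size=300)    # dummy per-instance annotation times (s)

# Start with a small, class-balanced labeled set.
labeled = list(np.where(y == 0)[0][:5]) + list(np.where(y == 1)[0][:5])
unlabeled = [i for i in range(300) if i not in labeled]
spent_time = annotation_times[labeled].sum()

for _ in range(20):  # query 20 additional annotations
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    probas = clf.predict_proba(X[unlabeled])
    # Uncertainty sampling: query the instance with the least confident prediction.
    query = unlabeled[int(np.argmin(probas.max(axis=1)))]
    labeled.append(query)
    unlabeled.remove(query)
    spent_time += annotation_times[query]

print(f"Labeled {len(labeled)} instances, cumulative annotation time: {spent_time:.1f} s")
```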
If you encounter any problems, watch out for any `TODO` comments, which give hints or instructions to ensure the functionality of the code. If the problems are still not resolved, feel free to create a corresponding GitHub issue or contact us directly via e-mail: marek.herde@uni-kassel.de.