This project aims to provide a number of utilities to address multilabel classification problems related to chest X-ray images. This project is currently under construction and is subject to multiple changes.
Pull the DeepSars - XR Multilabel Classification code repository and follow the instructions below to get a copy of the project up and running on your local machine for development and testing purposes.
All the required packages are listed in requirements-pip.txt. You can install them by running the following command in a terminal:
pip install -r requirements-pip.txt
The first step is to construct the *.tfrecords files that store the datasets used for training, validating and testing the models. These files can be stored wherever you prefer, and their paths must be referenced in config.py. A tool to build these datasets easily is planned for a future version of this project.
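As an illustration of how the dataset paths could be grouped and checked before starting a run, here is a minimal sketch. The names `DatasetPaths` and `missing_records`, as well as the example file locations, are hypothetical and are not taken from config.py; adapt them to whatever variables the real file defines.

```python
from dataclasses import dataclass, fields
from pathlib import Path
from typing import List


@dataclass(frozen=True)
class DatasetPaths:
    """Hypothetical container for the three *.tfrecords files."""
    train: Path
    validation: Path
    test: Path


def missing_records(paths: DatasetPaths) -> List[str]:
    """Return the names of the splits whose file does not exist on disk."""
    return [f.name for f in fields(paths) if not getattr(paths, f.name).is_file()]


if __name__ == "__main__":
    # Placeholder locations (assumption); point these at your own files.
    paths = DatasetPaths(
        train=Path("data/train.tfrecords"),
        validation=Path("data/validation.tfrecords"),
        test=Path("data/test.tfrecords"),
    )
    print("Missing splits:", missing_records(paths))
```

Running a check like this before training makes a wrong path fail early instead of partway through an experiment.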
Once the datasets are constructed, you can run several functions that train neural networks for multilabel classification and evaluate them on a range of tasks and metrics. The general procedure to start an execution is as follows:
- Edit config.py to specify the dataset and process configuration by commenting, uncommenting or editing specific lines
- Run train.py
- The results are written into a newly created subdirectory under config.result_dir
The processes available for execution are the following:
Depending on the process being executed, a new subdirectory is created within results with a unique identifier (uid) for the experiment and a description given by the user, e.g. uid-network-dataset. The structure of each subdirectory looks something like this:
📦DeepSars_multilabel_rx
 ┗ 📂results
   ┗ 📂uid-network-dataset
     ┣ 📜log.txt
     ┣ 📜config.txt
     ┣ 📜best_auc_model.h5
     ┣ 📜best_loss_model.h5
     ┣ 📜metrics_on_training.png
     ┣ 📜evaluation_log.txt
     ┗ 📜metrics_on_evaluation.pkl
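The uid-network-dataset naming scheme could be produced along the following lines. This is only a sketch: the function name `create_result_subdir` is hypothetical, and the uid format (a zero-padded counter that continues from the largest existing prefix) is an assumption, since the actual uid format used by train.py is not documented here.

```python
from pathlib import Path


def create_result_subdir(result_dir: str, network: str, dataset: str) -> Path:
    """Create result_dir/<uid>-<network>-<dataset> and return its path.

    Assumption: the uid is a zero-padded counter, one greater than the
    largest numeric prefix already present in result_dir.
    """
    root = Path(result_dir)
    root.mkdir(parents=True, exist_ok=True)

    # Collect the numeric prefixes of the existing experiment folders.
    used = []
    for child in root.iterdir():
        prefix = child.name.split("-", 1)[0]
        if child.is_dir() and prefix.isdigit():
            used.append(int(prefix))

    uid = max(used, default=-1) + 1
    subdir = root / f"{uid:03d}-{network}-{dataset}"
    subdir.mkdir()  # raises FileExistsError rather than overwriting a run
    return subdir
```

Creating a fresh folder for every call keeps the logs, saved models and metrics of one experiment from overwriting those of another.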
- Enhance current behaviour of existing features:
  - Make every function in train.py create a README.md in the experiment subfolder with information about the training
  - Modify the model save in train_single_network to match the DeepSars naming convention
  - Refactor all the functions in train.py to work as follows:
    - A training function ALWAYS creates a new result subdirectory.
    - At the beginning of each training process, the user can choose between building the model from scratch and loading it with ImageNet or other pretrained model weights, or loading an already built model
  - Work on utils.HistoryPlotter to make it plot the metrics when there's no validation data
- Add the following functions to train.py:
  - train_single_network: trains a default tf.keras.applications network using a test and validation record
  - train_ensemble_network: trains $N$ default tf.keras.applications networks using either a standard Average layer or a custom WeightedAverage layer
- Add the following functions to util_scripts.py:
  - evaluate_single_network: evaluates a single model using a test set
  - evaluate_late_fusion_ensemble: takes two trained models or a late fusion ensemble to evaluate using a test set
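To make the difference between the two ensemble options concrete, the sketch below shows the arithmetic that a plain average and a weighted average perform on the per-label probabilities of $N$ models. It is framework-free Python, not a Keras layer, and the helper names `average` and `weighted_average` are hypothetical. How the planned WeightedAverage layer obtains its weights (for example, whether they are learned) is not specified here; this sketch simply assumes fixed, non-negative weights that are normalised by their sum.

```python
from typing import List, Sequence


def average(predictions: Sequence[Sequence[float]]) -> List[float]:
    """Element-wise mean of N per-model probability vectors."""
    n = len(predictions)
    return [sum(column) / n for column in zip(*predictions)]


def weighted_average(
    predictions: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Element-wise weighted mean; weights are normalised by their sum."""
    if len(predictions) != len(weights):
        raise ValueError("one weight per model is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*predictions)
    ]


if __name__ == "__main__":
    # Two models, two labels; each inner list holds per-label probabilities.
    model_outputs = [[0.2, 0.8], [0.6, 0.4]]
    print(average(model_outputs))
    print(weighted_average(model_outputs, weights=[3.0, 1.0]))
```

With equal weights the two functions agree, so the weighted variant only matters when some models in the ensemble should count more than others.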