This repository contains the scripts used in the performance evaluation of 16 artificial intelligence models on the WCEbleedGen Dataset. Ten classification-based, three segmentation-based, and three detection-based pipelines were trained, validated, and tested. The models used are:
- Classification:
  - VGG19
  - Xception
  - ResNet101V2
  - InceptionV3
  - InceptionResNetV2
  - MobileNetV2
  - DenseNet169
  - NASNetMobile
  - EfficientNetB7
  - ConvNeXt
- Segmentation:
  - UNet
  - SegNet
  - LinkNet
- Detection:
  - YOLOv5nu
  - YOLOv8n
  - YOLOv8x
The dataset structure was as follows:

- datasets/
  - WCEBleedGen/
    - bleeding/
      - images/
      - bounding_boxes/
        - YOLO-TXT/
      - annotations/
    - non-bleeding/
      - images/
      - annotations/
- data_loader_classify.py
This script is designed to load and preprocess image data for binary classification tasks. It reads images from a directory, processes them, and splits them into training, validation, and test sets. Additionally, it supports data augmentation and saves the preprocessed data for future use.
- `data_dir`: (Required) Path to the dataset directory.
- `--test_size`: (Optional) Test set size ratio (default is 0.30).
- `--val_size`: (Optional) Validation set size ratio taken from the test set (default is 0.33).
- `--augment`: (Optional) Apply data augmentation (default is False).
- `--output_dir`: (Optional) Output directory to save the preprocessed data (default is `data`).
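Because `--val_size` is applied to the held-out test split rather than to the whole dataset, the two ratios compose. A quick arithmetic sketch with made-up counts (the exact rounding used by `data_loader_classify.py` is an assumption):

```python
# Hypothetical illustration of how --test_size and --val_size compose.
n = 1000                              # made-up total image count
test_size, val_size = 0.30, 0.33      # the script's defaults

n_holdout = round(n * test_size)      # 300 images held out from training
n_val = round(n_holdout * val_size)   # 99 of those become validation
n_test = n_holdout - n_val            # 201 remain for testing
n_train = n - n_holdout               # 700 used for training

print(n_train, n_val, n_test)         # 700 99 201
```

So the defaults yield roughly a 70/10/20 train/validation/test split.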
- **Basic Usage**

  To preprocess data from a directory `dataset` without data augmentation and save the results to the default output directory:

  `python data_loader_classify.py dataset`

- **With Data Augmentation**

  To apply data augmentation during preprocessing:

  `python data_loader_classify.py dataset --augment`

- **Custom Test and Validation Sizes**

  To set custom test and validation set size ratios:

  `python data_loader_classify.py dataset --test_size 0.25 --val_size 0.25`

- **Custom Output Directory**

  To specify a custom output directory:

  `python data_loader_classify.py dataset --output_dir custom_data`
Suppose you have a dataset stored in `my_dataset`, and you want to split the data with a test size of 20% and a validation size of 25% of the test set. You also want to apply data augmentation and save the preprocessed data in the `processed_data` directory:

`python data_loader_classify.py my_dataset --test_size 0.20 --val_size 0.25 --augment --output_dir processed_data`
The script saves the preprocessed data in compressed NumPy format and the data augmentation configuration in a pickle file within the specified output directory. The files generated are:

- `train_data.npz`: Contains training images and labels.
- `val_data.npz`: Contains validation images and labels.
- `test_data.npz`: Contains test images and labels.
- `datagen.pkl`: Contains the data augmentation configuration.
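A minimal sketch of reading a saved split back with NumPy. The file name comes from the list above; the array keys (`images`, `labels`) are an assumption and may differ in the actual script:

```python
import numpy as np

# Stand-in for the script's output: a tiny fake split saved the same way
# (np.savez_compressed). The "images"/"labels" keys are assumed names.
np.savez_compressed("train_data.npz",
                    images=np.zeros((4, 224, 224, 3), dtype=np.uint8),
                    labels=np.array([0, 1, 0, 1]))

data = np.load("train_data.npz")
X_train, y_train = data["images"], data["labels"]
print(X_train.shape, y_train.shape)   # (4, 224, 224, 3) (4,)
```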
- model_classify.py
This script is designed to create and define deep learning models using various pre-trained architectures available in Keras and TensorFlow. It allows for easy customization of the base model and optimizer, enabling quick experimentation with different configurations.
- `base_model_name`: (Required) Name of the pre-trained model to use (e.g., `VGG19`, `Xception`, `ResNet50V2`, `InceptionV3`, `InceptionResNetV2`, `MobileNetV2`, `DenseNet169`, `NASNetMobile`, `EfficientNetB7`, `ConvNeXtBase`).
- `optimizer_name`: (Required) Name of the optimizer to use (e.g., `Adam`, `SGD`, `RMSprop`).
- `--learning_rate`: (Optional) Learning rate for the optimizer (default is 0.0001).
- `--input_shape`: (Optional) Input shape of the images (default is (224, 224, 3)).
- **Create a Model with VGG19 and Adam Optimizer**

  To create a model using the VGG19 base model and Adam optimizer with the default learning rate:

  `python model_classify.py VGG19 Adam`

- **Custom Learning Rate**

  To use a custom learning rate of 0.001:

  `python model_classify.py VGG19 Adam --learning_rate 0.001`

- **Different Base Model and Optimizer**

  To create a model using the InceptionV3 base model and SGD optimizer:

  `python model_classify.py InceptionV3 SGD`

- **Custom Input Shape**

  To specify a different input shape for the images:

  `python model_classify.py ResNet50V2 RMSprop --input_shape 299 299 3`

Suppose you want to create a model using the EfficientNetB7 base model and the RMSprop optimizer, with a learning rate of 0.0005 and an input shape of 256x256x3:

`python model_classify.py EfficientNetB7 RMSprop --learning_rate 0.0005 --input_shape 256 256 3`
- train_classify.py
This script is designed to train deep learning models using various pre-trained architectures available in Keras and TensorFlow. It integrates data loading, model creation, and training functionalities, providing an end-to-end solution for model training.
- `--data_dir`: (Required) Directory containing the preprocessed data.
- `--base_model`: (Required) Base model to use for training (choices are `VGG19`, `Xception`, `ResNet50V2`, `InceptionV3`, `InceptionResNetV2`, `MobileNetV2`, `DenseNet169`, `NASNetMobile`, `EfficientNetB7`, `ConvNeXtBase`).
- `--optimizer`: (Optional) Optimizer to use (default is `Adam`; choices are `Adam`, `SGD`, `RMSprop`).
- `--learning_rate`: (Optional) Learning rate for the optimizer (default is 0.0001).
- `--loss`: (Optional) Loss function to use (default is `categorical_crossentropy`).
- `--metrics`: (Optional) Metrics for evaluation (default is `['accuracy']`).
- `--batch_size`: (Optional) Batch size for training (default is 32).
- `--epochs`: (Optional) Number of epochs to train (default is 10).
- `--model_path`: (Optional) Path to save the trained model (default is `model.h5`).
- **Basic Usage**

  To train a model using the VGG19 base model with default settings and data from the `dataset` directory:

  `python train_model.py --data_dir dataset --base_model VGG19`

- **Custom Optimizer and Learning Rate**

  To use the SGD optimizer with a custom learning rate of 0.001:

  `python train_model.py --data_dir dataset --base_model VGG19 --optimizer SGD --learning_rate 0.001`

- **Different Base Model and Loss Function**

  To train using the InceptionV3 base model with binary cross-entropy loss:

  `python train_model.py --data_dir dataset --base_model InceptionV3 --loss binary_crossentropy`

- **Custom Batch Size and Epochs**

  To specify a batch size of 64 and train for 20 epochs:

  `python train_model.py --data_dir dataset --base_model ResNet50V2 --batch_size 64 --epochs 20`

- **Custom Model Save Path**

  To save the trained model to a custom path:

  `python train_model.py --data_dir dataset --base_model EfficientNetB7 --model_path my_model.h5`

Suppose you want to train a model using the EfficientNetB7 base model and RMSprop optimizer, with a learning rate of 0.0005 and a batch size of 64, for 15 epochs, saving the model to `trained_model.h5`:

`python train_model.py --data_dir dataset --base_model EfficientNetB7 --optimizer RMSprop --learning_rate 0.0005 --batch_size 64 --epochs 15 --model_path trained_model.h5`
The model weights are saved at the specified path.
- validate_classify.py
- test_classify.py
This guide explains how to use the provided Python scripts to validate and test trained deep learning models. The scripts use various pre-trained architectures available in Keras and TensorFlow. They include functionalities for data loading, model creation, and evaluation.
This script validates a trained model using the validation dataset.
- `--data_dir`: (Required) Directory containing the preprocessed data.
- `--base_model`: (Required) Base model to use for validation (choices: `VGG19`, `Xception`, `ResNet50V2`, `InceptionV3`, `InceptionResNetV2`, `MobileNetV2`, `DenseNet169`, `NASNetMobile`, `EfficientNetB7`, `ConvNeXtBase`).
- `--model_weights`: (Required) Path to the model weights file (.h5).
- `--augment`: (Optional) Apply data augmentation if specified.
- `--optimizer`: (Optional) Optimizer to use (default: `Adam`; choices: `Adam`, `SGD`, `RMSprop`).
- `--learning_rate`: (Optional) Learning rate for the optimizer (default: 0.0001).
- `--loss`: (Optional) Loss function to use (default: `categorical_crossentropy`).
- `--metrics`: (Optional) Metrics for evaluation (default: `['accuracy']`).
To validate a model using the VGG19 base model with default settings and data from the `dataset` directory:

`python validate_classify.py --data_dir dataset --base_model VGG19 --model_weights model.h5`

To use the SGD optimizer with a custom learning rate of 0.001:

`python validate_classify.py --data_dir dataset --base_model VGG19 --model_weights model.h5 --optimizer SGD --learning_rate 0.001`
This script tests a trained model using the test dataset.
- `--data_dir`: (Required) Directory containing the preprocessed data.
- `--base_model`: (Required) Base model to use for testing (choices: `VGG19`, `Xception`, `ResNet50V2`, `InceptionV3`, `InceptionResNetV2`, `MobileNetV2`, `DenseNet169`, `NASNetMobile`, `EfficientNetB7`, `ConvNeXtBase`).
- `--model_weights`: (Required) Path to the model weights file (.h5).
- `--augment`: (Optional) Apply data augmentation if specified.
- `--optimizer`: (Optional) Optimizer to use (default: `Adam`; choices: `Adam`, `SGD`, `RMSprop`).
- `--learning_rate`: (Optional) Learning rate for the optimizer (default: 0.0001).
- `--loss`: (Optional) Loss function to use (default: `categorical_crossentropy`).
- `--metrics`: (Optional) Metrics for evaluation (default: `['accuracy']`).
To test a model using the InceptionV3 base model with default settings and data from the `dataset` directory:

`python test_classify.py --data_dir dataset --base_model InceptionV3 --model_weights model.h5`

To test using the ResNet50V2 base model with binary cross-entropy loss:

`python test_classify.py --data_dir dataset --base_model ResNet50V2 --model_weights model.h5 --loss binary_crossentropy`
- inference_classify.py
This guide explains how to use the `inference_classify.py` script for running inference with a trained deep learning model on test images. The script uses various pre-trained architectures available in Keras and TensorFlow for image classification.
- `--test_dir`: (Required) Directory containing the test images.
- `--base_model`: (Required) Base model to use for inference (choices: `VGG19`, `Xception`, `ResNet50V2`, `InceptionV3`, `InceptionResNetV2`, `MobileNetV2`, `DenseNet169`, `NASNetMobile`, `EfficientNetB7`, `ConvNeXtBase`).
- `--model_weights`: (Required) Path to the model weights file (.h5).
- `--optimizer`: (Optional) Optimizer to use (default: `Adam`; choices: `Adam`, `SGD`, `RMSprop`).
- `--learning_rate`: (Optional) Learning rate for the optimizer (default: 0.0001).
- `--loss`: (Optional) Loss function to use (default: `categorical_crossentropy`).
- `--metrics`: (Optional) Metrics for evaluation (default: `['accuracy']`).
To perform inference using the VGG19 base model with default settings on images from the `test_images` directory:

`python inference_classify.py --test_dir test_images --base_model VGG19 --model_weights model.h5`

To use the SGD optimizer with a custom learning rate of 0.001:

`python inference_classify.py --test_dir test_images --base_model VGG19 --model_weights model.h5 --optimizer SGD --learning_rate 0.001`
This guide explains how to use the `data_loader_segment.py` script for loading and preparing image segmentation data using TensorFlow and OpenCV. The script loads image data for segmentation tasks and prepares it as TensorFlow datasets.
- `--path`: (Required) Path to the dataset directory containing subdirectories for each category (`bleeding`, `non-bleeding`) with `Images` and `Annotations` folders.
- `--validation_size`: (Optional) Proportion of validation data (default: 0.2).
- `--test_size`: (Optional) Proportion of test data (default: 0.1).
- `--batch_size`: (Optional) Batch size for training (default: 32).
To load data from the `dataset` directory with default settings:

`python data_loader_segment.py --path dataset`

To customize the validation size and batch size:

`python data_loader_segment.py --path dataset --validation_size 0.15 --batch_size 16`
- model_segment.py
This script allows the creation and compilation of three different segmentation models: UNet, SegNet, and LinkNet. The script utilizes TensorFlow/Keras for building the models. It also provides an option to specify various hyperparameters such as input size, filters, and learning rate.
Run the script from the command line by specifying the model type, input size, filters, and learning rate. Below are examples of how to use the script.
- **UNet Model**

  `python model_segment.py --model unet --input_size 224 --filters 64 128 256 512 --learning_rate 0.001`

- **SegNet Model**

  `python model_segment.py --model segnet --input_size 224 --learning_rate 0.001`

- **LinkNet Model**

  `python model_segment.py --model linknet --input_size 224 --learning_rate 0.001`
- `--model`: (Required) Specifies the type of model to create. Choices are `unet`, `segnet`, `linknet`.
- `--input_size`: (Optional) Specifies the size of the input image. Default is 224.
- `--filters`: (Optional) Specifies the number of filters for each convolutional layer in the UNet model. Default is [64, 128, 256, 512]. Only applicable to the UNet model.
- `--learning_rate`: (Optional) Specifies the learning rate for the optimizer. Default is 0.001.
- metrics_segment.py
- train_segment.py
These scripts are used to train and evaluate the segmentation models.
This script defines custom metrics and losses used during model training and evaluation.
- IoU (Intersection over Union): Measures the overlap between the predicted and true masks.
- Dice Coefficient: Measures the similarity between the predicted and true masks.
- Dice Coefficient Loss: Defined as 1 - Dice Coefficient.

The custom metrics and losses are imported and used in the train_segment.py script for model training and evaluation:

`from metrics_segment import iou, dice_coefficient`
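For intuition, here is a NumPy sketch of the two metrics on binary masks. The actual `metrics_segment.py` implementations operate on TensorFlow tensors and may use a smoothing constant different from the `eps` assumed here:

```python
import numpy as np

def iou(y_true, y_pred, eps=1e-7):
    """Intersection over Union for binary masks (eps avoids 0/0)."""
    inter = np.logical_and(y_true, y_pred).sum()
    union = np.logical_or(y_true, y_pred).sum()
    return (inter + eps) / (union + eps)

def dice_coefficient(y_true, y_pred, eps=1e-7):
    """Dice = 2 * |A and B| / (|A| + |B|); the loss is 1 - dice."""
    inter = np.logical_and(y_true, y_pred).sum()
    return (2.0 * inter + eps) / (y_true.sum() + y_pred.sum() + eps)

true = np.array([[1, 1, 0], [0, 1, 0]])
pred = np.array([[1, 0, 0], [0, 1, 1]])
print(round(iou(true, pred), 3))               # 0.5   (2 / 4)
print(round(dice_coefficient(true, pred), 3))  # 0.667 (4 / 6)
```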
This script trains and evaluates the segmentation models defined in model_segment.py. Run the script from the command line by specifying the model type, data path, input size, filters, learning rate, number of epochs, batch size, validation size, and test size.
- **Train UNet Model**

  `python train_segment.py --model unet --data_path /path/to/dataset --input_size 224 --filters 64 128 256 512 --learning_rate 0.0001 --epochs 250 --batch_size 32 --validation_size 0.2 --test_size 0.1`

- **Train SegNet Model**

  `python train_segment.py --model segnet --data_path /path/to/dataset --input_size 224 --learning_rate 0.0001 --epochs 250 --batch_size 32 --validation_size 0.2 --test_size 0.1`

- **Train LinkNet Model**

  `python train_segment.py --model linknet --data_path /path/to/dataset --input_size 224 --learning_rate 0.0001 --epochs 250 --batch_size 32 --validation_size 0.2 --test_size 0.1`
- `--model`: (Required) Specifies the type of model to create. Choices are `unet`, `segnet`, `linknet`.
- `--data_path`: (Required) Path to the dataset.
- `--input_size`: (Optional) Specifies the size of the input image. Default is 224.
- `--filters`: (Optional) Specifies the number of filters for each convolutional layer in the UNet model. Default is [64, 128, 256, 512]. Only applicable to the UNet model.
- `--learning_rate`: (Optional) Specifies the learning rate for the optimizer. Default is 0.0001.
- `--epochs`: (Optional) Number of epochs to train the model. Default is 250.
- `--batch_size`: (Optional) Batch size for training. Default is 32.
- `--validation_size`: (Optional) Fraction of the dataset to use for validation. Default is 0.2.
- `--test_size`: (Optional) Fraction of the dataset to use for testing. Default is 0.1.
The script loads the dataset, creates TensorFlow datasets for training, validation, and testing, builds the specified model, and trains the model. It also evaluates the model on the validation and test datasets and prints the metrics.
The script outputs the summary of the created model architecture and compiles the model using the specified learning rate. During training, it saves the training history in a CSV file and saves the model weights. It also prints the validation and test metrics.
- segment_inference.py
This script is designed for performing inference on images using pre-trained segmentation models. It supports three model architectures: UNet, SegNet, and LinkNet. The script reads an input image, applies the segmentation model, and displays the original image along with the segmentation overlay.
Run the script from the command line by specifying the input image path, model architecture, model weights path, input shape, and filters (for UNet only).
- **UNet Model**

  `python segment_inference.py --image_path /path/to/image.jpg --model_name unet --weights_path /path/to/weights.h5 --input_shape 224 224 3 --filters 64 128 256 512`

- **SegNet Model**

  `python segment_inference.py --image_path /path/to/image.jpg --model_name segnet --weights_path /path/to/weights.h5 --input_shape 224 224 3`

- **LinkNet Model**

  `python segment_inference.py --image_path /path/to/image.jpg --model_name linknet --weights_path /path/to/weights.h5 --input_shape 224 224 3`
- `--image_path`: (Required) Path to the input image.
- `--model_name`: (Required) Specifies the model architecture to use. Choices are `unet`, `segnet`, `linknet`.
- `--weights_path`: (Required) Path to the model weights file.
- `--input_shape`: (Optional) Input shape for the model. Default is [224, 224, 3].
- `--filters`: (Optional) Filters for the UNet model. Default is [64, 128, 256, 512]. Only applicable to the UNet model.
When the script is run, it displays a window with two images: the original input image and the input image with the segmentation overlay.
| Original Image | Segmentation Overlay |
| --- | --- |
| Shows the input image | Shows the input image with green segmentation overlay |
This script is useful for visualizing the results of a segmentation model on new images. Modify the script as needed for further customization and experimentation.
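The overlay compositing itself can be sketched in a few lines of NumPy. The blending factor and the green color below are illustrative choices, not values taken from the script:

```python
import numpy as np

def overlay_mask(image, mask, alpha=0.5):
    """Blend a green overlay onto `image` wherever `mask` is nonzero."""
    out = image.astype(np.float32)
    green = np.array([0.0, 255.0, 0.0], dtype=np.float32)
    out[mask > 0] = (1.0 - alpha) * out[mask > 0] + alpha * green
    return out.astype(np.uint8)

img = np.full((2, 2, 3), 100, dtype=np.uint8)   # tiny gray test image
msk = np.array([[1, 0], [0, 0]])                # mask covers one pixel
print(overlay_mask(img, msk)[0, 0])             # [ 50 177  50]
```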
The `Binary_mask_to_bounding_box.py` script is designed to process binary mask images and generate bounding boxes around the objects detected within these masks. It supports saving the bounding box information in three different formats:
- TXT format - Contains the bounding box coordinates (xmin, ymin, xmax, ymax).
- XML format (Pascal VOC) - Contains bounding box details in the Pascal VOC annotation format, which is commonly used in object detection datasets.
- YOLO TXT format - Contains bounding box details formatted for use with the YOLO (You Only Look Once) object detection framework.
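The core operation, deriving a box from a binary mask, can be sketched with NumPy as follows. This is a hedged illustration of the idea, not the script's actual code (which may, for example, use OpenCV contours or handle multiple bleeding regions per mask):

```python
import numpy as np

def mask_to_bbox(mask):
    """Return (xmin, ymin, xmax, ymax) enclosing all nonzero mask pixels."""
    ys, xs = np.nonzero(mask)
    if xs.size == 0:
        return None  # empty mask: no object, no box
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())

mask = np.zeros((10, 10), dtype=np.uint8)
mask[2:5, 3:8] = 1                 # object spans rows 2-4, cols 3-7
print(mask_to_bbox(mask))          # (3, 2, 7, 4)
```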
To run the script, use the following command:

`python Binary_mask_to_bounding_box.py <path_to_binary_masks_folder> <save_root_folder>`
- `<path_to_binary_masks_folder>`: The path to the folder containing your binary mask images (`.png` files).
- `<save_root_folder>`: The root folder where the output bounding box files will be saved. The script will automatically create three subfolders (`TXT`, `XML`, `YOLO_TXT`) to store the results in the corresponding formats.
Assume you have a folder `masks/` containing binary mask images, and you want to save the bounding box files in a folder named `bounding_boxes_output/`. You would run:

`python Binary_mask_to_bounding_box.py masks/ bounding_boxes_output/`
After running this command, the script will generate the following structure within the `bounding_boxes_output/` folder:

```
bounding_boxes_output/
├── TXT/
│   ├── mask1.txt
│   ├── mask2.txt
│   └── ...
├── XML/
│   ├── mask1.xml
│   ├── mask2.xml
│   └── ...
└── YOLO_TXT/
    ├── mask1.txt
    ├── mask2.txt
    └── ...
```
Each subfolder will contain the bounding box files in the respective format, corresponding to each mask image processed.
- **TXT Format:**

  Each `.txt` file contains a single line with the bounding box coordinates:

  `xmin ymin xmax ymax`

- **XML Format (Pascal VOC):**

  Each `.xml` file follows the Pascal VOC structure and includes information about the image and bounding box:

  ```xml
  <annotation>
      <folder>XML</folder>
      <filename>mask1.png</filename>
      <path>masks/mask1.png</path>
      <source>
          <database>Unknown</database>
      </source>
      <size>
          <width>width_value</width>
          <height>height_value</height>
          <depth>3</depth>
      </size>
      <segmented>0</segmented>
      <object>
          <name>object</name>
          <pose>Unspecified</pose>
          <truncated>0</truncated>
          <difficult>0</difficult>
          <bndbox>
              <xmin>xmin_value</xmin>
              <ymin>ymin_value</ymin>
              <xmax>xmax_value</xmax>
              <ymax>ymax_value</ymax>
          </bndbox>
      </object>
  </annotation>
  ```

- **YOLO TXT Format:**

  Each `.txt` file contains a single line with normalized bounding box information:

  `class_id x_center y_center width height`

  - `class_id`: By default, this is set to `0`.
  - `x_center`, `y_center`: Normalized coordinates of the bounding box center.
  - `width`, `height`: Normalized width and height of the bounding box.
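The corner-to-YOLO conversion described above amounts to the following arithmetic (a sketch; the script's own function name and signature may differ):

```python
def to_yolo(xmin, ymin, xmax, ymax, img_w, img_h, class_id=0):
    """Convert corner coordinates to the normalized YOLO TXT fields."""
    x_center = (xmin + xmax) / 2 / img_w
    y_center = (ymin + ymax) / 2 / img_h
    width = (xmax - xmin) / img_w
    height = (ymax - ymin) / img_h
    return class_id, x_center, y_center, width, height

# A 100x200-pixel box centered at (60, 120) in a 200x400 image:
print(to_yolo(10, 20, 110, 220, img_w=200, img_h=400))
# (0, 0.3, 0.3, 0.5, 0.5)
```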
For detection, the models were used directly from the Ultralytics repositories.
All the models were trained for a total of 250 epochs, without any preprocessing or modification. The code was run on a 40 GB NVIDIA DGX A100 GPU workstation available at the Department of Electronics and Communication Engineering, Indira Gandhi Delhi Technical University for Women, New Delhi, India.
The results and findings will soon be released in the form of a research paper; the preprint has been released and can be accessed at link.
Palak Handa conceptualized the research idea, performed the data collection, mask analysis, and literature review, and wrote the research paper. Manas Dhir contributed to developing the benchmarking pipeline, developing the GitHub repository, and writing the initial draft of the research paper. Dr. Deepak Gunjan from the Department of Gastroenterology and HNU, AIIMS Delhi performed the medical annotations and offered suggestions for improving the artificial intelligence algorithms. Dr. Nidhi Goel was involved in the literature review and administration. Jyoti Dhatarwal contributed to the initial data collection. Harshita Mangotra contributed to the development of the bounding boxes. Divyansh Nautiyal contributed to correcting the multiple bleeding regions and re-making the bounding boxes, and Nishu, Sanya, Shriya, and Sneha Singh contributed to the result replications on the GPU workstation and the table entries. The WCEbleedGen Dataset has been actively downloaded more than 1000 times and was utilized as the training dataset in the Auto-WCEBleedGen Version 1 and 2 challenges. The challenge page is available here.