
YOLO BUSI (Breast UltraSound Images) Dataset

Primary language: Python. License: Apache License 2.0 (Apache-2.0).

YOLO-Breast-UltraSound-Images (Updated: 2023/04/14)

This is a simple tool that creates a YOLO BUSI (Breast UltraSound Images) dataset from the BUSI dataset with mask images (segmentations).

The original BUSI dataset used here was taken from the following website:

Dataset-BUSI-with-GT

https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset

About Dataset (Taken from the above web site)

Breast cancer is one of the most common causes of death among women worldwide. Early detection helps in reducing the number of early deaths. The data reviews the medical images of breast cancer using ultrasound scan. Breast Ultrasound Dataset is categorized into three classes: normal, benign, and malignant images. Breast ultrasound images can produce great results in classification, detection, and segmentation of breast cancer when combined with machine learning.

Citation:
Al-Dhabyani W, Gomaa M, Khaled H, Fahmy A. 
Dataset of breast ultrasound images. Data in Brief. 
2020 Feb;28:104863. 
DOI: 10.1016/j.dib.2019.104863.

1 Clone repository

Please clone this repository to your local PC.
>git clone https://github.com/sarah-antillia/YOLO-Breast-UltraSound-Images.git

2 Dataset_BUSI_with_GT

Please download the image dataset from the following URL and extract it under your local repository YOLO-Breast-UltraSound-Images:
https://www.kaggle.com/datasets/aryashah2k/breast-ultrasound-images-dataset
YOLO-BREAST-ULTRASOUND-IMAGES
└─Dataset_BUSI_with_GT
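
The later steps rely on matching each ultrasound image with its mask image(s). The sketch below shows one way to do that pairing from a list of file names. It assumes the Kaggle archive names images like "benign (1).png" and masks like "benign (1)_mask.png" (with "_mask_1", "_mask_2" suffixes when an image has more than one mask). That naming and the helper name pair_images_with_masks are assumptions for illustration and are not taken from the scripts in this repository.

```python
import re

# Assumed mask naming: "<stem>_mask.png" or "<stem>_mask_<n>.png".
MASK_RE = re.compile(r"^(?P<stem>.+)_mask(_\d+)?\.png$")


def pair_images_with_masks(filenames):
    """Map each image file name to the list of its mask file names."""
    # Every non-mask file is treated as an image, initially with no masks.
    pairs = {name: [] for name in filenames if not MASK_RE.match(name)}
    for name in filenames:
        match = MASK_RE.match(name)
        if match:
            image = match.group("stem") + ".png"
            pairs.setdefault(image, []).append(name)
    return pairs


if __name__ == "__main__":
    names = ["benign (1).png", "benign (1)_mask.png"]
    print(pair_images_with_masks(names))
```

In practice the file names would come from something like os.listdir on one of the class folders under Dataset_BUSI_with_GT.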

3 Create Master Dataset

Please run the following command to create the master dataset.

>python create_augmented_master_512x512.py

The create_augmented_master_512x512.py script creates the BUSI_augmented_master_512x512 folder, which contains the test, train, and valid datasets generated from Dataset_BUSI_with_GT.

./BUSI_augmented_master_512x512/
├─test/
│  ├─benign/
│  └─malignant/
├─train/
│  ├─benign/
│  └─malignant/
└─valid/
    ├─benign/
    └─malignant/

1 Split the original Dataset_BUSI_with_GT dataset into three subsets, train, test, and valid, with the following ratios:

 train: 0.5
 test:  0.3
 valid: 0.2

2 Resize each image to 512x512

3 Augment each image in the train dataset by rotating it by the angles listed in ANGLES below, and save each rotated image as a jpg file.

 
ANGLES = [0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330]

4 Flip each image in the train dataset horizontally and vertically, and save each flipped image as a jpg file.

5 Save each image in the test dataset as a jpg file without any augmentation.

6 Save each image in the valid dataset as a jpg file without any augmentation.
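
The split and the per-subset augmentation described above can be sketched as follows. This is a minimal illustration using only the standard library, not the code in create_augmented_master_512x512.py. The function names, the fixed shuffle seed, the choice to give any rounding remainder to valid, and the reading that each flip direction yields one output file are all assumptions.

```python
import random

RATIOS = {"train": 0.5, "test": 0.3, "valid": 0.2}
ANGLES = [0, 30, 60, 90, 120, 150, 180, 210, 240, 270, 300, 330]


def split_dataset(files, ratios=RATIOS, seed=137):
    """Shuffle the file list and cut it into train/test/valid subsets."""
    files = sorted(files)
    random.Random(seed).shuffle(files)  # the seed value is an arbitrary assumption
    n_train = int(len(files) * ratios["train"])
    n_test = int(len(files) * ratios["test"])
    return {
        "train": files[:n_train],
        "test": files[n_train:n_train + n_test],
        "valid": files[n_train + n_test:],  # remainder goes to valid
    }


def augmentation_plan(subset):
    """List the operations applied to one image of the given subset."""
    if subset != "train":
        return [("copy", None)]  # test and valid images are saved unchanged
    plan = [("rotate", angle) for angle in ANGLES]
    plan += [("flip", "horizontal"), ("flip", "vertical")]
    return plan


if __name__ == "__main__":
    subsets = split_dataset([f"benign ({i}).png" for i in range(1, 101)])
    for name, items in subsets.items():
        print(name, len(items), "images x", len(augmentation_plan(name)), "outputs each")
```

Under these assumptions each train image produces 14 files (12 rotations plus 2 flips), while each test or valid image produces one.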

4 Create YOLO Annotation

Please run the following command to create the YOLO annotations from BUSI_augmented_master_512x512.
>python create_yolo_annotation_from_augmented_master.py
The create_yolo_annotation_from_augmented_master.py script creates the YOLO folder, which contains the test, train, and valid YOLO annotations generated from the BUSI_augmented_master_512x512 dataset.
The YOLO annotations for these subsets are created by finding the bounding box (rectangular region) of each mask image in the train, test, and valid datasets.
./YOLO/
├─test/
├─train/
└─valid/
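
To illustrate how a bounding box can be derived from a mask and written in YOLO format, here is a minimal sketch. It is not the code in create_yolo_annotation_from_augmented_master.py. It assumes the mask is given as a 2D list in which nonzero pixels belong to the lesion, that one box per mask is wanted, and that class ids are 0 for benign and 1 for malignant; the function name and the class id mapping are hypothetical. The output uses the standard YOLO text format: class id followed by the box center x, center y, width, and height, each normalized by the image size.

```python
def mask_to_yolo_line(mask, class_id):
    """Return a YOLO annotation line for the nonzero region of a 2D mask.

    Returns None when the mask contains no nonzero pixel.
    """
    height = len(mask)
    width = len(mask[0])
    xs = [x for row in mask for x, value in enumerate(row) if value]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    # Pixel-edge box: max index + 1 so a single pixel has width and height 1.
    x_min, x_max = min(xs), max(xs) + 1
    y_min, y_max = min(ys), max(ys) + 1
    cx = (x_min + x_max) / 2 / width
    cy = (y_min + y_max) / 2 / height
    w = (x_max - x_min) / width
    h = (y_max - y_min) / height
    return f"{class_id} {cx:.6f} {cy:.6f} {w:.6f} {h:.6f}"


if __name__ == "__main__":
    # 8x8 toy mask with a lesion covering rows 2-3 and columns 4-7.
    mask = [[0] * 8 for _ in range(8)]
    for y in (2, 3):
        for x in range(4, 8):
            mask[y][x] = 255
    print(mask_to_yolo_line(mask, class_id=1))
```

For real 512x512 mask images, the same arithmetic applies after loading the mask into an array with an image library.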

Sample YOLO annotation in train:


5 Download YOLO Dataset

You can download this YOLO Dataset (YOLO-BUSI-DATASET-20230414.zip) from here.

6 Download TFRecord Dataset

You can convert this YOLO dataset to TFRecord by using AnnotationConverter.
You can also download the TFRecord Dataset (TFRecord-BUSI-20230414.zip) from here.

7 Download COCO Dataset

You can convert this YOLO dataset to COCO by using AnnotationConverter.
You can also download the COCO Dataset (COCO-BUSI-20230414.zip) from here.
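
For reference, the core of a YOLO to COCO conversion is a change of box representation: YOLO stores a normalized center and size, while COCO stores [x_min, y_min, width, height] in pixels. The sketch below shows only that arithmetic. It is not AnnotationConverter's implementation, and the function name is hypothetical.

```python
def yolo_to_coco_bbox(cx, cy, w, h, image_width, image_height):
    """Convert a normalized YOLO box to a COCO [x_min, y_min, width, height] box in pixels."""
    box_width = w * image_width
    box_height = h * image_height
    x_min = cx * image_width - box_width / 2
    y_min = cy * image_height - box_height / 2
    return [x_min, y_min, box_width, box_height]


if __name__ == "__main__":
    # A box centered at (0.75, 0.375) covering half the width and a quarter of the height.
    print(yolo_to_coco_bbox(0.75, 0.375, 0.5, 0.25, 512, 512))
```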