The MHSMA dataset is a collection of human sperm images from 235 patients with male factor infertility. Each image is labeled by experts for normal or abnormal sperm acrosome, head, vacuole, and tail.
The training, validation, and test sets contain 1000, 240, and 300 images, respectively.
Images are available in two crop sizes: 128x128 and 64x64 pixels. The following figure shows two versions of the same instance.
[Figure: the same instance in the 128x128-pixel version (left) and the 64x64-pixel version (right)]
In MHSMA, each instance is a grayscale image capturing a single sperm. The head of the sperm is roughly located at the center of the image. Also, the sperm tail is not entirely visible in the images.
Labels can be either `0` (normal, positive) or `1` (abnormal, negative).
The dataset is available in `.npy` format. You can load the `.npy` files using `numpy.load`.
The details of the files are described in the table below.
File | Shape | Type | Description |
---|---|---|---|
x_128_train.npy | (1000, 128, 128) | uint8 | Training set, 128x128-pixel version |
x_128_valid.npy | (240, 128, 128) | uint8 | Validation set, 128x128-pixel version |
x_128_test.npy | (300, 128, 128) | uint8 | Test set, 128x128-pixel version |
x_64_train.npy | (1000, 64, 64) | uint8 | Training set, 64x64-pixel version |
x_64_valid.npy | (240, 64, 64) | uint8 | Validation set, 64x64-pixel version |
x_64_test.npy | (300, 64, 64) | uint8 | Test set, 64x64-pixel version |
y_acrosome_train.npy | (1000,) | uint8 | Training set labels for acrosome |
y_acrosome_valid.npy | (240,) | uint8 | Validation set labels for acrosome |
y_acrosome_test.npy | (300,) | uint8 | Test set labels for acrosome |
y_head_train.npy | (1000,) | uint8 | Training set labels for head |
y_head_valid.npy | (240,) | uint8 | Validation set labels for head |
y_head_test.npy | (300,) | uint8 | Test set labels for head |
y_vacuole_train.npy | (1000,) | uint8 | Training set labels for vacuole |
y_vacuole_valid.npy | (240,) | uint8 | Validation set labels for vacuole |
y_vacuole_test.npy | (300,) | uint8 | Test set labels for vacuole |
y_tail_train.npy | (1000,) | uint8 | Training set labels for tail |
y_tail_valid.npy | (240,) | uint8 | Validation set labels for tail |
y_tail_test.npy | (300,) | uint8 | Test set labels for tail |
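
As a minimal sketch of how the files listed above could be loaded, the helper below reads the images and the four label vectors for one split using the file names from the table. The function names `load_split` and `to_float`, the `PARTS` tuple, and the data directory are assumptions made for illustration; they are not part of the dataset.

```python
from pathlib import Path

import numpy as np

# Sperm parts that have their own label files (names taken from the file table).
PARTS = ("acrosome", "head", "vacuole", "tail")


def load_split(data_dir, split="train", size=128):
    """Load images and per-part labels for one split (hypothetical helper).

    split is "train", "valid", or "test"; size is 128 or 64.
    """
    data_dir = Path(data_dir)
    # Images: shape (N, size, size), dtype uint8.
    x = np.load(data_dir / f"x_{size}_{split}.npy")
    # One label vector per part, each of shape (N,), dtype uint8.
    y = {part: np.load(data_dir / f"y_{part}_{split}.npy") for part in PARTS}
    return x, y


def to_float(x):
    """Scale uint8 pixels to float32 in [0, 1] and add a trailing channel axis."""
    return (x.astype(np.float32) / 255.0)[..., np.newaxis]
```

For example, `x, y = load_split("path/to/data", "train", 128)` should give `x.shape == (1000, 128, 128)` and `y["head"].shape == (1000,)`, where `path/to/data` is a placeholder for wherever the `.npy` files are stored. The `to_float` step is optional and only reflects a common preprocessing choice for neural networks.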
The following table shows the number of positive and negative examples in the dataset.
Set | Label | # Positive | # Negative | % Positive |
---|---|---|---|---|
Whole dataset | Acrosome | 1,086 | 454 | 70.52 |
Whole dataset | Head | 1,122 | 418 | 72.86 |
Whole dataset | Vacuole | 1,301 | 239 | 84.48 |
Whole dataset | Tail | 1,471 | 69 | 95.52 |
Training set | Acrosome | 699 | 301 | 69.90 |
Training set | Head | 727 | 273 | 72.70 |
Training set | Vacuole | 830 | 170 | 83.00 |
Training set | Tail | 954 | 46 | 95.40 |
Validation set | Acrosome | 174 | 66 | 72.50 |
Validation set | Head | 176 | 64 | 73.33 |
Validation set | Vacuole | 209 | 31 | 87.08 |
Validation set | Tail | 233 | 7 | 97.08 |
Test set | Acrosome | 213 | 87 | 71.00 |
Test set | Head | 219 | 81 | 73.00 |
Test set | Vacuole | 262 | 38 | 87.33 |
Test set | Tail | 284 | 16 | 94.67 |
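
The counts in this table can be recomputed from any label vector. The sketch below shows one way to do so, keeping in mind that in MHSMA the positive class is label 0 (normal) and the negative class is label 1 (abnormal). The function name `class_balance` is hypothetical.

```python
import numpy as np


def class_balance(y):
    """Return (n_positive, n_negative, percent_positive) for a 0/1 label vector.

    In MHSMA, 0 means normal (positive) and 1 means abnormal (negative).
    Assumes the vector is non-empty.
    """
    y = np.asarray(y)
    n_pos = int(np.sum(y == 0))
    n_neg = int(np.sum(y == 1))
    pct_pos = 100.0 * n_pos / (n_pos + n_neg)
    return n_pos, n_neg, pct_pos
```

Applied to a loaded label file such as `y_acrosome_train.npy`, this should reproduce the corresponding row of the table (699 positive, 301 negative, 69.90% positive for the training set).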
If you would like to add a new result, you can open a pull request.
Method | Label | Accuracy | Precision | Recall | F0.5 score | G-mean | AUC | MCC |
---|---|---|---|---|---|---|---|---|
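A novel deep learning method for automatic assessment of human sperm images (Apr 2019) | Acrosome | 76.67 | 85.93 | 80.28 | 84.74 | 83.06 | 83.89 | +0.4618 |
A novel deep learning method for automatic assessment of human sperm images (Apr 2019) | Head | 77.00 | 83.48 | 85.39 | 83.86 | 84.43 | 77.80 | +0.4053 |
A novel deep learning method for automatic assessment of human sperm images (Apr 2019) | Vacuole | 91.33 | 94.36 | 95.80 | 94.65 | 95.08 | 88.08 | +0.5910 |
Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Acrosome (DTL) | 79.00 | 80.24 | 93.42 | 82.57 | 86.58 | 79.65 | +0.4447 |
Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Acrosome (DMTL) | 80.66 | 82.42 | 92.48 | 84.26 | 87.31 | 78.19 | +0.4984 |
Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Head (DTL) | 84.00 | 87.01 | 91.78 | 87.92 | 89.36 | 81.56 | +0.5775 |
Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Head (DMTL) | 82.00 | 82.60 | 95.43 | 84.89 | 88.78 | 78.40 | +0.5021 |
Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Vacuole (DTL) | 94.00 | 95.18 | 98.09 | 95.75 | 96.62 | 94.73 | +0.7082 |
Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Vacuole (DMTL) | 92.33 | 94.75 | 96.56 | 95.11 | 95.65 | 93.64 | +0.6348 |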
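
For anyone preparing a new result, the sketch below shows one possible way to compute most of the metrics in the table from predictions, with label 0 (normal) treated as the positive class. It is not the evaluation code of either listed method. Two points are assumptions: G-mean is taken as the geometric mean of precision and recall, which agrees numerically with the reported rows but is not defined in this document, and AUC is omitted because it needs prediction scores rather than hard labels. The function names are hypothetical.

```python
import math

import numpy as np


def confusion_counts(y_true, y_pred, positive=0):
    """Count TP, FP, FN, TN, treating `positive` (0 = normal in MHSMA) as the positive class."""
    t = np.asarray(y_true) == positive
    p = np.asarray(y_pred) == positive
    tp = int(np.sum(t & p))
    fp = int(np.sum(~t & p))
    fn = int(np.sum(t & ~p))
    tn = int(np.sum(~t & ~p))
    return tp, fp, fn, tn


def metrics_from_counts(tp, fp, fn, tn, beta=0.5):
    """Compute metrics as fractions in [0, 1] (MCC in [-1, 1]).

    Assumes every denominator is nonzero, i.e. both classes occur
    among the true labels and among the predictions.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta ** 2
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_beta": (1 + b2) * precision * recall / (b2 * precision + recall),
        # Assumption: geometric mean of precision and recall.
        "g_mean": math.sqrt(precision * recall),
        "mcc": (tp * tn - fp * fn) / mcc_denom,
    }
```

As a check, `metrics_from_counts(171, 28, 42, 59)` gives values that round to 76.67, 85.93, 80.28, 84.74, 83.06 (as percentages) and an MCC of 0.4618. These counts are not reported in the source; they were chosen here only because they are consistent with the first acrosome row.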
If you use this dataset in your research, please cite our work as:
@article{javadi2019novel,
title={A novel deep learning method for automatic assessment of human sperm images},
author={Javadi, Soroush and Mirroshandel, Seyed Abolghasem},
journal={Computers in Biology and Medicine},
volume={109},
pages={182--194},
year={2019},
doi={10.1016/j.compbiomed.2019.04.030}
}
This dataset is made available under the CC BY-NC-SA 4.0 license.
MHSMA is based on the Human Sperm Morphology Analysis Dataset (HSMA-DS) (Ghasemian et al., 2015).