MHSMA: The Modified Human Sperm Morphology Analysis Dataset

The MHSMA dataset is a collection of human sperm images from 235 patients with male factor infertility. Each image is labeled by experts for normal or abnormal sperm acrosome, head, vacuole, and tail.

The training, validation, and test sets contain 1000, 240, and 300 images, respectively.

Images are available in two crop sizes: 128x128 and 64x64 pixels. The following figure shows both versions of the same instance.

[Figure: one sample shown in its 128x128-pixel version (MHSMA-128) and its 64x64-pixel version (MHSMA-64).]

In MHSMA, each instance is a grayscale image of a single sperm. The sperm head is located roughly at the center of the image, and the tail is not entirely visible.

Each label is either 0 (normal, the positive class) or 1 (abnormal, the negative class). Note that 0, not 1, marks the positive class.

The dataset is available in .npy format. You can load the .npy files using numpy.load. The details of the files are described in the table below.

| File | Shape | Type | Description |
| --- | --- | --- | --- |
| x_128_train.npy | (1000, 128, 128) | uint8 | Training set, 128x128-pixel version |
| x_128_valid.npy | (240, 128, 128) | uint8 | Validation set, 128x128-pixel version |
| x_128_test.npy | (300, 128, 128) | uint8 | Test set, 128x128-pixel version |
| x_64_train.npy | (1000, 64, 64) | uint8 | Training set, 64x64-pixel version |
| x_64_valid.npy | (240, 64, 64) | uint8 | Validation set, 64x64-pixel version |
| x_64_test.npy | (300, 64, 64) | uint8 | Test set, 64x64-pixel version |
| y_acrosome_train.npy | (1000,) | uint8 | Training set labels for acrosome |
| y_acrosome_valid.npy | (240,) | uint8 | Validation set labels for acrosome |
| y_acrosome_test.npy | (300,) | uint8 | Test set labels for acrosome |
| y_head_train.npy | (1000,) | uint8 | Training set labels for head |
| y_head_valid.npy | (240,) | uint8 | Validation set labels for head |
| y_head_test.npy | (300,) | uint8 | Test set labels for head |
| y_vacuole_train.npy | (1000,) | uint8 | Training set labels for vacuole |
| y_vacuole_valid.npy | (240,) | uint8 | Validation set labels for vacuole |
| y_vacuole_test.npy | (300,) | uint8 | Test set labels for vacuole |
| y_tail_train.npy | (1000,) | uint8 | Training set labels for tail |
| y_tail_valid.npy | (240,) | uint8 | Validation set labels for tail |
| y_tail_test.npy | (300,) | uint8 | Test set labels for tail |
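
The files listed above follow a regular naming scheme, so one split can be loaded with a few lines of NumPy. The snippet below is a minimal sketch, not part of the dataset's own tooling: the helper name `load_split` and the `data_dir` argument are hypothetical, and it assumes all .npy files sit in a single local folder.

```python
from pathlib import Path

import numpy as np

# The four labeled sperm parts, as they appear in the file names.
PARTS = ("acrosome", "head", "vacuole", "tail")


def load_split(data_dir, split="train", size=128):
    """Load the images and the four label vectors of one split.

    data_dir: hypothetical local folder containing the .npy files.
    split:    "train", "valid", or "test".
    size:     128 or 64 (crop size in pixels).
    """
    data_dir = Path(data_dir)
    x = np.load(data_dir / f"x_{size}_{split}.npy")
    y = {part: np.load(data_dir / f"y_{part}_{split}.npy") for part in PARTS}

    # Sanity check: every label vector has one entry per image.
    for part, labels in y.items():
        if labels.shape != (x.shape[0],):
            raise ValueError(f"{part} labels do not match the image count")
    return x, y
```

For example, `x, y = load_split("path/to/files", "train", 128)` should give `x` with shape `(1000, 128, 128)` and `y["head"]` with shape `(1000,)`, per the table above.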

The following table shows the number of positive and negative examples in the dataset.

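| Set | Label | # Positive | # Negative | % Positive |
| --- | --- | --- | --- | --- |
| Whole dataset | Acrosome | 1,086 | 454 | 70.52 |
| Whole dataset | Head | 1,122 | 418 | 72.86 |
| Whole dataset | Vacuole | 1,301 | 239 | 84.48 |
| Whole dataset | Tail | 1,471 | 69 | 95.52 |
| Training set | Acrosome | 699 | 301 | 69.90 |
| Training set | Head | 727 | 273 | 72.70 |
| Training set | Vacuole | 830 | 170 | 83.00 |
| Training set | Tail | 954 | 46 | 95.40 |
| Validation set | Acrosome | 174 | 66 | 72.50 |
| Validation set | Head | 176 | 64 | 73.33 |
| Validation set | Vacuole | 209 | 31 | 87.08 |
| Validation set | Tail | 233 | 7 | 97.08 |
| Test set | Acrosome | 213 | 87 | 71.00 |
| Test set | Head | 219 | 81 | 73.00 |
| Test set | Vacuole | 262 | 38 | 87.33 |
| Test set | Tail | 284 | 16 | 94.67 |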
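
The counts above can be recomputed from any label vector. The following sketch uses the label convention stated earlier (0 is positive, 1 is negative); the function name `class_balance` is hypothetical and chosen only for illustration.

```python
import numpy as np


def class_balance(y):
    """Return (# positive, # negative, % positive) for a label vector.

    Follows the dataset convention: 0 = normal (positive),
    1 = abnormal (negative).
    """
    y = np.asarray(y)
    n_pos = int(np.sum(y == 0))
    n_neg = int(np.sum(y == 1))
    pct_pos = round(100.0 * n_pos / y.size, 2)
    return n_pos, n_neg, pct_pos
```

Applied to `y_acrosome_train.npy`, this should reproduce the first training-set row of the table, assuming the files match the published counts.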

Results

If you would like to add a new result, you can open a pull request.

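| Method | Label | Accuracy | Precision | Recall | F0.5 score | G-mean | AUC | MCC |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| A novel deep learning method for automatic assessment of human sperm images (Apr 2019) | Acrosome | 76.67 | 85.93 | 80.28 | 84.74 | 83.06 | 83.89 | +0.4618 |
| A novel deep learning method for automatic assessment of human sperm images (Apr 2019) | Head | 77.00 | 83.48 | 85.39 | 83.86 | 84.43 | 77.80 | +0.4053 |
| A novel deep learning method for automatic assessment of human sperm images (Apr 2019) | Vacuole | 91.33 | 94.36 | 95.80 | 94.65 | 95.08 | 88.08 | +0.5910 |
| Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Acrosome (DTL) | 79.00 | 80.24 | 93.42 | 82.57 | 86.58 | 79.65 | +0.4447 |
| Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Acrosome (DMTL) | 80.66 | 82.42 | 92.48 | 84.26 | 87.31 | 78.19 | +0.4984 |
| Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Head (DTL) | 84.00 | 87.01 | 91.78 | 87.92 | 89.36 | 81.56 | +0.5775 |
| Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Head (DMTL) | 82.00 | 82.60 | 95.43 | 84.89 | 88.78 | 78.40 | +0.5021 |
| Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Vacuole (DTL) | 94.00 | 95.18 | 98.09 | 95.75 | 96.62 | 94.73 | +0.7082 |
| Effect of Deep Transfer and Multi-task Learning on Sperm Abnormality Detection (Nov 2020) | Vacuole (DMTL) | 92.33 | 94.75 | 96.56 | 95.11 | 95.65 | 93.64 | +0.6348 |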
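
To report a new result in the same columns, the threshold-based metrics can be computed from hard predictions. The sketch below is an assumption-laden illustration, not the authors' evaluation code: the function name `binary_metrics` is hypothetical, label 0 is treated as the positive class, and G-mean is taken as the geometric mean of precision and recall, a definition chosen because it is consistent with the tabulated values. AUC is omitted because it requires prediction scores rather than hard labels.

```python
import math

import numpy as np


def binary_metrics(y_true, y_pred, positive=0):
    """Compute accuracy, precision, recall, F0.5, G-mean, and MCC.

    positive=0 follows the dataset convention (0 = normal = positive).
    All values are returned as fractions in [0, 1] (MCC in [-1, 1]).
    """
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)

    # Confusion-matrix counts with respect to the positive class.
    tp = int(np.sum((y_true == positive) & (y_pred == positive)))
    fp = int(np.sum((y_true != positive) & (y_pred == positive)))
    fn = int(np.sum((y_true == positive) & (y_pred != positive)))
    tn = int(np.sum((y_true != positive) & (y_pred != positive)))

    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0

    # F-beta with beta = 0.5 weights precision more heavily than recall.
    beta2 = 0.5 ** 2
    f_denom = beta2 * precision + recall
    f05 = (1 + beta2) * precision * recall / f_denom if f_denom else 0.0

    # Assumed definition: geometric mean of precision and recall.
    g_mean = math.sqrt(precision * recall)

    # Matthews correlation coefficient.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f0.5": f05,
        "g_mean": g_mean,
        "mcc": mcc,
    }
```

Multiply the returned fractions by 100 to match the percentage columns of the table; MCC is reported as is.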

Citation

If you use this dataset in your research, please cite our work as:

@article{javadi2019novel,
  title={A novel deep learning method for automatic assessment of human sperm images},
  author={Javadi, Soroush and Mirroshandel, Seyed Abolghasem},
  journal={Computers in Biology and Medicine},
  volume={109},
  pages={182--194},
  year={2019},
  doi={10.1016/j.compbiomed.2019.04.030}
}

License

This dataset is made available under the CC BY-NC-SA 4.0 license.

Credits

MHSMA is based on the Human Sperm Morphology Analysis Dataset (HSMA-DS) (Ghasemian et al., 2015).