
Instance segmentation: two training stages with transfer learning, two inference stages with EfficientNetV2 and Mask R-CNN R101-FPN, mask ensembling with Weighted Boxes Fusion, heavily rotated Mosaic without artifacts or tiny bounding boxes, test-time augmentation, five-fold cross-validation, Detectron2 with Albumentations, CIoU loss and GIoU loss.

Primary language: Jupyter Notebook

Sartorius Cell Instance Segmentation



  • Model 1 (Classifier): EfficientNetV2
    • train with train_semi_supervised and the Sartorius dataset
    • result: 0.0 loss and 100% accuracy on both the training and validation sets
  • Model 2 (Instance Segmentation): Mask R-CNN (R101-FPN)
    • Training stage 1: train with eight-class LIVECell dataset
    • Training stage 2: train with the three-class Sartorius dataset, starting from the pretrained weights of stage 1
  • Inference: use the EfficientNetV2 classifier to predict the class, which is then used to refine the final Mask R-CNN prediction
  • 5-fold cross-validation
  • Ensembling masks from different models with a custom Weighted Boxes Fusion
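
To make the ensembling step above more concrete, here is a minimal sketch of the box-fusion idea behind Weighted Boxes Fusion. It is a simplified, single-class illustration and not the custom variant used in this repository: the function names `box_iou` and `fuse_boxes`, the `iou_thr=0.55` default, and the assumption that all scores are positive are choices made for this example only. How the fused boxes are mapped back to masks is not covered here.

```python
import numpy as np


def box_iou(a, b):
    """IoU of two boxes given as [x1, y1, x2, y2]."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse_boxes(boxes_per_model, scores_per_model, iou_thr=0.55):
    """Simplified single-class box fusion (hypothetical helper).

    boxes_per_model: one list of [x1, y1, x2, y2] boxes per model.
    scores_per_model: matching confidence scores, assumed to be > 0.
    """
    n_models = len(boxes_per_model)
    # Pool the predictions of all models and visit them by descending score.
    pooled = [
        (float(s), np.asarray(b, dtype=float))
        for boxes, scores in zip(boxes_per_model, scores_per_model)
        for b, s in zip(boxes, scores)
    ]
    pooled.sort(key=lambda t: t[0], reverse=True)

    clusters, fused = [], []
    for score, box in pooled:
        # Find the fused box that overlaps this prediction the most.
        best, best_iou = -1, iou_thr
        for i, fb in enumerate(fused):
            v = box_iou(fb, box)
            if v > best_iou:
                best, best_iou = i, v
        if best < 0:
            # No match above the threshold: start a new cluster.
            clusters.append([(score, box)])
            fused.append(box.copy())
        else:
            # Match: recompute the fused box as a score-weighted average.
            clusters[best].append((score, box))
            s = np.array([c[0] for c in clusters[best]])
            b = np.stack([c[1] for c in clusters[best]])
            fused[best] = (s[:, None] * b).sum(axis=0) / s.sum()

    if not fused:
        return np.zeros((0, 4)), np.zeros(0)
    # Mean score per cluster, scaled down when fewer models agree on the box.
    conf = np.array([
        np.mean([s for s, _ in c]) * min(len(c), n_models) / n_models
        for c in clusters
    ])
    order = np.argsort(-conf)
    return np.stack(fused)[order], conf[order]
```

Unlike non-maximum suppression, which keeps one box per cluster and discards the rest, this approach averages all overlapping boxes, so every model contributes to the final coordinates.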

What didn't work

  • Heavy augmentation:
    • 2 x custom Mosaic data augmentation (tiny bounding boxes deleted)
      • RandomRotate + CenterCrop (without artifacts) || Transpose (Reflection) + RandomCrop
      • RandomCrop + random x_center and y_center
    • MixUp
    • AdditiveGaussianNoise, GaussNoise, MotionBlur, MedianBlur, Blur, CLAHE, Sharpen, Emboss, RandomBrightnessContrast
  • Test-time augmentation (HorizontalFlip, VerticalFlip)
  • CIoU loss and GIoU loss
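
For reference, the two box-regression losses named in the last item can be written out for a single pair of boxes as follows. This is a plain-Python sketch of the standard GIoU and CIoU definitions, not the training code of this repository; the helper names and the `eps` parameter are made up for the example, and boxes are assumed to be [x1, y1, x2, y2] with positive width and height.

```python
import math


def _iou_terms(a, b):
    """Return IoU, union area, and the width/height of the enclosing box."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    # Smallest axis-aligned box that contains both a and b.
    cw = max(a[2], b[2]) - min(a[0], b[0])
    ch = max(a[3], b[3]) - min(a[1], b[1])
    return inter / union, union, cw, ch


def giou_loss(pred, target):
    """1 - GIoU, where GIoU = IoU - (enclosing area - union) / enclosing area."""
    iou, union, cw, ch = _iou_terms(pred, target)
    enclose = cw * ch
    return 1.0 - (iou - (enclose - union) / enclose)


def ciou_loss(pred, target, eps=1e-9):
    """1 - CIoU, where CIoU = IoU - rho^2 / c^2 - alpha * v."""
    iou, _, cw, ch = _iou_terms(pred, target)
    # rho^2: squared distance between the two box centers.
    rho2 = ((pred[0] + pred[2] - target[0] - target[2]) ** 2
            + (pred[1] + pred[3] - target[1] - target[3]) ** 2) / 4.0
    # c^2: squared diagonal of the enclosing box.
    c2 = cw ** 2 + ch ** 2
    # v: aspect-ratio consistency term.
    v = (4 / math.pi ** 2) * (
        math.atan((target[2] - target[0]) / (target[3] - target[1]))
        - math.atan((pred[2] - pred[0]) / (pred[3] - pred[1]))
    ) ** 2
    # eps avoids 0/0 when the boxes are identical (IoU = 1, v = 0).
    alpha = v / (1.0 - iou + v + eps)
    return 1.0 - (iou - rho2 / c2 - alpha * v)
```

Both losses are 0 for identical boxes. GIoU adds a penalty for empty space in the enclosing box, while CIoU penalizes center distance and aspect-ratio mismatch.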





  • segmentation dict: represents the per-pixel segmentation mask in COCO’s compressed RLE format. The dict should have keys “size” and “counts”.
  • You can convert a uint8 segmentation mask of 0s and 1s into such a dict with pycocotools.mask.encode(np.asarray(mask, order="F")).
  • cfg.INPUT.MASK_FORMAT must be set to "bitmask" when using the default data loader with this format.
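
To show what the "size" and "counts" keys describe, the sketch below builds the uncompressed form of a COCO RLE with NumPy only. It is meant as an illustration of the run-length idea and of the column-major (order="F") traversal. The function name `mask_to_uncompressed_rle` is hypothetical, and the list-valued "counts" it returns is not the compressed encoding that pycocotools.mask.encode produces, so for actual training data the pycocotools call mentioned above is the one to use.

```python
import numpy as np


def mask_to_uncompressed_rle(mask):
    """Run-length encode a binary mask in column-major order (illustration only)."""
    mask = np.asarray(mask, dtype=np.uint8)
    h, w = mask.shape
    # COCO RLE walks the mask column by column, hence order="F".
    flat = mask.flatten(order="F")
    # Indices where the pixel value differs from the previous pixel.
    change = np.flatnonzero(flat[1:] != flat[:-1]) + 1
    boundaries = np.concatenate(([0], change, [flat.size]))
    counts = np.diff(boundaries).tolist()
    # Runs alternate between 0s and 1s and always start with the 0s,
    # so a mask that begins with a 1 gets a leading run of length 0.
    if flat.size and flat[0] == 1:
        counts = [0] + counts
    return {"size": [h, w], "counts": counts}
```

For example, the 2x2 mask [[0, 1], [1, 1]] is read as 0, 1, 1, 1 in column-major order, which gives one 0 followed by three 1s.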