Kaggle HPA Single Cell Classification
- Code competition
- Semi-supervised learning
- Weakly supervised learning
- Multi-instance multi-label image classification
- Transfer learning
- Cell segmentation
https://www.kaggle.com/c/hpa-single-cell-image-classification
https://www.kaggle.com/koltaibeatrix/hpa-single-cell-classification-submission-2
Koltai Beatrix @koltaib
Horváth Adrienn @Horadry
Differences in the location of proteins can give rise to cellular heterogeneity. Proteins play essential roles in virtually all cellular processes. Often, many different proteins come together at a specific location to perform a task, and the exact outcome of this task depends on which proteins are present. Different subcellular distributions of one protein can give rise to great functional heterogeneity between cells. Finding such differences, and figuring out how and why they occur, is important for understanding how cells function, how diseases develop, and ultimately how to develop better treatments for those diseases.
The data in the Human Protein Atlas database is freely accessible:
Train and test images: each sample consists of four images that belong together. The red, blue and yellow filters (images 1-3) show the location of some cell organelles, and the green filter (image 4) shows the location of the protein.
Train data: image-level labels. The labels refer to the cell organelles in which the protein occurs.
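The data layout described above can be sketched as follows. This is a minimal illustration, not the code used in our notebook: the file naming pattern `<image_id>_<color>.png`, the `|`-separated label string, and the class count of 19 are assumptions about the competition data, and `channel_paths` and `parse_labels` are hypothetical helper names.

```python
from pathlib import Path

# Assumption: each sample is stored as four files named "<image_id>_<color>.png".
CHANNELS = ("red", "blue", "yellow", "green")
# Assumption: class ids run from 0 to 18.
NUM_CLASSES = 19


def channel_paths(image_dir, image_id):
    """Return the four filter images that belong to one sample."""
    return {color: Path(image_dir) / f"{image_id}_{color}.png" for color in CHANNELS}


def parse_labels(label_string, num_classes=NUM_CLASSES):
    """Turn an image-level label string such as "0|16" into a multi-hot list."""
    target = [0] * num_classes
    for token in label_string.split("|"):
        target[int(token)] = 1
    return target


paths = channel_paths("train", "example_id")  # "example_id" is a placeholder id
target = parse_labels("0|16")
```

Here `paths["green"]` points at the protein channel, and `target` marks every organelle label attached to the whole image, not to an individual cell.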
Our aim was to predict the protein localization labels (organelles) for each individual cell in the images.
- Python
- Keras
- PyTorch
- TensorFlow
1. Cell segmentation
Based on the HPA Cell Segmentator:
!pip install "../input/hpacellsegmentatormaster/HPA-Cell-Segmentation-master"
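Once a segmentation is available, each cell has to be isolated before it can be classified. The sketch below assumes the segmentation step yields an integer-labelled mask in which 0 is background and every cell carries its own positive id; that output format and the helper name `cell_bounding_boxes` are assumptions for illustration, not the segmentator's documented API.

```python
def cell_bounding_boxes(cell_mask):
    """Map each cell id to its inclusive bounding box (top, left, bottom, right)."""
    boxes = {}
    for r, row in enumerate(cell_mask):
        for c, cell_id in enumerate(row):
            cell_id = int(cell_id)
            if cell_id == 0:  # assumption: 0 marks background pixels
                continue
            if cell_id not in boxes:
                boxes[cell_id] = [r, c, r, c]
            else:
                box = boxes[cell_id]
                box[0] = min(box[0], r)
                box[1] = min(box[1], c)
                box[2] = max(box[2], r)
                box[3] = max(box[3], c)
    return {cell_id: tuple(box) for cell_id, box in boxes.items()}


# Toy 3x5 mask with two cells; real masks have the size of the input image.
toy_mask = [
    [0, 1, 1, 0, 0],
    [0, 1, 1, 0, 2],
    [0, 0, 0, 2, 2],
]
boxes = cell_bounding_boxes(toy_mask)
```

The plain loops keep the example dependency-free and work on nested lists as well as on 2D arrays. For full-resolution images a vectorized implementation would be preferable.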
2. Transfer learning
After reducing (degenerating) the problem to a multi-instance single-label problem (Zhou et al. 2011), we used EfficientNetB0 as the base model to predict, as a binary classification, whether a given label occurs in an image.
Our aim was to examine how effective EfficientNetB0 is for this task. We trained a separate model for each label.
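Training one model per label means that every label gets its own binary target vector, and that the outputs of the separate models have to be merged again at prediction time. The following sketch shows both steps under stated assumptions: labels are `|`-separated strings, the probabilities are made-up example values, and the 0.5 threshold and both function names are hypothetical rather than taken from our notebook.

```python
def binary_targets(label_strings, class_id):
    """Return 1 for every image whose labels contain class_id, else 0."""
    wanted = str(class_id)
    return [int(wanted in labels.split("|")) for labels in label_strings]


def combine_predictions(per_label_probs, threshold=0.5):
    """Merge the outputs of the per-label binary models into one label list."""
    return sorted(cid for cid, prob in per_label_probs.items() if prob >= threshold)


# Image-level labels of three example images.
image_labels = ["0|16", "16", "2"]
targets_16 = binary_targets(image_labels, 16)  # targets for the model of label 16

# Illustrative probabilities from three per-label models for one cell.
predicted = combine_predictions({0: 0.91, 2: 0.08, 16: 0.64})
```

Splitting the label string before the membership test avoids false matches such as label 1 inside "16".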
With our hyperparameter settings, the best performance of our model was:
Base Models | Loss | Accuracy |
---|---|---|
EfficientNetB0 | x.0000 | xx.00 % |
Our EfficientNetB0-based system proved to be a working, but less effective, method for the HPA Single Cell Classification task.
In the future we plan to evaluate other base models in order to improve the accuracy.
Beatrix Koltai and Adrienn Horváth
11.05.2021