brucechou1983/CheXNet-Keras

How to predict by the trained model

tsaiid opened this issue · 2 comments

I downloaded the trained model weights and tried to load them for prediction. I picked an image from the dataset (00000322_010.png), labelled pneumothorax, but the predicted probabilities were extremely low for all classes. Here is how I ran the prediction:

import cv2
import numpy as np
from keras.models import load_model

model = load_model('./CheXNet-Keras/model.h5') # I saved the model from train.py
model.load_weights('./CheXNet-Keras/brucechou1983_CheXNet_Keras_0.3.0_weights.h5')
im = cv2.imread("./CheXNet-Keras/data/images/00000322_010.png")
im_r = cv2.resize(im, (224, 224))
im_t = im_r.reshape(1, 224, 224, 3).astype('float32') / 255
prediction = model.predict(im_t)

The output of prediction was:

array([[2.0528843e-01, 9.4045536e-04, 3.9193404e-01, 9.2863671e-02,
        1.0580428e-02, 1.1802137e-02, 3.6063469e-03, 1.4762501e-01,
        2.5597181e-02, 1.1413006e-02, 4.9169478e-03, 1.2800613e-02,
        4.7103822e-02, 3.9492457e-05]], dtype=float32)

Any suggestions?

This is a multilabel problem rather than a multiclass problem, so the outputs are NOT probability values. You have to set a proper threshold for each class to get the predictions. For example, if you want to maximize the F1 score, just loop over the ROC curve and find the threshold with the largest F1 = 2 / (1 + 1/TPR + FP/TP), where TPR is the true positive rate and FP/TP is the ratio of false-positive to true-positive counts.
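The threshold search described above can be sketched as follows. This is a minimal, dependency-free sketch and not code from the repository: the function names `f1_at_threshold`, `best_f1_threshold`, and `apply_thresholds` are hypothetical, the toy labels and scores are made up for illustration, and it sweeps every distinct score as a candidate threshold rather than reading points off a precomputed ROC curve. In practice it would be run once per class on held-out validation data.

```python
def f1_at_threshold(y_true, y_score, threshold):
    """F1 for one class when scores >= threshold count as positive."""
    tp = fp = fn = 0
    for label, score in zip(y_true, y_score):
        predicted = score >= threshold
        if predicted and label:
            tp += 1
        elif predicted:
            fp += 1
        elif label:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_f1_threshold(y_true, y_score):
    """Try each distinct score as a threshold; return (threshold, f1) with the highest F1."""
    best_t, best_f1 = None, -1.0
    for t in sorted(set(y_score)):
        f1 = f1_at_threshold(y_true, y_score, t)
        if f1 > best_f1:  # strict '>' keeps the lowest threshold on ties
            best_t, best_f1 = t, f1
    return best_t, best_f1


def apply_thresholds(scores, thresholds):
    """Turn one row of per-class scores into 0/1 predictions."""
    return [int(s >= t) for s, t in zip(scores, thresholds)]


# Toy validation data for a single class (made up for illustration).
y_true = [0, 0, 1, 0, 1, 1]
y_score = [0.01, 0.05, 0.04, 0.02, 0.30, 0.20]

threshold, f1 = best_f1_threshold(y_true, y_score)
print(threshold, round(f1, 3))  # 0.04 0.857
```

In this toy example the best threshold is 0.04, far below 0.5, which illustrates why a raw score that looks small can still count as a positive prediction for a class.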

By the way, I don't actually encourage people to use CheXNet for prediction at this time. CheXNet shows that this task can be done with deep learning. However, if you read the dataset's source paper, you will see that the labels were extracted by text-mining radiology reports, so the label quality could be quite low.

Last but not least, CheXNet can't actually do diagnosis, for complicated reasons. If you're interested in the details, Luke's blog is recommended.

@brucechou1983 Does that mean CheXNet won't be able to make (decent) predictions or detect with bounding boxes for the foreseeable future?