simongrest/chexpert-entries

Not able to get correct multilabel results for single test images.


I am having trouble getting correct multilabel results from the model for single test images. For example, for 5 labels (Atelectasis, Cardiomegaly, Consolidation, Edema and Pleural Effusion) the true values are [1, 1, 0, 0, 0] and the predicted values are [0.435084, 0.125264, 0.05981, 0.39272, 0.391482]. My validation loss is 0.470099 and my avg_auc_metric is 0.834580.

My validation code block:

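import cv2
import matplotlib.pyplot as plt

# valid_preds, full_valid_df, learn, paths and acts are assumed to be defined earlier.
preds = valid_preds[0]
preds_df = full_valid_df.copy()

# Attach one column of predicted probabilities per class.
for i, c in enumerate(learn.data.classes):
    preds_df[c] = preds[:, i]

# Average the per-image predictions within each (patient, study) pair.
# After grouping there is one row per study, sorted by the group keys,
# so paths and acts must follow that same order to line up with preds.
preds = preds_df.groupby(['patient', 'study'])[learn.data.classes].mean().values

# Show the first 20 images with their true labels and predicted probabilities.
for i in range(20):
    img = cv2.imread(paths[i])
    plt.imshow(img)
    plt.show()
    print("actual results")
    print(acts[i])
    print("predicted results")
    print(preds[i])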

I'm not 100% sure I understand you correctly. Are you looking for a means to convert the numbers output at the end of the network to a vector of binary predictions?

For each image a multi-label prediction is made. These predictions can be interpreted as the predicted probability of each specific condition. To convert them to a binary outcome per label you need to select an appropriate threshold for each condition, and predict positive whenever the probability is at or above it. This involves making a subjective choice about how many false negatives and false positives you are prepared to accept. As you vary the threshold from 0 to 1 you move from predicting every case positive (no false negatives, but every true negative becomes a false positive) through to predicting every case negative (no false positives, but every true positive becomes a false negative). This is how the ROC curve is created. It is up to you to choose where on the ROC curve you wish to be, balancing sensitivity against specificity (or equivalently the false negative rate against the false positive rate).
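
To make the thresholding concrete, here is a minimal sketch in plain Python. The class names and the five probabilities are the ones from the question. The helper names `binarize` and `sens_spec`, and the small `y_true` / `y_prob` validation set, are made up for illustration and are not part of this repository or of fastai. It only shows how sensitivity and specificity move as the threshold changes; it does not pick a threshold for you.

```python
# Class names and scores copied from the question above.
classes = ["Atelectasis", "Cardiomegaly", "Consolidation", "Edema", "Pleural Effusion"]
probs = [0.435084, 0.125264, 0.05981, 0.39272, 0.391482]


def binarize(probs, thresholds):
    """Return 1 where a probability reaches its class threshold, else 0."""
    return [int(p >= t) for p, t in zip(probs, thresholds)]


def sens_spec(y_true, y_prob, threshold):
    """Sensitivity and specificity for ONE condition at one threshold.

    Assumes y_true contains at least one positive and one negative,
    otherwise one of the divisions below is by zero.
    """
    pairs = list(zip(y_true, y_prob))
    tp = sum(1 for y, p in pairs if y == 1 and p >= threshold)
    fn = sum(1 for y, p in pairs if y == 1 and p < threshold)
    tn = sum(1 for y, p in pairs if y == 0 and p < threshold)
    fp = sum(1 for y, p in pairs if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


# A single 0.5 cut-off marks every condition negative for this image.
print(dict(zip(classes, binarize(probs, [0.5] * len(classes)))))

# Hypothetical validation labels and scores for one condition (made-up data).
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_prob = [0.62, 0.44, 0.31, 0.40, 0.22, 0.15, 0.09, 0.05]

# Sweep a few thresholds: low values favour sensitivity, high values specificity.
for t in (0.0, 0.2, 0.3, 0.5, 1.01):
    sens, spec = sens_spec(y_true, y_prob, t)
    print(f"threshold={t:.2f}  sensitivity={sens:.2f}  specificity={spec:.2f}")
```

In practice you would run the sweep separately for each of the five conditions on the validation predictions, then pass the five chosen thresholds to `binarize` instead of a shared 0.5.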