Ground truth pixel values in CamVid
Closed this issue · 1 comments
Hi,
I trained a model on the CamVid dataset, using the same parameter settings as the ENet paper.
After training, I ran a forward pass on a single image with the trained model, and the results are:
Judging from this single-image test, the trained model seems to work well.
Using the image above and its annotated image, I tried to calculate the accuracy.
I assumed that each pixel value in the annotated image indicates its class.
For example, 0 indicates background, 1 indicates sky, and so on.
After the forward pass, I compared each pixel of the output with its ground truth value,
and got an accuracy of 0.06 (6%).
After that, I found the following code in loadCamVid.lua:
-- load corresponding ground truth
rawImg = image.load(gtPath[i], 1, 'byte'):squeeze():float() + 2
local mask = rawImg:eq(13):float()
rawImg = rawImg - mask * #classes
In the original ground truth image, the pixel values are 0-11.
After the processing above, the pixel values seem to be 1-11, so index 0 is lost.
What am I doing wrong when calculating the accuracy?
Many thanks.
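After the code above, the range should be 1-12, not 1-11: adding 2 maps the raw values 0-11 to 2-13, and the masked value 13 is then wrapped back to 1.
It can be shifted down by 1 to 0-11 for classifiers that expect 0-indexed labels.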
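
To make the remapping concrete, here is a minimal plain-Python sketch of what the Lua lines do to a single raw pixel value, plus a per-pixel accuracy check that applies the same remapping to the ground truth first. It is only an illustration, not code from the repository: NUM_CLASSES = 12 is an assumption (it is the value of #classes that the 1-12 range implies), the names remap_label and pixel_accuracy are hypothetical, and the predictions are assumed to be 1-based class indices.

```python
NUM_CLASSES = 12  # assumption: value of #classes implied by the 1-12 range


def remap_label(raw):
    """Mirror the quoted Lua lines for one raw pixel value in 0-11."""
    shifted = raw + 2            # 0-11 becomes 2-13
    if shifted == 13:            # corresponds to the mask rawImg:eq(13)
        shifted -= NUM_CLASSES   # 13 wraps around to 1
    return shifted               # training label in 1-12


def pixel_accuracy(pred, raw_gt):
    """Fraction of pixels whose 1-based prediction equals the remapped label.

    pred   : flat list of predicted class indices (assumed 1-based, 1-12)
    raw_gt : flat list of raw annotation values (0-11)
    """
    assert len(pred) == len(raw_gt)
    correct = sum(1 for p, g in zip(pred, raw_gt) if p == remap_label(g))
    return correct / len(pred)


raw_gt = [0, 1, 5, 11, 10]
train_labels = [remap_label(v) for v in raw_gt]
print(train_labels)                       # [2, 3, 7, 1, 12]
print([v - 1 for v in train_labels])      # [1, 2, 6, 0, 11]
```

In this toy example a perfect prediction (equal to train_labels) matches none of the raw values in raw_gt, which is the kind of mismatch that gives a very low accuracy when the output is compared against the unprocessed annotation. If the predictions are 0-based instead, subtract 1 from the remapped label before comparing.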
Closing the issue.