Is there a model pre-trained on MSCOCO with 80 classes?
BAILOOL opened this issue · 3 comments
BAILOOL commented
Is there a model pre-trained on MSCOCO with 80 classes?
ramprs commented
Hi @BAILOOL, I don't have a VGG model fine-tuned on COCO. For my experiments I have been using the GoogLeNet model fine-tuned on COCO from http://www.cs.bu.edu/groups/ivc/data/ExcitationBP/COCO/.
ramprs commented
As far as I know, loadcaffe cannot load Inception-style modules trained in Caffe, so I used the following Caffe (Python) snippet to get Grad-CAM visualizations from the above model.
import numpy as np
import caffe
from skimage import transform

def Normalize(x):
    # Assumed helper, not shown in the original snippet: min-max scale to [0, 1].
    x = x - x.min()
    return x / (x.max() + 1e-8)

def GradCAM(net, img, classID, imgScale=224):
    # imgScale (target length of the shorter image side) was an undefined
    # name in the original snippet; the default of 224 is an assumption.
    topBlobName = 'loss3/classifier'
    topLayerName = 'loss3/classifier'
    outputLayerName = 'inception_5b/output'
    outputBlobName = 'inception_5b/output'

    # rescale the image so that its shorter side equals imgScale
    minDim = min(img.shape[:2])
    newSize = (int(img.shape[0]*imgScale/float(minDim)), int(img.shape[1]*imgScale/float(minDim)))
    imgS = transform.resize(img, newSize)

    # reshape the net input to match the rescaled image
    net.blobs['data'].reshape(1, 3, newSize[0], newSize[1])
    transformer = caffe.io.Transformer({'data': net.blobs['data'].data.shape})
    transformer.set_mean('data', np.array([103.939, 116.779, 123.68]))  # per-channel mean, BGR order
    transformer.set_transpose('data', (2, 0, 1))     # HWC -> CHW
    transformer.set_channel_swap('data', (2, 1, 0))  # RGB -> BGR
    transformer.set_raw_scale('data', 255.0)         # [0, 1] -> [0, 255]
    caffe.set_mode_gpu()

    # forward pass up to the classifier layer
    net.blobs['data'].data[...] = transformer.preprocess('data', imgS)
    out = net.forward(end=topLayerName)

    # create the gradient input: one-hot on the target class
    net.blobs[topBlobName].diff[0][...] = 0
    net.blobs[topBlobName].diff[0][classID] = 1

    # get feature maps from the forward pass, shape (C, H, W)
    fprop_maps = net.blobs[outputBlobName].data[0]

    # backward pass down to the last conv layer (inception_5b/output)
    out = net.backward(start=topLayerName, end=outputLayerName)

    # one weight per feature map: gradient summed over the spatial dimensions
    map_weights = net.blobs[outputBlobName].diff[0].sum(1).sum(1)
    gradCAM_beforeReLU = (fprop_maps * map_weights[:, None, None]).sum(0)

    # pass through ReLU
    gradCAM = np.maximum(gradCAM_beforeReLU, 0)

    # upsample to the original image size
    # (newer scikit-image releases spell the 'nearest' boundary mode as 'edge')
    gradCAM = transform.resize(Normalize(gradCAM), img.shape[:2], order=3, mode='nearest')
    return gradCAM
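For anyone who wants to check the weighting step without a Caffe install, here is a rough NumPy-only sketch of the same combination (sum the gradients per map, take the weighted sum of the maps, ReLU, then rescale). The function name grad_cam_from_arrays and the min-max normalization are assumptions for illustration, and the random arrays only stand in for the blob's data and diff.

```python
import numpy as np


def grad_cam_from_arrays(feature_maps, gradients):
    """Combine (C, H, W) activations and gradients into an (H, W) map in [0, 1].

    Hypothetical helper for illustration; not part of the snippet above.
    """
    # One weight per channel: sum the gradient over the spatial dimensions.
    weights = gradients.sum(axis=(1, 2))
    # Weighted sum of the feature maps over the channel axis.
    cam = (feature_maps * weights[:, None, None]).sum(axis=0)
    # ReLU keeps only the positive evidence for the class.
    cam = np.maximum(cam, 0)
    # Min-max normalization (assumed stand-in for the Normalize helper).
    span = cam.max() - cam.min()
    if span == 0:
        return np.zeros_like(cam)
    return (cam - cam.min()) / span


# Toy inputs: random arrays standing in for blob .data[0] and .diff[0].
rng = np.random.default_rng(0)
fmaps = rng.random((4, 7, 7))
grads = rng.standard_normal((4, 7, 7))
cam = grad_cam_from_arrays(fmaps, grads)
print(cam.shape, float(cam.min()), float(cam.max()))
```

Summing instead of averaging the gradients only changes the map by a constant factor, which the final rescaling removes.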
Please let me know if you run into any issues. Also, I should be able to get a VGG-16 network fine-tuned on COCO in a week's time if you would like to stick with Torch.