About Top1 and Top5 result
fanghuaqi opened this issue · 5 comments
Hi,
I add a test net for the MobileNet-V1, and get caffe testing net accuracy number like this:
```
I0702 04:25:08.025542 60311 caffe.cpp:310] Loss: 1.24232
I0702 04:25:08.025568 60311 caffe.cpp:322] accuracy@1 = 0.69502
I0702 04:25:08.025573 60311 caffe.cpp:322] accuracy@5 = 0.892437
I0702 04:25:08.025578 60311 caffe.cpp:322] loss = 1.24232 (* 1 = 1.24232 loss)
```
The test net I added is:
```
layer {
  name: "data"
  type: "Data"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.017
    mean_value: [103.94, 116.78, 123.68]
    mirror: false
    crop_size: 224
  }
  data_param {
    source: "bvlc_caffe/examples/imagenet/ilsvrc12_val_lmdb"
    batch_size: 20
    backend: LMDB
  }
}
```
The result is not as good as the one presented in the repo's README.md.
Thanks
Huaqi
Please use original images (not LMDB).
Hi @shicai, I also tried using original images to run inference with MobileNet-V2, and the result I get is:
```
I0704 09:38:38.086778 3370 caffe.cpp:310] Loss: 1.17715
I0704 09:38:38.086797 3370 caffe.cpp:322] accuracy@1 = 0.712339
I0704 09:38:38.086804 3370 caffe.cpp:322] accuracy@5 = 0.901795
I0704 09:38:38.086813 3370 caffe.cpp:322] loss = 1.17715 (* 1 = 1.17715 loss)
```
This is the test prototxt data input section:
```
layer {
  name: "data"
  type: "ImageData"
  top: "data"
  top: "label"
  include {
    phase: TEST
  }
  transform_param {
    scale: 0.017
    mean_value: [103.94, 116.78, 123.68]
    mirror: false
    crop_size: 224
  }
  image_data_param {
    source: "val.txt"
    new_height: 256
    new_width: 256
    batch_size: 20
    root_folder: "ILSVRC2012/val/"
  }
}
```
The result is similar to the values I got when using LMDB directly with MobileNet-V2, and still lower than the official one. Is there anything wrong with this test prototxt?
Are the steps in eval_image.py correct for inference? I see in that script the image is cropped and then resized to 224x224, not resized to 256xN and then cropped to 224x224.
Thanks
Huaqi
eval_image.py is just an example for evaluating a single image.
To reproduce the performance on the ImageNet val dataset, you should resize the image to 256xN, then crop the center 224x224 region, and feed it into the model.
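The resize-then-crop pipeline described above can be sketched in NumPy. This is a hypothetical helper, not the code from eval_image.py; it uses nearest-neighbor resizing for brevity (Caffe's own pipeline uses proper interpolation), but the geometry — shorter side to 256, center crop 224, BGR mean subtraction, 0.017 scale — matches the recipe and the transform_param above:

```python
import numpy as np

def preprocess(img):
    """img: HxWx3 uint8 RGB array.
    Resize so the shorter side is 256 (i.e. 256xN), center-crop
    224x224, convert to BGR, subtract the ImageNet means, and
    apply the 0.017 scale, returning a CHW float32 array."""
    h, w = img.shape[:2]
    if h < w:
        nh, nw = 256, int(round(w * 256.0 / h))
    else:
        nh, nw = int(round(h * 256.0 / w)), 256
    # Nearest-neighbor resize (for brevity; Caffe uses interpolation).
    rows = (np.arange(nh) * h / nh).astype(int)
    cols = (np.arange(nw) * w / nw).astype(int)
    img = img[rows][:, cols]
    # Center crop 224x224.
    top, left = (nh - 224) // 2, (nw - 224) // 2
    crop = img[top:top + 224, left:left + 224].astype(np.float32)
    crop = crop[:, :, ::-1]                       # RGB -> BGR
    crop -= np.array([103.94, 116.78, 123.68], dtype=np.float32)
    crop *= 0.017                                 # same scale as transform_param
    return crop.transpose(2, 0, 1)                # HWC -> CHW for Caffe
```

Note this differs from the ImageData layer config above, which squashes the image to 256x256 via `new_height`/`new_width` instead of preserving the aspect ratio.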
@fanghuaqi @shicai @lutzroeder @mn-robot May I ask, could someone explain top-1 and top-5 in detail???
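For reference: top-1 accuracy counts a prediction as correct only when the highest-scoring class is the true label, while top-5 counts it as correct when the true label appears among the five highest-scoring classes (this is what Caffe's `accuracy@1` / `accuracy@5` report). A minimal NumPy sketch (the function name is illustrative, not from this repo):

```python
import numpy as np

def topk_accuracy(scores, labels, k):
    """Fraction of samples whose true label is among the k
    highest-scoring classes.
    scores: (num_samples, num_classes) array of class scores.
    labels: (num_samples,) array of true class indices."""
    # Indices of the k largest scores per row.
    topk = np.argsort(scores, axis=1)[:, -k:]
    hits = (topk == labels[:, None]).any(axis=1)
    return hits.mean()
```

With k=1 this reduces to ordinary accuracy; with k=5 it gives the top-5 number reported in the logs above.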