As I test the demo, what's wrong?
bleakie opened this issue · 12 comments
Net('/home/sai/YANG/smallhardface/output/face/demo/face_2018_12_11_11_46_04/test.prototxt', 1, weights='')
Traceback (most recent call last):
File "/media/sai/6EB21275B21241CF/softwar/pycharm-2018.3/helpers/pydev/pydevd.py", line 1689, in
main()
File "/media/sai/6EB21275B21241CF/softwar/pycharm-2018.3/helpers/pydev/pydevd.py", line 1683, in main
globals = debugger.run(setup['file'], None, None, is_module)
File "/media/sai/6EB21275B21241CF/softwar/pycharm-2018.3/helpers/pydev/pydevd.py", line 1083, in run
pydev_imports.execfile(file, globals, locals) # execute the script
File "/home/sai/YANG/smallhardface/train_test.py", line 136, in
test_net(imdb, output_dir, target_test, no_cache=cfg.TEST.NO_CACHE)
File "lib/test.py", line 299, in test_net
return demo(target_test, thresh)
File "lib/test.py", line 275, in demo
net = caffe.Net(str(target_test), str(cfg.TEST.MODEL), caffe.TEST)
RuntimeError: Could not open file
Hi, it looks like either target_test or cfg.TEST.MODEL could not be found (in your log, Net(...) is called with weights='', so cfg.TEST.MODEL is probably empty). Can you verify that?
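For anyone hitting the same RuntimeError: caffe's error message does not name the missing path, so it helps to check both inputs explicitly before constructing the net. A minimal sketch, assuming it is dropped into demo() in lib/test.py where target_test, cfg and caffe are already in scope:

```python
import os

# Check both paths before the caffe.Net(...) call, since caffe's
# "RuntimeError: Could not open file" does not say which file failed.
for path in (str(target_test), str(cfg.TEST.MODEL)):
    if not os.path.isfile(path):
        raise IOError('missing file: {}'.format(path))

net = caffe.Net(str(target_test), str(cfg.TEST.MODEL), caffe.TEST)
```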
Thank you for your excellent code. I tested on the WIDER FACE val set using smallhardface.toml, but the results were Easy: 0.9411, Medium: 0.9193, Hard: 0.7539. What's the reason?
Hi, are you using my pretrained model? Can you share your testing command and stderr.log?
Link: https://pan.baidu.com/s/119K-1vBYLaAP2Vll03ZQQA Extraction code: umnn
You can check it here.
Hi, I noticed that in your cfgs.txt some parameters differ from mine, for example TEST.SCORE_THRESH and TEST.NMS_THRESH. Did you change the toml configuration files?
With your parameters I can reproduce the results of the paper, but isn't TEST.SCORE_THRESH=0.002 unreasonably low?
To compute AP, we need to compute the precision at different recall levels. So it is important to keep a low threshold, to get those (precision, recall) pairs where precision is low and recall is high. For example, S3FD used 0.05 as the threshold and PyramidBox used 0.01. I just tried to evaluate my model with TEST.SCORE_THRESH = 0.01, and the performance is the same.
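To make the recall argument concrete, here is a toy sketch (plain NumPy, not this repo's evaluation code) of AP computed from score-ranked detections; low-score detections only extend the high-recall end of the curve, which is exactly why truncating them with a high score threshold lowers AP:

```python
import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    """Toy AP: rank detections by score, sweep the threshold downward,
    and integrate precision over the recall levels reached."""
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    recall = cum_tp / num_gt
    precision = cum_tp / (np.arange(len(tp)) + 1)
    # Each extra low-score detection can only extend the recall axis;
    # dropping it (a high score threshold) truncates the curve and lowers AP.
    return np.trapz(precision, recall)

# The 0.01-score detection is what pushes recall from 2/3 to 1.0 here:
print(average_precision([0.9, 0.6, 0.04, 0.01], [1, 1, 0, 1], num_gt=3))
```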
For visualizing detections, we may use a higher threshold to keep only the confident detection boxes and make the visualization clean.
Oh, I just noticed that in my code there are two places limiting the score threshold. Please see https://github.com/bairdzhang/smallhardface/blob/master/lib/test.py#L293. My score threshold is actually 0.05, not 0.002 (as you can verify from the detection results in result.tar.gz, the minimum score is 0.05).
Anyway, a low test score threshold is not unreasonable.
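In other words, the two filters cascade, so the effective threshold is the stricter (larger) of the two. A hypothetical illustration with made-up boxes:

```python
import numpy as np

# Hypothetical illustration: two cascaded score filters behave like a
# single filter at the larger of the two thresholds.
dets = np.array([[10., 10., 50., 50., 0.003],   # [x1, y1, x2, y2, score]
                 [20., 20., 80., 80., 0.300]])
stage1 = dets[dets[:, 4] >= 0.002]     # TEST.SCORE_THRESH from the toml
stage2 = stage1[stage1[:, 4] >= 0.05]  # the second limit in lib/test.py
print(stage2)  # only the 0.300 box survives: the effective threshold is 0.05
```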
Thank you very much for your patience. So choosing a small threshold is only for higher accuracy, though a reasonable value is still needed in testing. But isn't it unreasonable to use different scales for different test sets?
Choosing a small threshold is for higher recall, to compute the AP. Faces in different datasets have different sizes (e.g. WIDER FACE has small faces), and our method only learns small faces, so we choose different scales for different datasets, for both speed and accuracy. I believe you can still achieve similar performance on AFW/FDDB/PascalFace if you use a unified scale set. For WIDER FACE we have to zoom in on the images.
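For intuition, the per-dataset scales just parameterize a test-time image pyramid. A rough sketch under assumed names: detect_single_scale and nms are placeholders, not functions from this repo, and the shortest-side scaling rule is an assumption:

```python
import numpy as np
import cv2

def multi_scale_detect(image, scales, detect_single_scale, nms, nms_thresh=0.3):
    """Run a fixed detector over an image pyramid and merge the results.

    detect_single_scale(img) is a placeholder returning an (N, 5) float array
    of [x1, y1, x2, y2, score] boxes; nms is any standard NMS routine.
    """
    all_dets = []
    for target in scales:  # e.g. TEST.SCALES = 50,100,200,400,600
        factor = float(target) / min(image.shape[:2])
        resized = cv2.resize(image, None, fx=factor, fy=factor)
        dets = detect_single_scale(resized)
        dets[:, :4] /= factor  # map boxes back to original-image coordinates
        all_dets.append(dets)
    merged = np.vstack(all_dets)
    return merged[nms(merged, nms_thresh)]
```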
I just checked: if you prefer the same scales across test sets, you can use the configuration in smallhardface-afw.toml (https://github.com/bairdzhang/smallhardface/blob/master/configs/smallhardface-afw.toml#L6-L7) for FDDB, AFW and PascalFace.
I added TEST.FLIP True TEST.SCALES 50,100,200,400,600 at the end of my evaluation command; the performances on FDDB, AFW and PascalFace are 98.7, 99.6 and 99.3 respectively.
I see. Thank you so much. Next I'll train on my own data and try it out.