kentaroy47/frcnn-from-scratch-with-keras

Empty Detections

mtlouie-unm opened this issue · 31 comments

When executing test_frcnn.py it seems that I pass in the path to where my test images are located, but I get zero detections when testing the model that was successfully trained. Wouldn't the testing phase need labeled data to see how well the model detects?

Right now, test.py just generates images with the detection results drawn on them (with the --write option).
I guess you want to see the calculated mAP? I will work on that feature, so please wait.

Thanks for responding.

Oh it seems that the network is classifying each of the test images as simply background (bg). Am I supposed to give some training images of just background?

If so, in order to train the network on images of just background, would my simple data text file look like this:

/data/imgs/img_001.jpg,837,346,981,456,cow
/data/imgs/img_002.jpg,215,312,279,391,cat
/data/imgs/img_002.jpg,22,5,89,84,bird
/data/imgs/img_003.jpg,,,,,

Where the last line (/data/imgs/img_003.jpg,,,,,) would be an example of background.
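For reference, a minimal parser for this comma-separated format could treat a row with empty fields as a background-only image. This is a hypothetical sketch; the repo's actual parser may handle such rows differently (or not at all):

```python
# Minimal sketch of a parser for the "path,x1,y1,x2,y2,class" format.
# Hypothetical: the repository's actual parser may differ.
def parse_annotations(lines):
    annotations = {}  # image path -> list of (x1, y1, x2, y2, class_name)
    for line in lines:
        path, x1, y1, x2, y2, cls = line.strip().split(',')
        boxes = annotations.setdefault(path, [])
        if cls:  # an empty class field -> background-only image, no boxes
            boxes.append((int(x1), int(y1), int(x2), int(y2), cls))
    return annotations

data = parse_annotations([
    "/data/imgs/img_001.jpg,837,346,981,456,cow",
    "/data/imgs/img_003.jpg,,,,,",
])
# img_001.jpg gets one box; img_003.jpg is registered with an empty box list
```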

Hi,

I have the same problem: the training results seemed good, but all test images give empty detections. If I test on the training dataset, the result is the same. Have you managed to fix this?

@mtlouie-unm @franyoadam
I think this issue was fixed by past commits.
Pulling the newest git should resolve it:

git pull

I have the same problem.
I am using the latest code, but all images give empty detections.

@tossy-yossy
Yes, I still see this issue when training with PASCAL VOC 2007 images.
Since it was working previously, let me revert my environment (TF + Keras) and check whether that fixes it.
If it's urgent, I recommend these PyTorch object detection repos, which support COCO training as well:
https://github.com/jwyang/faster-rcnn.pytorch
https://github.com/kentaroy47/ObjectDetection.Pytorch

(I have a paper submission this week and will get back to this after it's finished.)

@tossy-yossy @mtlouie-unm @franyoadam
This issue has been fixed.
The cause was the positive-negative ratio of the detections.
I fixed the number of RPN proposals so that the positive:negative ratio is about 1:3, as in other implementations. Please pull the newest version to pick this up.
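The 1:3 sampling described above works roughly like the following sketch (an illustration of the common Faster R-CNN heuristic, not this repo's exact code):

```python
import random

# Subsample region proposals so positives make up ~25% of the minibatch,
# giving a positive:negative ratio of about 1:3. Illustrative sketch only.
def sample_rois(pos_indices, neg_indices, num_rois=32, pos_fraction=0.25):
    n_pos = min(len(pos_indices), int(num_rois * pos_fraction))
    pos = random.sample(pos_indices, n_pos)
    n_neg = min(len(neg_indices), num_rois - n_pos)
    neg = random.sample(neg_indices, n_neg)
    return pos + neg

# 10 positive and 190 negative proposals -> 8 positives + 24 negatives
selected = sample_rois(list(range(10)), list(range(10, 200)), num_rois=32)
```

Without such capping, the negatives (background) can dominate the detector's minibatches, which biases the classifier toward predicting 'bg' everywhere.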

Training will be more stable when using pretrained RPN models.

The pretrained RPN for VGG is uploaded to:
https://drive.google.com/file/d/1teuXIRN4mvmbnIfWlxAAM69hJEceTpUm/view?usp=sharing
The trained VGG FRCNN model is uploaded here (it is underfitting..):
https://drive.google.com/file/d/1IgxPP0aI5pxyPHVSM2ZJjN1p9dtE4_64/view?usp=sharing

Here is an example command:

python train_frcnn.py --network vgg -p to/your/voc --load rpn-mode.pth

Hello,
I have the same problem. I trained the whole network (I didn't pretrain the RPN) with mobilenetv2.
Should I change something in test.py or train.py? Any ideas? I pulled the latest code.

@ianstath
If the mean number of objects per image is under 2 during training, you should pretrain the RPN and use it to train the FRCNN. Or you may train with the option -n 6, which will reject more negative proposals.
I haven't tried pascal_voc with mobilenetv2, but I suspect that is the case.

Mean objects per image? You mean one finding per image? Yes, in most of the images I have only one object to be detected.
Train with -n 6 for the RPN or the whole network? The default is currently -n 10. Will it make a difference? What if I trained only the mobilenet? Also, I am using my own dataset of images (brain MRI images with tumors).

Also, I thought that -n 6 controls the batch size. If not, how do I control the batch size so I don't run out of memory?

@ianstath

python train_frcnn.py --network mobilenetv2  -p ../VOCdevkit/ -n 6

may help. I haven't tried it with mobilenet, so it would help if you could test it.
Thanks.
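For what it's worth, in keras-frcnn-style scripts the -n flag usually maps to num_rois, the number of ROIs fed to the detector head per image, not the image batch size (training processes one image per step). A hedged sketch of how such an option is typically declared (check train_frcnn.py for the actual definition and default):

```python
import argparse

# Illustrative declaration of a -n/--num_rois option as commonly found in
# keras-frcnn-style training scripts; the real default in train_frcnn.py may differ.
parser = argparse.ArgumentParser()
parser.add_argument("-n", "--num_rois", type=int, default=32,
                    help="ROIs fed to the detector head per image. Lowering this "
                         "reduces GPU memory use; it is not the image batch size.")
opts = parser.parse_args(["-n", "6"])  # e.g. python train_frcnn.py ... -n 6
```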

I'm also encountering the empty detections issue. Trained both the RPN & detection network on my own dataset using vgg pretrained weights. Do you have any idea on how to solve this issue?

For me, it was a matter of dataset size and overfitting. resnet50 seems to be quite big for a dataset of 240 images.
So I tried vgg with data augmentation and added dropout layers. I am still training, but the first results are very good: I can detect regions of interest in most pictures of the test set.
So how many samples are you using?

This was really just a test run on 2 of the roughly 10 classes I eventually want to detect.
I trained the network on about 3K images per class, so 6K in total.
Where did you alter the code in order to add the data augmentation and dropout layers?

Also, did you manage to extract the ROIs proposed by the RPN alone?
I'm working on this code as we speak.

I will give some tips for training frcnns.

Check whether the rpn_cls or detector_cls losses are too high (>1 after training is quite high).

  • If the RPN is bad, check whether it trains well with RPN-only training. If it doesn't train well alone, your training data may be bad.

  • Always use ImageNet-pretrained weights for the backbone; this improves convergence.

  • Misdetections (or empty detections) occur when the detector isn't training well. You may want to simplify the detector layers, since they may be underfitting; see the vgg.py or resnet.py implementations.
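Following the first tip, it can help to sanity-check the training annotations before suspecting the model. A short illustrative helper (not part of the repo) over the simple "path,x1,y1,x2,y2,class" format:

```python
from collections import Counter

# Illustrative sanity check for "path,x1,y1,x2,y2,class" annotation files:
# reports the mean number of boxes per image and the per-class box counts.
def dataset_stats(lines):
    per_image = {}          # image path -> number of ground-truth boxes
    per_class = Counter()
    for line in lines:
        path, _, _, _, _, cls = line.strip().split(',')
        per_image.setdefault(path, 0)
        if cls:  # rows with an empty class field are background-only
            per_image[path] += 1
            per_class[cls] += 1
    mean_objs = sum(per_image.values()) / len(per_image)
    return mean_objs, per_class

mean_objs, classes = dataset_stats([
    "/data/imgs/img_001.jpg,837,346,981,456,cow",
    "/data/imgs/img_002.jpg,215,312,279,391,cat",
    "/data/imgs/img_002.jpg,22,5,89,84,bird",
])
# mean_objs == 1.5 here
```

If the mean is well under 2 objects per image, the pretrained-RPN route suggested earlier in this thread is worth trying.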

Hello,
I am using my own dataset of images to train the network. When executing test_frcnn.py, the test images give empty detections, and I pulled the newest version. My dataset is in PASCAL VOC format. So the command is
python test_frcnn.py --network resnet50 -p ./test_img --write
But the result is
22.jpg
(133, 4)
Elapsed time = 0.6575415134429932
[]
{}
Do you have any idea on how to solve this issue?

This was really just a test run on 2 of the roughly 10 classes I eventually want to detect.
I trained the network on about 3K images per class, so 6K in total.
Where did you alter the code in order to add the data augmentation and dropout layers?

Also, did you manage to extract the ROIs proposed by the RPN alone?
I'm working on this code as we speak.

Check the parser options for data augmentation. Also see vgg.py or resnet.py for adding dropout or making any changes to the base networks.

I didn't manage to extract the ROIs of the RPN; I only cropped the proposals in test.py.
There is a lot to be done on the model (e.g. checkpoints in train_frcnn.py, or adding validation loss or mAP).

@kentaroy47 I'm also getting empty detections. Any idea how to solve it?

Which is a good loss in the RPN at the end of the training process?

This is the command I'm using in the test step:

!python test_frcnn.py --network mobilenetv2 -p test/ --load models/mobilenetv2/voc.hdf5 --write

Here is a sample of the output:

Using TensorFlow backend.
2019-10-24 03:13:17.835487: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
{0: 'grieta', 1: 'corrocion', 2: 'bg'}
Loading weights from models/mobilenetv2/voc.hdf5
frame1609.png
(300, 4)
[[[1.12452474e-03 4.83090553e-05 9.98827159e-01]
[9.25188791e-03 2.89127696e-04 9.90458965e-01]
...
[2.4799172e-02 2.8327608e-04 9.7491759e-01]]]
Elapsed time = 9.977138996124268
[]
{}

Aymdr commented

In train_frcnn.py, in R = roi_helpers.rpn_to_roi(P_rpn[0], P_rpn[1], C, K.image_dim_ordering(), use_regr=True, overlap_thresh=0.4, max_boxes=300), the overlap_thresh is 0.4. I think this is a mistake and it leads to the high score for 'bg'. I met the same problem, and when I changed it to 0.7 it was solved. Maybe you can give it a try.
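For context, overlap_thresh in rpn_to_roi is the IoU threshold used for non-maximum suppression of the RPN proposals. A minimal sketch of greedy NMS (illustrative only, not the repo's roi_helpers code) shows why raising it lets more overlapping proposals through to the detector:

```python
def iou(a, b):
    # Intersection-over-union of two boxes in (x1, y1, x2, y2) format.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / float(area(a) + area(b) - inter)

def nms(boxes, scores, overlap_thresh):
    # Greedy NMS: a lower overlap_thresh suppresses more proposals,
    # so fewer (and more spread-out) boxes survive to the detector.
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= overlap_thresh for j in keep):
            keep.append(i)
    return keep

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
```

With these boxes (the first two overlap with IoU ≈ 0.68), a threshold of 0.4 keeps 2 boxes while 0.7 keeps all 3, so a higher threshold passes more overlapping proposals to the classifier.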

@Aymdr So do you recommend increasing overlap_thresh from 0.4 to 0.7?

Aymdr commented

In my training yesterday, in train_frcnn.py I changed overlap_thresh to 0.9, and in test_frcnn.py I changed overlap_thresh to 0.7 and bbox_threshold to 0.8. I did not pretrain the RPN. The result is not empty and works well.

I tried the recommended steps above, but I am also getting empty detections for resnet50.
Any ideas on how to make this work?

@Aymdr So do you recommend increasing overlap_thresh from 0.4 to 0.7?

Has the problem been solved? I have the same problem.

Hi all, I have the same problem as above: performance with a VGG16 backbone is quite good, but much lower when using resnet50 or IRV2 as the backbone.
In particular, I very often obtain empty detections for both training and test sets, even though training accuracy is very high (up to 98% for resnet50 and 99% for IRV2) and training loss is well below 1.
I have tried all of the suggestions presented above (for example changing overlap_thresh or num_rois, etc.), but nothing solves the problem.
I run training in two steps as suggested, training first the RPN and then the whole architecture.
In particular, debugging suggests that the RPN works very well, whereas the classifier network, after every batch of 1 image, has good detection capabilities on the last batches but forgets after a few iterations.
Thus it seems that catastrophic forgetting takes place, which may indeed affect online learning as in this case (the weights are updated after every image is presented to the network).
Any ideas on how to solve this issue?

Hi all, has anyone tried any hyperparameter tuning methods, such as grid search?

@kentaroy47 still have the same issue....
I can't get any bbox even in the training set...

@alessandrobetti Hi, did you find any way to solve the issue?
I have the same issue.

@yellowjs0304
Hi, unfortunately I have not yet solved the issue.

Did you solve it? I had the same issue, but after increasing the maximum number of bounding boxes retrieved from the RPN, the mAP increased to 60.

@Aymdr @ambigus9 @kentaroy47 No detections. I changed the threshold value to 0.7 and am still not able to detect anything. Please help, and please also add mAP support.