AnchorGenerator not generating complete anchors at frequent strides
JohnMBrandt opened this issue · 0 comments
Describe the bug
I am implementing an object detector based on `cascade-rcnn_r50_fpn` with a `CocoDataset` dataset. I am working on small, dense object detection, and have adjusted the anchor sizes accordingly, to:
```python
anchor_generator=dict(
    type='AnchorGenerator',
    scales=[1, 1.5, 2, 2.5, 3, 4],
    ratios=[0.5, 1, 2],
    strides=[4, 8, 16, 32, 64],
    base_sizes=[4, 8, 16, 32, 64]),
```
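For reference, each anchor's width and height follow from `base_size`, `scale`, and `ratio`; a minimal sketch of that arithmetic under the usual convention (`ratio = h / w`, area preserved at `(base_size * scale)^2`) — illustrative only, not mmdetection's actual code:

```python
import math

def anchor_wh(base_size, scale, ratio):
    # Convention assumed here: ratio = h / w, area = (base_size * scale)^2
    w = base_size * scale / math.sqrt(ratio)
    h = base_size * scale * math.sqrt(ratio)
    return w, h

# Smallest anchors from the config above: base_size=4, scale=1
for ratio in [0.5, 1, 2]:
    w, h = anchor_wh(4, 1, ratio)
    print(f"ratio={ratio}: {w:.1f} x {h:.1f}")
```

So the smallest anchors in this config are roughly 3–6 px on a side, which is why the placement stride matters so much for small objects.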
This network trains and generates good results on a 512 x 512 image. I have noticed, however, that many of my ground-truth boxes are missed by the anchor generator, as even a stride of 4 is too large to produce a high enough IoU. This was identified through Pyodi: https://gradiant.github.io/pyodi/reference/apps/train-config-evaluation/
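To make the stride/IoU point concrete, here is a back-of-envelope check (an illustrative sketch, not part of the config): for a square ground-truth box of side `s` whose centre falls midway between anchor centres, the nearest same-sized anchor is offset by `stride / 2` in each axis, and the best achievable IoU collapses as the stride approaches the box size:

```python
def best_iou(s, stride):
    # Worst-case placement: GT centre midway between anchor centres,
    # so the closest same-sized anchor is offset by stride/2 in x and y.
    off = stride / 2
    overlap = max(s - off, 0) ** 2      # intersection area of two s x s boxes
    union = 2 * s * s - overlap
    return overlap / union

for s in [4, 8, 16]:
    print(f"box {s}px: stride 4 -> {best_iou(s, 4):.3f}, stride 2 -> {best_iou(s, 2):.3f}")
```

A 4 px box can never reach the typical 0.5–0.7 positive-match threshold at stride 4 in this worst case, while halving the stride roughly doubles the attainable IoU.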
However, if I decrease the strides from `[4, 8, 16, 32, 64]` to `[2, 4, 8, 16, 32]`, raise the RPN proposal limits so this number of regions can be kept, and adjust the `base_sizes` and `scales` accordingly, the behavior is not as expected.
```python
rpn_proposal=dict(
    nms_pre=150e3,
    max_per_img=100e3,

anchor_generator=dict(
    type='AnchorGenerator',
    scales=[2, 4, 8],
    ratios=[0.5, 1, 2],
    strides=[2, 4, 8, 16, 32],
    base_sizes=[2, 4, 8, 16, 32]),
```
Given feature map sizes of 256, 128, 64, 32, and 16 for a 512 x 512 image at strides `[2, 4, 8, 16, 32]`, this should generate 256*256 + 128*128 + 64*64 + 32*32 + 16*16 = ~87,000 anchor locations (times scales x ratios per location for the total anchor count). With the `[4, 8, 16, 32, 64]` strides, this should generate ~21,000 anchor locations.
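The per-level counts above can be verified in a few lines (assuming the feature map at stride `s` is `ceil(512 / s)` on a side, and counting grid locations rather than locations x scales x ratios):

```python
import math

def num_locations(img_size, strides):
    # Sum of grid cells over all pyramid levels
    return sum(math.ceil(img_size / s) ** 2 for s in strides)

print(num_locations(512, [2, 4, 8, 16, 32]))   # 87296
print(num_locations(512, [4, 8, 16, 32, 64]))  # 21824
```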
Based on these calculations, I would expect the above modifications to `rpn_proposal` to allow predictions to be generated across the entire input image. However, the model does not predict boxes on the whole image.
Reproduction

- Adjust the `cascade-rcnn_r50_fpn` config with the above modifications to `anchor_generator` and `rpn_proposal`, train on any `Coco` dataset with `512 x 512` images, and visualize the results.
- What dataset did you use? Development on a custom Coco-style dataset; also tested with the `balloon` dataset.
Here is an example after 5 epochs with `strides=[2, 4, 8, 16, 32]`; none of the images get boxes in the bottom/right sections.

Here is an example after 5 epochs with `strides=[4, 8, 16, 32, 64]`; predictions look as expected.