Extract ROI-pooled features for detected box
9thDimension opened this issue · 4 comments
I've had good success training Mask RCNN to detect my objects.
Now my goal is to extract the ROI-pooled features from a detected region so that I can do further analysis and/or training with them. It looks like, prior to being classified, each ROI on the final feature maps is resized to `7x7x256`.
It seems like this line https://github.com/matterport/Mask_RCNN/blob/master/mrcnn/model.py#L925 is responsible for the ROI pooling, i.e. extracting fixed-size `7x7x256` features from variously sized boxes around the image. So I made the following modification:
```python
def fpn_classifier_graph(rois, feature_maps, image_meta,
                         pool_size, num_classes, train_bn=True,
                         fc_layers_size=1024):
    # ROI Pooling: crop each ROI from the FPN feature maps and
    # resize it to [pool_size, pool_size] (7x7 by default).
    x = PyramidROIAlign([pool_size, pool_size],
                        name="roi_align_classifier")([rois, image_meta] + feature_maps)
    pooled_roi_features = x  # my modification
    # ... do the rest of the stuff in the method ...
    return mrcnn_class_logits, mrcnn_probs, mrcnn_bbox, pooled_roi_features
```
So that `fpn_classifier_graph()` now returns the actual pooled features, as well as the class logits, bboxes, and whatever else.
The `pooled_roi_features` have shape `(batch, 1000, 7, 7, 256)`.
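For reference, here's a rough sketch of how I'd read those pooled features out at inference time, once the extra output is wired through. This is untested pseudo-usage, not the library's documented API: it assumes `model` is an inference-mode `modellib.MaskRCNN`, reuses the `roi_align_classifier` layer name from the code above, and mirrors the input-molding steps in `MaskRCNN.detect()`:

```python
import numpy as np
from keras.models import Model

# Sub-model that exposes the PyramidROIAlign output directly.
keras_model = model.keras_model  # assumption: an inference-mode MaskRCNN
sub_model = Model(inputs=keras_model.inputs,
                  outputs=keras_model.get_layer("roi_align_classifier").output)

# Mold the image and build anchors the same way MaskRCNN.detect() does.
molded_images, image_metas, windows = model.mold_inputs([image])
anchors = model.get_anchors(molded_images[0].shape)
anchors = np.broadcast_to(anchors, (1,) + anchors.shape)

pooled = sub_model.predict([molded_images, image_metas, anchors])
print(pooled.shape)  # expected: (1, 1000, 7, 7, 256) in inference mode
```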
However, shortly after, the method `refine_detections_graph` takes in the ROI class scores and box coordinates in order to prune away (overlapping? background?) results: https://github.com/matterport/Mask_RCNN/blob/master/mrcnn/model.py#L685. After going through this method, only `config.DETECTION_MAX_INSTANCES` = 100 detections remain. I can't really follow how they are filtered and possibly re-ordered, but I can see that extensive use is made of NMS and `tf.gather`. Also, I'm not totally sure what the docstring means by the input `rois: [N, (y1, x1, y2, x2)] in normalized coordinates`. What exactly is the normalization? To the unit square? To the shrunken image size (128x128 by default)?
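My best guess, based on `denorm_boxes()` in `mrcnn/utils.py`, is that it's the unit square over the resized input image, and that helper maps normalized boxes back to pixel coordinates like so:

```python
import numpy as np

# Logic of mrcnn/utils.py denorm_boxes: convert [N, (y1, x1, y2, x2)]
# from normalized coordinates to pixel coordinates for an image of
# the given (height, width).
def denorm_boxes(boxes, shape):
    h, w = shape
    scale = np.array([h - 1, w - 1, h - 1, w - 1])
    shift = np.array([0, 0, 1, 1])
    return np.around(np.multiply(boxes, scale) + shift).astype(np.int32)
```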
I think I need to apply the same filtering operations inside `refine_detections_graph()` to reduce the number of ROI-pooled features from 1000 down to 100. Any help with this is much appreciated.
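In case it helps frame the question: if the final `keep` indices computed inside `refine_detections_graph()` were exposed, I imagine a hypothetical helper like this (not in the repo) could subset and pad the pooled features the same way the detections are padded to `DETECTION_MAX_INSTANCES`:

```python
import tensorflow as tf

# Hypothetical helper (not part of Mask_RCNN): subset per-ROI features
# with the same `keep` indices refine_detections_graph uses for the
# boxes/scores, then zero-pad to a fixed number of rows.
def gather_kept_features(pooled_roi_features, keep, max_instances=100):
    # pooled_roi_features: [num_rois, 7, 7, 256] for a single image
    # keep: [num_kept] int32 indices of the surviving detections
    kept = tf.gather(pooled_roi_features, keep)  # [num_kept, 7, 7, 256]
    gap = max_instances - tf.shape(kept)[0]
    return tf.pad(kept, [(0, gap), (0, 0), (0, 0), (0, 0)])
```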
Hi, did you solve this problem? I hope to extract the ROI-aligned features for a specific bounding box in the original image. If you find a way to do this, can you share the code? Thanks a lot!
Any word on this? I'm finding the documentation for methods like `refine_detections_graph()` to be quite sparse. I am still unable to filter the ROI-pooled features alongside the rest of the detection outputs.
@yangshao See my response to #1249 (comment)
@9thDimension Hi, can you please share a link to your modified model.py file? I'm getting an error when I make the changes you described.