airsplay/py-bottom-up-attention

A plan to release a batch-based RoI feature extractor?

yanan1989 opened this issue · 3 comments

Thanks for your great work.
I am trying to use the demo tools you have released to extract RoI and box features.
Since extracting features one image at a time is too slow, do you plan to release a batch-based extractor?

Thanks for the kind words. The demo of batch-wise extraction is here:
https://github.com/airsplay/py-bottom-up-attention/blob/7dd0ab80864a3a401f9ef05f71823f25a0547609/demo/demo_batchwise_feature_extraction.ipynb

It should still be executable. I removed it from the current repo for two reasons:

  1. Speed: The improvement is modest, around a 1.3~1.5x speedup. This is because the detection system scales the image up from 400-600 to 600-800 during inference, so the GPU's compute units are already saturated with a batch of one.
  2. Feature Quality: The Padding --> Crop step that the Detectron2 framework uses to support batch-wise extraction is not exactly equivalent to single-image extraction. As a result, the features are of lower quality, especially near the image boundary. I found that it does affect the results slightly.
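
To make the padding point concrete, here is a minimal sketch in plain Python. It is not the code from the notebook and it does not call Detectron2. It only assumes that a batch is formed by padding every image to the largest height and width in the batch, rounded up to a multiple of a stride. The helper names and the default `divisibility=32` are hypothetical, chosen for illustration; the real value depends on the backbone.

```python
from typing import List, Tuple

Size = Tuple[int, int]  # (height, width) of an already-resized image


def padded_batch_shape(sizes: List[Size], divisibility: int = 32) -> Size:
    """Common (height, width) that every image in the batch is padded to.

    Assumption: pad to the per-batch maximum, then round up to a multiple
    of `divisibility` (hypothetical default, not taken from the repo).
    """
    max_h = max(h for h, _ in sizes)
    max_w = max(w for _, w in sizes)
    if divisibility > 1:
        # Ceiling division, then scale back up to the next multiple.
        max_h = -(-max_h // divisibility) * divisibility
        max_w = -(-max_w // divisibility) * divisibility
    return max_h, max_w


def padding_fraction(sizes: List[Size], divisibility: int = 32) -> float:
    """Fraction of pixels in the padded batch that are padding, not image."""
    pad_h, pad_w = padded_batch_shape(sizes, divisibility)
    real_pixels = sum(h * w for h, w in sizes)
    total_pixels = pad_h * pad_w * len(sizes)
    return 1.0 - real_pixels / total_pixels


if __name__ == "__main__":
    # One landscape and one portrait image in the same batch.
    sizes = [(600, 800), (800, 600)]
    print(padded_batch_shape(sizes))  # (800, 800)
    print(padding_fraction(sizes))    # 0.25
```

In this example a quarter of the batched input is padding. The convolutions still process those pixels, which limits the speedup, and features computed near the original image boundary now see padded values next to them, which is one plausible way the batched features can differ from the single-image ones.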

Thanks for the detailed explanation.
That makes sense. I will share it with you if I find a faster way to do this.
By the way, I am very interested in your work on LXMERT; I will study it and follow your other work as well.

Thanks for your kind words :-).