Ze-Yang/Context-Transformer

Issue about "trainval_1shot.txt"

whsun21 opened this issue · 2 comments

Your work is excellent, but I'm confused by the contents of "trainval_1shot.txt" while reproducing "Phase 2, Transfer Setting, To finetune on VOC dataset (1 shot)".

Your paper says that "The few-shot training set consists of N images (per category)", so in the setting above, if I understand correctly, "trainval_1shot.txt" should cover all 20 box categories with 1 image per category.

However, as shown below, unless I miscounted, there are only 11 box categories in total, and some categories appear in more than 1 image. This is inconsistent with the description above and confuses me a lot.

I’d be grateful if you could explain how the "trainval_1shot.txt, trainval_2shot.txt, trainval_3shot.txt, ..." files were created. Thanks a lot!

trainval_1shot.txt
Image index | box category | # of boxes (category and count repeat for each category in the image)
007654 aero 1
003137 bus 1
008442 cat 1
003452 dinnertable 1 chair 6
004141 cow 2 person 1
000249 chair 7 dinnertable 1
005018 dog 2
006177 motorbike 1 person 1 cow 2
004424 person 5
006351 person 1 pottedplant 2
006803 train 1 person 2

Total:
box category | # of images
aero 1 | bus 1 | cat 1 | dinnertable 2 | chair 2 | cow 2 | person 5 | dog 1 | motorbike 1 | pottedplant 1 | train 1
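
The per-category tally can be re-derived from the listing with a short script. This is a minimal sketch, assuming the listing is pasted as text in the format used in this post (image index followed by category/count pairs). The name `tally` is made up for illustration and is not part of the repository.

```python
from collections import Counter

# Listing copied from this post: image index, then (category, box count) pairs.
LISTING = """\
007654 aero 1
003137 bus 1
008442 cat 1
003452 dinnertable 1 chair 6
004141 cow 2 person 1
000249 chair 7 dinnertable 1
005018 dog 2
006177 motorbike 1 person 1 cow 2
004424 person 5
006351 person 1 pottedplant 2
006803 train 1 person 2
"""


def tally(listing):
    """Return (images per category, boxes per category) as two Counters."""
    images = Counter()
    boxes = Counter()
    for line in listing.splitlines():
        fields = line.split()
        if not fields:
            continue
        # fields[0] is the image index; the rest alternate category, count.
        for category, count in zip(fields[1::2], fields[2::2]):
            images[category] += 1
            boxes[category] += int(count)
    return images, boxes


images, boxes = tally(LISTING)
print(len(images), "categories")
for category in sorted(images):
    print(category, images[category], "image(s),", boxes[category], "box(es)")
```

Counting images and boxes separately makes it easy to see which categories appear in more than one image and which carry more boxes than the shot number.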

Hi SUN Wenhao, our transfer setting follows the image-shot setup in LSTD task2, except that we use VOC07+12 for the second-stage finetuning. You therefore need to combine the trainval_nshot.txt files from both VOC2007 and VOC2012 to form the final few-shot dataset for finetuning. This may explain why you observe only 11 categories: you may have looked only at the VOC2007 part.
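
Combining the per-year lists could look roughly like the sketch below. It assumes the usual VOCdevkit layout (`<root>/VOC2007/ImageSets/Main/trainval_1shot.txt` and the same under `VOC2012`) and one image index per line. Both the paths and the helper names `read_image_ids` and `combine_shot_lists` are assumptions for illustration, not the repository's actual loader.

```python
from pathlib import Path


def read_image_ids(list_file):
    """Read one image index per line, skipping blank lines."""
    text = Path(list_file).read_text()
    # Keep only the first token in case a line carries extra columns.
    return [line.split()[0] for line in text.splitlines() if line.strip()]


def combine_shot_lists(devkit_root, shot, years=("VOC2007", "VOC2012")):
    """Return (year, image_id) pairs from every year's trainval_{shot}shot.txt."""
    combined = []
    for year in years:
        # Assumed location, following the usual VOCdevkit layout.
        list_file = (
            Path(devkit_root) / year / "ImageSets" / "Main"
            / f"trainval_{shot}shot.txt"
        )
        combined.extend((year, image_id) for image_id in read_image_ids(list_file))
    return combined
```

For example, `combine_shot_lists("data/VOCdevkit", 1)` would return the 1-shot images of both years as one list, where `data/VOCdevkit` is a placeholder root. Keeping the year next to each index avoids ambiguity when the two sets are merged.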

Besides, an image may contain several boxes belonging to different categories. It is nevertheless counted toward only one of its box categories; the remaining boxes are not counted, but they are still used. This explains why some categories have more than N boxes in the N-shot set. I think this is a characteristic of the image-shot setting. You may also refer to issue #3. I hope this clears up your confusion. Thanks.
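
A toy illustration of the image-shot counting described above: each image is counted toward one category, while all of its boxes are used. The three images and their box counts come from the listing in this thread, but the `counted_as` assignments are invented for the example, since the thread does not say which category each image was selected for. The name `summarize` is likewise hypothetical.

```python
from collections import Counter

# Toy few-shot set. The "counted_as" assignments are invented for illustration.
SHOTS = [
    {"image": "004424", "counted_as": "person", "boxes": {"person": 5}},
    {"image": "004141", "counted_as": "cow", "boxes": {"cow": 2, "person": 1}},
    {"image": "000249", "counted_as": "chair", "boxes": {"chair": 7, "dinnertable": 1}},
]


def summarize(shots):
    """Return (images counted per category, boxes used per category)."""
    counted_images = Counter(shot["counted_as"] for shot in shots)
    used_boxes = Counter()
    for shot in shots:
        # Every box in the image is used, whatever category the image counts for.
        used_boxes.update(shot["boxes"])
    return counted_images, used_boxes


counted_images, used_boxes = summarize(SHOTS)
for category in sorted(used_boxes):
    print(category, "counted images:", counted_images[category],
          "used boxes:", used_boxes[category])
```

Under these assumed assignments, each of person, cow, and chair is counted for a single image, yet the number of boxes used for a category can be larger than 1, and dinnertable contributes a box without being counted at all.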

Thank you for your prompt reply. I did indeed miss the VOC2012 part. Your detailed explanation helped me a lot, thank you very much!