JosephKJ/OWOD

A question about the composition of the dataset

wsjxdy opened this issue · 5 comments

Tasks 2-4 all have a fit operation. How is the fit.txt file formed? (e.g., t2_fit.txt has 1743 images. How many images come from task 1, and how many from task 2?)
Thanks for your answer !

My original intention was to build my own dataset, but I don't know whether there are any limitations in the fit process. Is part of the data from the known tasks and part from the current task? Looking forward to your reply!

Does each category of the previous task and each category of the current task consist of 50 samples per category? If so, does the data involved in the fit contain annotations?
In addition, does the data of task2-task4 in val contain annotations?
Thank you very much for your patience!

@JosephKJ Looking forward to your answer! Thank you~

Hi @WangShiJie521: I believe you are referring to ft and not fit. It is the data for the balanced finetuning step (Sec 4.4 in the paper). The specifics are given in the last few lines of Sec 5.2. You may use this script as a reference, but please note that you need to modify it accordingly: https://github.com/JosephKJ/OWOD/blob/master/datasets/coco_utils/balanced_ft.py
Basically, we keep some datapoints per class in each task seen so far in a memory buffer. The number of datapoints to keep might change based on your dataset.

> Does each category of the previous task and each category of the current task consist of 50 samples per category? If so, does the data involved in the fit contain annotations?

We need not keep data from the current task, only data for all the previously seen classes. I would request you to read Sec 4.4 in the paper.
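
For anyone building their own split, here is a rough sketch of the per-class memory buffer idea described above. It is not the repository's balanced_ft.py: the function name `build_balanced_buffer`, the `image_labels` mapping, and the default of 50 images per class (the figure mentioned in the question) are all assumptions for illustration, and which classes go into `classes_to_keep` is left to the caller.

```python
import random
from collections import defaultdict


def build_balanced_buffer(image_labels, classes_to_keep, per_class=50, seed=0):
    """Pick up to `per_class` image ids for every class in `classes_to_keep`.

    image_labels: dict mapping an image id (str) to the class names
                  annotated in that image (hypothetical input format).
    """
    rng = random.Random(seed)

    # Index images by each kept class they contain.
    images_by_class = defaultdict(list)
    for image_id, labels in image_labels.items():
        for label in set(labels):
            if label in classes_to_keep:
                images_by_class[label].append(image_id)

    # Sample per class; an image holding several kept classes is stored once.
    selected = set()
    for label in sorted(images_by_class):
        candidates = sorted(images_by_class[label])
        rng.shuffle(candidates)
        selected.update(candidates[:per_class])
    return sorted(selected)


if __name__ == "__main__":
    labels = {
        "a": ["person"],
        "b": ["person", "dog"],
        "c": ["dog"],
        "d": ["cat"],
        "e": ["person"],
    }
    for image_id in build_balanced_buffer(labels, {"person", "dog"}, per_class=1):
        print(image_id)  # one image id per line, like a *_ft.txt list
```

Because an image can contain several classes, the resulting list can be shorter than `per_class` times the number of classes, so the image count in a file built this way will not split cleanly by task.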

> In addition, does the data of task2-task4 in val contain annotations?

Yes

I understand these details. Thank you for your patience.