giddyyupp/coco-minitrain

Regarding instances_minitrain2017.json

bryanbocao opened this issue · 1 comments

One clarification question: is the instances_minitrain2017.json you shared the set of 25k training samples produced by running sample_coco.py for 10M iterations? If so, I don't need to run it for 25M times (#17).

So with this instances_minitrain2017.json, we can download the same 25k training images as yours by running

python3 coco_download.py --annotation <path_to_instances_minitrain2017.json> --output train2017_mini_25k

?
Thanks in advance @giddyyupp

Hello,
Yes indeed, the JSON we shared contains the 25k COCO training images sampled with a --run_count value of 10M. You don't need to rerun the code with the same parameters. However, since the sampling is random, if you let it run for more iterations, you may end up with a sample that matches COCO more closely in terms of distributions (object counts by size, class, etc.).
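To get a rough feel for how close a sampled subset is to full COCO, one option is to compare the share of objects per (class, size) bucket in the two annotation files. The sketch below is only an illustration and is not necessarily the similarity measure that sample_coco.py uses: the helper names are made up, the size thresholds are the usual COCO small/medium/large area cut-offs (32² and 96² pixels), and total variation distance is an assumed choice of metric.

```python
import json
from collections import Counter


def load_coco(path):
    """Load a COCO-format annotation file into a dict."""
    with open(path) as f:
        return json.load(f)


def size_bucket(area):
    """Map an annotation area (in pixels) to the usual COCO size bucket."""
    if area < 32 ** 2:
        return "small"
    if area < 96 ** 2:
        return "medium"
    return "large"


def object_shares(coco):
    """Fraction of all objects falling in each (category_id, size) bucket."""
    counts = Counter(
        (ann["category_id"], size_bucket(ann["area"]))
        for ann in coco["annotations"]
    )
    total = sum(counts.values())
    return {key: n / total for key, n in counts.items()} if total else {}


def total_variation(p, q):
    """Total variation distance between two share dicts (0 = identical)."""
    keys = set(p) | set(q)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


# Example usage (paths are placeholders for your local files):
# full = object_shares(load_coco("instances_train2017.json"))
# mini = object_shares(load_coco("instances_minitrain2017.json"))
# print(total_variation(full, mini))
```

A smaller distance means the subset's object distribution is closer to the full set under this particular measure.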
Yes, you can download the coco-minitrain images using the shared script. If you already have all of the COCO training images, you don't have to download them again. Just use the shared JSON file for training.
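If you already have train2017 on disk, a quick sanity check is to confirm that every image listed in the JSON is present locally. This is a minimal sketch, assuming the file follows the standard COCO layout (an "images" list whose entries carry a "file_name" such as a bare .jpg name); the function names and paths are hypothetical and not part of this repo.

```python
import json
from pathlib import Path


def minitrain_file_names(annotation_path):
    """Return the image file names listed in a COCO-format annotation file."""
    with open(annotation_path) as f:
        coco = json.load(f)
    return [img["file_name"] for img in coco["images"]]


def split_present_missing(file_names, image_dir):
    """Split file names into those found in image_dir and those that are not."""
    image_dir = Path(image_dir)
    present, missing = [], []
    for name in file_names:
        (present if (image_dir / name).is_file() else missing).append(name)
    return present, missing


# Example usage (paths are placeholders for your local files):
# names = minitrain_file_names("instances_minitrain2017.json")
# present, missing = split_present_missing(names, "coco/train2017")
# print(len(present), "found,", len(missing), "missing")
```

If nothing is reported missing, you can point your training config at the existing image directory together with the shared JSON file.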