Additional data collected from Google image download
ping-Huang opened this issue · 5 comments
Do you plan to release the additional data that you downloaded with Google image download?
In addition, which keywords did you use for the Google image searches?
Hi, we do not plan to release any data, because the data we used is not ours but public. The simple reason is that we do not have the rights to distribute these images. As for Google Images, we mainly searched for the "other" class (with the keywords medicines waste, electric waste, batteries waste, construction waste) and the "bio" class (garden waste, grass waste, fruits waste), as these are underrepresented in the combined classification dataset. For the numbers, see Table 2 with the statistics: https://www.sciencedirect.com/science/article/pii/S0956053X21006474
Where did you download the images named dumped/* in "binary_mixed_test.json" from?
From Google image download?
How can I get the images by name?
The annotations provided in the annotation directory are for the detection task. You can follow how they were created by looking at the bash preprocessing script in the main directory. The script can only be inspected, not re-run: the annotations we provide are the result files, and we haven't released the source files for the same reason that we do not distribute the images. The dumped/* images refer to additional TACO images that had no annotations at the time the project was created (we annotated them using the resources of the Epinote company, and those annotations are included in the TACO train/test annotations). The Google search was done only for the classification task, and here we haven't provided the split, just the structure of the directory and the total number of images. We used 20% of the data for testing. The images used in classification come from additional datasets (for example TrashNet, with one class per image) where bounding boxes were not available, and therefore our trash detection task was split into two steps: litter detection and then classification.
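Since the classification split itself was not released, the 20% hold-out mentioned above has to be recreated locally. The following is only a minimal sketch of one way to do that, not the authors' actual split: the function name `split_per_class`, the per-class (stratified) sampling, and the fixed seed are all assumptions made for illustration.

```python
import random


def split_per_class(files_by_class, test_fraction=0.2, seed=0):
    """Hold out `test_fraction` of the files of every class.

    `files_by_class` maps a class name (e.g. "bio") to a list of file names.
    Sampling per class and the fixed seed are assumptions of this sketch;
    the thread only states that 20% of the data was used for testing.
    """
    rng = random.Random(seed)
    train, test = {}, {}
    for label, files in files_by_class.items():
        # Sort first so the result does not depend on directory listing order.
        shuffled = sorted(files)
        rng.shuffle(shuffled)
        n_test = round(len(shuffled) * test_fraction)
        test[label] = shuffled[:n_test]
        train[label] = shuffled[n_test:]
    return train, test
```

With a directory laid out as one folder per class, `files_by_class` could be built by listing each folder. The resulting split will differ from the one used in the paper, so reported numbers are not expected to match exactly.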
Does "additional TACO" mean the TACO extend in the TACO datasets?
I ask because the filenames and image sizes are different from those in TACO extend.
How can I align the filenames and labels with the additional TACO images?
Yes, exactly: the "additional TACO" I mentioned is TACO extend, or whatever we called it previously.
As I mentioned in my previous post, it is the data from TACO that is unofficially published in the TACO repository. I've just checked the names here, and they agree -> https://raw.githubusercontent.com/pedropro/TACO/master/data/annotations_unofficial.json
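For anyone who wants to repeat that name check, here is a rough sketch. It assumes that both "binary_mixed_test.json" and annotations_unofficial.json follow the COCO layout (a top-level `images` list whose entries carry a `file_name`), and that names can be compared by their last path component. Neither assumption is confirmed in this thread, and `match_by_basename` is a hypothetical helper, not part of either repository.

```python
import posixpath


def match_by_basename(detect_coco, taco_coco, prefix="dumped/"):
    """Look up images named `prefix`* in the TACO unofficial annotations.

    Both arguments are already-loaded COCO-style dicts (assumed layout).
    Returns (matched, missing): `matched` maps each dumped/* file name to the
    list of TACO image records sharing its base name, `missing` lists the
    names with no candidate at all.
    """
    # Index TACO records by base name; keep lists because base names may repeat.
    taco_by_name = {}
    for img in taco_coco["images"]:
        key = posixpath.basename(img["file_name"])
        taco_by_name.setdefault(key, []).append(img)

    matched, missing = {}, []
    for img in detect_coco["images"]:
        name = img["file_name"]
        if not name.startswith(prefix):
            continue  # not one of the additional TACO images
        candidates = taco_by_name.get(posixpath.basename(name))
        if candidates:
            matched[name] = candidates
        else:
            missing.append(name)
    return matched, missing
```

Both dicts can be obtained with `json.load` on the downloaded files. If a name comes back with more than one candidate, the base name alone is not enough to identify the image, and other fields in the record (if present) would be needed to disambiguate.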