google-research/task_adaptation

Questions about the label of the clevr_distance dataset


Hello, I would like to inquire about how the labels are specifically annotated for the clevr_dist task. I've noticed that some images contain multiple objects with varying distances between them - some are far apart and others are close together. How should the distance label be determined in such cases?

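Hi, the clevr_distance label is the (bucketed) distance from the camera to the nearest object. Distances between the objects themselves are not used, so an image with several objects still gets a single label.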

From the VTAB paper (https://arxiv.org/abs/1910.04867), Appendix A:

Clevr/distance (Johnson et al., 2017) Another synthetic task we create from CLEVR consists of predicting the depth of the closest object in the image from the camera. The depths are bucketed into six bins.
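
To make that concrete for an image with several objects, here is a minimal sketch of the labeling rule described above: take the smallest camera depth among the objects, then map it to one of six bins. The function name `distance_label` and the bin edges in `BIN_EDGES` are assumptions for illustration, not something this thread or the paper excerpt confirms; check the preprocessing code in the repository for the real thresholds.

```python
import numpy as np

# ASSUMPTION: illustrative bin edges only (7 edges -> 6 bins).
# The actual thresholds live in the repository's CLEVR preprocessing code.
BIN_EDGES = np.array([0.0, 8.0, 8.5, 9.0, 9.5, 10.0, 100.0])


def distance_label(object_depths):
    """Return the bin index (0..5) for the object closest to the camera.

    `object_depths` holds one camera depth per object in the image.
    Only the minimum matters; distances between objects are ignored.
    """
    depths = np.asarray(object_depths, dtype=float)
    if depths.size == 0:
        raise ValueError("need at least one object depth")
    closest = depths.min()
    # Digitizing against the interior edges yields an index in 0..5.
    return int(np.digitize(closest, BIN_EDGES[1:-1]))


if __name__ == "__main__":
    # Three objects at different depths: only the nearest one sets the label.
    print(distance_label([12.3, 8.7, 9.9]))
```

So two images with very different object layouts get the same label as long as their nearest objects fall into the same depth bin.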