Yuqifan1117/CaCao

Question about the number of predicates in the CaCao-enhanced dataset

Yassin-fan opened this issue · 3 comments

Hello, I downloaded the enhanced VG-50 dataset you provided and compared it with the original VG-50.h5 dataset.

I found that only 25 predicates increased in count, while the other 25 predicates actually decreased.

In addition, the count for each predicate does not match the numbers in Tables 6 and 7 of the paper.

My method of comparison is to count the number of occurrences of each element in the column 'predicates' in the .h5 file.
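For reference, the comparison described above can be sketched roughly as follows. This is a minimal sketch, not the script actually used in this issue: the function names are illustrative, and it assumes the `predicates` dataset holds one integer predicate id per relation (possibly with shape `(N, 1)`), which is why the values are flattened before counting.

```python
from collections import Counter


def load_predicates(h5_path, key="predicates"):
    """Read the flat list of predicate ids from an .h5 file.

    Assumes the dataset named `key` stores one integer id per relation.
    """
    import h5py  # imported lazily so the counting helpers work without it

    with h5py.File(h5_path, "r") as f:
        # ravel() flattens an (N, 1) column into a 1-D array before counting.
        return f[key][:].ravel().tolist()


def compare_counts(original_ids, enhanced_ids):
    """Return {predicate_id: (original, enhanced, delta)} for every id seen."""
    orig, enh = Counter(original_ids), Counter(enhanced_ids)
    return {
        pid: (orig[pid], enh[pid], enh[pid] - orig[pid])
        for pid in sorted(set(orig) | set(enh))
    }


# Usage (both paths are placeholders, not the real file names):
# diff = compare_counts(load_predicates("original.h5"),
#                       load_predicates("enhanced.h5"))
# increased = [pid for pid, (_, _, delta) in diff.items() if delta > 0]
```

A positive `delta` marks a predicate whose count grew in the enhanced file, and a negative one marks a predicate whose count shrank.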

May I ask, did you do additional processing and filtering after enhancing the predicates?

Thanks!

Thanks for your attention. To address the long-tailed imbalance and ensure the quality of the generated data, we additionally (1) filtered out triples whose bounding boxes do not overlap and (2) mapped the enhanced coarse-grained predicates to fine-grained target predicates (25 categories). The result is the enhanced VG-50 dataset. The appendix shows the trends in CaCao's visual data generation, but that data is not the dataset used for final training.
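Step (1) could look something like the sketch below. It is only an illustration under stated assumptions, not the CaCao code: boxes are assumed to be in corner format `(x1, y1, x2, y2)`, triples are assumed to be `(subject_index, object_index, predicate)` tuples, and "overlap" is taken to mean any intersection with positive area. The thread does not say which criterion or threshold was actually used.

```python
def boxes_overlap(a, b):
    """True if two (x1, y1, x2, y2) boxes share a region of positive area."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    # The intersection is non-empty only if it has positive width and height.
    return min(ax2, bx2) > max(ax1, bx1) and min(ay2, by2) > max(ay1, by1)


def filter_overlapping(triples, boxes):
    """Keep triples whose subject and object boxes overlap.

    `triples` holds (subject_index, object_index, predicate) tuples that
    index into `boxes`; both layouts are assumptions made for this sketch.
    """
    return [t for t in triples if boxes_overlap(boxes[t[0]], boxes[t[1]])]
```

A filter like this removes relations from every predicate class, which is consistent with some predicates ending up with fewer samples than in the original file.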

Thank you for your reply. May I ask whether these 25 categories were chosen empirically, or according to specific criteria?

I ask because the enhanced predicate categories are not the 25 categories with the fewest samples in the original dataset.

The selection of these 25 classes follows the relationship dependence in [1], with some differences due to sample size. We chose fine-grained predicates with relatively few samples that express visual semantics rather than simple relationships.

[1] Vincent S. Chen, Paroma Varma, Ranjay Krishna, Michael Bernstein, Christopher Ré, and Li Fei-Fei. Scene graph prediction with limited labels. In ICCV, 2019.