GT DETECTION
Pursue26 opened this issue · 3 comments
In the anno_bbox.mat file of the HICO-DET dataset, a GT HOI label for an image (e.g., ride horse) is defined on a GT human box (call it H1) and a GT object box (call it O1).
However, the same person and object may carry another HOI label (e.g., sit on horse), annotated with its own human box (H2) and object box (O2), whose coordinates may differ slightly from H1 and O1. This is common in the GT anno_bbox.mat file of HICO-DET.
I would like to know whether your GT DETECTION setting treats such H-O pairs as two samples (i.e., H1-O1 and H2-O2) or as one sample (i.e., H1-O1 or H2-O2) when they are fed into your prediction network.
Thank you.
Because of how HICO-DET was labeled, one person may have multiple boxes that differ slightly. In our GT test, we did not fuse these "jittered" boxes; we fed them directly into the model and ran the evaluation, i.e., as two samples.
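To make the difference concrete, here is a minimal sketch contrasting the two options from the question. It is not the code from this repo: the annotation tuples, the function names `unfused_samples` and `fused_samples`, and the IoU threshold of 0.5 are all assumptions made for illustration, and the real anno_bbox.mat layout is not reproduced here.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Anno = Tuple[str, Box, Box]              # (verb, human_box, object_box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def unfused_samples(annos: List[Anno]):
    """Keep every annotated H-O pair as its own sample (the behaviour described above)."""
    return [(h, o, [verb]) for verb, h, o in annos]


def fused_samples(annos: List[Anno], thresh: float = 0.5):
    """Hypothetical alternative: merge pairs whose human and object boxes both overlap."""
    merged = []
    for verb, h, o in annos:
        for mh, mo, verbs in merged:
            if iou(h, mh) >= thresh and iou(o, mo) >= thresh:
                verbs.append(verb)  # same pair, additional verb label
                break
        else:
            merged.append((h, o, [verb]))
    return merged


# Made-up coordinates for one person and one horse, annotated twice with jitter.
annotations: List[Anno] = [
    ("ride", (100, 50, 200, 300), (80, 150, 320, 400)),    # H1-O1
    ("sit_on", (102, 48, 201, 303), (82, 149, 318, 402)),  # H2-O2
]

print(len(unfused_samples(annotations)))  # two samples: H1-O1 and H2-O2
print(len(fused_samples(annotations)))    # one sample carrying both verbs
```

With the unfused option, each jittered pair reaches the model as a separate input, which matches the "two samples" answer given above.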
Got it.
I also want to ask about your HICO-DET detector from DRG: do you use the test_HICO_finetuned_v3.pkl file provided by the DRG paper for performance evaluation?
We have reorganized the file into our own format (the version provided in our repo), but the boxes and threshold are the same as in DRG.