How did you process the multi-label frames?
bityangke opened this issue · 1 comments
First, thank you very much for sharing your work.
I still have a question about the training data.
How did you process the multi-label frames in the training data?
For example, almost all CliffDiving frames also belong to Diving.
When assigning labels to these frames, do you give each frame a single two-hot label,
[0,0,0,0,0,1,0,0,1,0,......], or do you make two copies of the frame and assign them the one-hot labels
[0,0,0,0,0,1,0,0,0,......] and [0,0,0,0,0,0,0,0,1,0,......] respectively?
Hi,
The ground truth used for evaluation at test time is multi-label over 21 classes.
During training, however, we simply use one-hot labels: only frames that belong to Diving but not to CliffDiving are treated as Diving frames. At prediction time, every frame predicted as CliffDiving is also marked as Diving to form the multi-label prediction.
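For readers who want to see the scheme spelled out, here is a minimal sketch of the two steps described above. It is not the authors' code. The class indices (5 for CliffDiving, 8 for Diving) are assumptions borrowed from the positions of the 1s in the question's example vectors, and the helper names `training_label`, `one_hot`, and `expand_prediction` are hypothetical.

```python
NUM_CLASSES = 21   # from the reply: evaluation ground truth covers 21 classes
CLIFF_DIVING = 5   # assumed index, taken from the example vectors in the question
DIVING = 8         # assumed index, taken from the example vectors in the question


def one_hot(index, num_classes=NUM_CLASSES):
    """Return a list of zeros with a single 1 at position `index`."""
    vec = [0] * num_classes
    vec[index] = 1
    return vec


def training_label(frame_classes):
    """Collapse a frame's ground-truth class set into one training class.

    A frame annotated as both CliffDiving and Diving is trained as
    CliffDiving only; Diving is kept only when CliffDiving is absent.
    """
    classes = set(frame_classes)
    if CLIFF_DIVING in classes:
        classes.discard(DIVING)
    if len(classes) != 1:
        # This sketch only handles the CliffDiving/Diving overlap.
        raise ValueError("expected exactly one class after collapsing")
    return classes.pop()


def expand_prediction(pred_index, num_classes=NUM_CLASSES):
    """Turn a single predicted class into a multi-label vector.

    Any frame predicted as CliffDiving is also marked as Diving.
    """
    vec = one_hot(pred_index, num_classes)
    if pred_index == CLIFF_DIVING:
        vec[DIVING] = 1
    return vec


if __name__ == "__main__":
    # Training: a frame annotated with both classes gets the CliffDiving label.
    print(one_hot(training_label({CLIFF_DIVING, DIVING})))
    # Prediction: a CliffDiving prediction is expanded to include Diving.
    print(expand_prediction(CLIFF_DIVING))
```

Under these assumptions, training never sees a two-hot target and no frame is duplicated; the overlap between the two classes is restored only after prediction.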