sauradip/STALE

if action_end - action_start > 1 and gt_label in lbl_dict


How should I understand the condition action_end - action_start > 1?

corr_sec = float(num_frame) / vid_frame * num_sec
label_list = []
if subset in subset_vid:
    for j in range(len(labels)):
        tmp_info = labels[j]
        clip_factor = self.temporal_scale / ( corr_sec * (self.num_frame+1) )
        action_start = tmp_info['segment'][0]*clip_factor
        snip_start = max(min(1, tmp_info['segment'][0] / corr_sec), 0)
        action_end = tmp_info['segment'][1]*clip_factor
        snip_end = max(min(1, tmp_info['segment'][1] / corr_sec), 0)
        gt_label = tmp_info["label"]

        if action_end - action_start > 1 and gt_label in lbl_dict:
            label_list.append([snip_start, snip_end, gt_label])

[Screenshot 2023-12-28 115209]

The picture above shows the ground-truth labels for video 3l7quTy4c2s, and the picture below shows the labels after processing by your code. You can see that the missing part is a real label.

After applying the condition action_end - action_start > 1, I found that the short segments in ActivityNet (Anet) are not included in the calculation. Is this considered data loss? Or is there a precedent for filtering segments this way?

In our masking-free approach, we check that action_start and action_end point to at least two different snippets, which is why we require a gap of more than one snippet between them. There is no predefined regressor, so if the segment falls within a single snippet (or sub-snippet) duration, the start and end pointers refer to the same snippet, and the action duration is effectively zero. We therefore remove such segments.
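The check described above can be sketched roughly as follows. This is a minimal illustration, not the repository's code: the names filter_segments and min_duration_sec, the label "Long jump", and all numeric values are hypothetical. Only the clip_factor formula and the filter condition are copied from the snippet quoted in the question.

```python
def min_duration_sec(corr_sec, temporal_scale, num_frame):
    """Shortest duration (in seconds) that passes the filter.

    The condition (end - start) * clip_factor > 1 is equivalent to
    (end - start) > 1 / clip_factor.
    """
    clip_factor = temporal_scale / (corr_sec * (num_frame + 1))
    return 1.0 / clip_factor


def filter_segments(segments, corr_sec, temporal_scale, num_frame, known_labels):
    """Split segments into kept and dropped using the quoted condition."""
    clip_factor = temporal_scale / (corr_sec * (num_frame + 1))
    kept, dropped = [], []
    for seg in segments:
        start_sec, end_sec = seg["segment"]
        # Start and end expressed in snippet units.
        action_start = start_sec * clip_factor
        action_end = end_sec * clip_factor
        # Start and end normalised to [0, 1] over the video duration.
        snip_start = max(min(1, start_sec / corr_sec), 0)
        snip_end = max(min(1, end_sec / corr_sec), 0)
        entry = [snip_start, snip_end, seg["label"]]
        if action_end - action_start > 1 and seg["label"] in known_labels:
            kept.append(entry)
        else:
            # Start and end fall within about one snippet of each other.
            dropped.append(entry)
    return kept, dropped


# Hypothetical inputs, chosen only to make the arithmetic easy to follow.
segments = [
    {"segment": [10.0, 50.0], "label": "Long jump"},  # 40 s long
    {"segment": [60.0, 61.0], "label": "Long jump"},  # 1 s long
]
kept, dropped = filter_segments(
    segments,
    corr_sec=100.0,
    temporal_scale=100,
    num_frame=1,
    known_labels={"Long jump"},
)
print("kept:", kept)
print("dropped:", dropped)
print("threshold (s):", min_duration_sec(100.0, 100, 1))
```

With these assumed values one snippet spans 2 seconds, so the 1-second segment is dropped while the 40-second segment is kept. Counting the entries in dropped over a whole annotation file would be one way to gauge how many short segments the filter excludes.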