dingfengshi/TriDet

How do you understand this code?


    # convert time stamp (in second) into temporal feature grids
    # ok to have small negative values here
    (video_item['segments'] * video_item['fps'] - 0.5 * self.num_frames) / feat_stride

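Seconds * fps gives the frame index. The temporal features start at the center of the first window (size = num_frames), so the center offset is num_frames / 2. The next feature is the center of the second window, whose frame index is last_index + feat_stride. Subtracting the offset and dividing by feat_stride therefore converts a frame index into a position on the feature grid. Timestamps that fall before the center of the first window map to small negative values, which is why the comment says those are ok.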
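
To make the arithmetic concrete, here is a minimal sketch of the same mapping in plain Python. The function names `seconds_to_grid` and `grid_to_frame` and the values fps = 30, num_frames = 16, feat_stride = 4 are assumptions picked for illustration only. They are not taken from the repo or from any dataset config.

```python
def seconds_to_grid(t_sec, fps, num_frames, feat_stride):
    """Map a timestamp in seconds to a (fractional) feature-grid coordinate."""
    frame_index = t_sec * fps          # seconds -> frame index
    center_offset = 0.5 * num_frames   # frame at the center of the first window
    return (frame_index - center_offset) / feat_stride


def grid_to_frame(k, num_frames, feat_stride):
    """Inverse mapping: frame index at the center of the k-th feature window."""
    return k * feat_stride + 0.5 * num_frames


# Hypothetical settings, chosen only for illustration.
fps, num_frames, feat_stride = 30.0, 16, 4

segment_sec = [0.0, 1.0]
segment_grid = [seconds_to_grid(t, fps, num_frames, feat_stride) for t in segment_sec]
print(segment_grid)  # [-2.0, 5.5]

# Window centers are feat_stride frames apart, starting at num_frames / 2.
print([grid_to_frame(k, num_frames, feat_stride) for k in range(3)])  # [8.0, 12.0, 16.0]
```

With these numbers, a segment starting at 0.0 s lands at grid position -2.0, because frame 0 lies two strides before the center of the first window (frame 8). That is the "small negative value" case mentioned in the code comment.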