Question about sliding window

Question

Closed this issue 3 years ago · 1 comments

Dear authors,

Do you use a sliding window over the input video in both algorithms (ICASSP and CVPRW)?
Since the input is reduced from N frames to N-15, how do you extract per-frame labels (0 or 1)?

Thank you in advance

Answer 1 · 2021-07-23T11:42:43.000Z

Hi,

Yes, for both algorithms we use the I3D network with a sliding window over the input video to extract features (window size: 16 frames, stride: 1 frame).
Because of the 16-frame window used for feature extraction, we get a reduced number (N-15) of feature vectors. We always assign each feature vector to the middle frame of its window. As a result, you don't get features for the first and last few frames, and so there are no predictions at the video boundaries either. If you need them, you can extend the video by half the window size on both sides.
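
To make the frame bookkeeping concrete, here is a minimal sketch in Python/NumPy. It is not the authors' code: the helper names (`num_windows`, `center_frames`, `pad_video`) are hypothetical, the "middle frame" of a 16-frame window is assumed to be at offset 8 (`window // 2`), and repeating the first and last frame is just one possible way to extend the video.

```python
import numpy as np

WINDOW = 16  # window size stated in the answer
STRIDE = 1   # stride stated in the answer


def num_windows(n_frames, window=WINDOW, stride=STRIDE):
    """Number of full windows that fit into a video of n_frames frames."""
    if n_frames < window:
        return 0
    return (n_frames - window) // stride + 1


def center_frames(n_frames, window=WINDOW, stride=STRIDE):
    """Frame index each feature vector is assigned to.

    Assumption: the middle frame of a window is at offset window // 2.
    """
    starts = np.arange(num_windows(n_frames, window, stride)) * stride
    return starts + window // 2


def pad_video(frames, window=WINDOW):
    """Extend the video by half the window size on both sides.

    Assumption: padding repeats the first/last frame (mode="edge");
    the answer does not say which padding strategy to use.
    """
    half = window // 2
    pad_width = [(half, half)] + [(0, 0)] * (frames.ndim - 1)
    return np.pad(frames, pad_width, mode="edge")


if __name__ == "__main__":
    n = 100
    print(num_windows(n))             # N - 15 feature vectors
    print(center_frames(n)[[0, -1]])  # first and last frame that get a feature
    video = np.zeros((n, 2, 2, 3), dtype=np.uint8)  # dummy (frames, H, W, C)
    print(num_windows(pad_video(video).shape[0]))
```

Under these assumptions, padding by 8 frames per side yields N+1 windows for an N-frame video, so one feature vector (for example the last one) would have to be dropped to get exactly one per original frame.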