gulvarol/ltc

Splitting channels with t frames

For example, suppose I have a 210-frame video and t = 20. The input is split so that each clip is 58x58x20, but the last one would be only 58x58x10. Could you tell me how this case is handled?

Do you mean for training or testing? For training, we make sure to choose a random clip with enough frames, which avoids this:
https://github.com/gulvarol/ltc/blob/master/donkey.lua#L42

For testing, in your example, we take the last 20 frames (191-210) of the video, so that its first 10 frames overlap with the previous clip, as this line does:
https://github.com/gulvarol/ltc/blob/master/donkey.lua#L51
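
To make the test-time behaviour above concrete, here is a minimal Python sketch (not the repo's Lua code). The function name `eval_clip_starts` is hypothetical, and the default stride equal to the clip length is an assumption taken from the 210-frame example; the actual stride in donkey.lua may differ.

```python
def eval_clip_starts(n_frames, clip_len, stride=None):
    """Return 1-based start frames of the clips used at test time.

    Hypothetical helper for illustration, not taken from donkey.lua.
    """
    if stride is None:
        stride = clip_len  # assumption: consecutive, non-overlapping windows
    last_start = n_frames - clip_len + 1
    if last_start < 1:
        # Video shorter than one clip: a single clip that would need padding.
        return [1]
    starts = list(range(1, last_start + 1, stride))
    if starts[-1] != last_start:
        # Leftover frames: anchor one extra clip to the end of the video,
        # so it overlaps with the previous clip instead of being short.
        starts.append(last_start)
    return starts


starts = eval_clip_starts(210, 20)
print(starts[-2:])  # [181, 191]
```

With 210 frames the last two clips cover frames 181-200 and 191-210, so frames 191-200 appear in both.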

As an extra note, if there are not enough frames in the entire video, we pad by copying the video.
"If the number of frames in a video is less than the clip size, we pad the input by repeating the last frames to fill the missing volume." (Section 3.3)
https://github.com/gulvarol/ltc/blob/master/donkey.lua#L78
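
As a rough illustration of padding a short video, the Python sketch below fills a clip by looping over the video's frame indices from the start. This is only one possible reading of "copying the video"; the exact scheme in donkey.lua may differ (the paper's wording mentions repeating the last frames), and `padded_frame_indices` is a hypothetical name.

```python
def padded_frame_indices(n_frames, clip_len):
    """Return clip_len 1-based frame indices for a video of n_frames.

    Assumption for illustration: when the video is shorter than the clip,
    the video is looped from its first frame until the clip is full.
    """
    if n_frames < 1:
        raise ValueError("video must contain at least one frame")
    return [(i % n_frames) + 1 for i in range(clip_len)]


# A 12-frame video padded to a 20-frame clip: frames 1-12, then 1-8 again.
print(padded_frame_indices(12, 20))
```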

I don't think these make a big difference in the final results.

I understand the testing part, but in the training phase, if there are 200 frames and t = 20, are only 20 of the 200 frames considered? Correct me if I am wrong :)

No, each time we sample a random starting frame between 1 and 191 (taking the 210-frame example), so over the course of training all frames get used.

math.ceil(torch.uniform(1e-2, N-loadSize[2]+1))
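With N = 210 and loadSize[2] = 20, torch.uniform draws a real number between 0.01 and 191, and math.ceil rounds it up to an integer, so the starting frame lies in the interval [1, 191].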
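
The same sampling can be sketched in Python (a translation for illustration, not the repo's code). `sample_train_start` is a hypothetical name, and Python's `random.uniform` stands in for `torch.uniform`.

```python
import math
import random


def sample_train_start(n_frames, clip_len, rng=random):
    """Draw a random 1-based start frame so the clip fits inside the video."""
    upper = n_frames - clip_len + 1
    # The lower bound 1e-2 (instead of 0) keeps ceil from returning 0.
    return math.ceil(rng.uniform(1e-2, upper))


rng = random.Random(0)  # fixed seed only to make the demo repeatable
samples = [sample_train_start(210, 20, rng) for _ in range(10000)]
print(min(samples), max(samples))  # both values lie within [1, 191]
```

Because a new start frame is drawn every time the video is visited, different 20-frame windows are seen across epochs, which is how all 210 frames can contribute to training.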