artest08/LateTemporalModeling3DCNN

Frame-level classification using BERT

loubnabnl opened this issue · 0 comments

Thank you for this great work.

I am working on a similar problem: I want to apply BERT to frame features extracted with I3D, but for frame-level classification rather than video-level classification. How could I adapt your implementation, given that it relies on the classification token, which is defined at the video level rather than the frame level?
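To make the question concrete, here is a minimal sketch of the kind of frame-level setup in question. It is not taken from this repository: it uses PyTorch's generic `nn.TransformerEncoder` as a stand-in for the BERT module, and the class name `FrameLevelClassifier`, the feature dimension, and all other hyperparameters are assumptions chosen for illustration. The idea is to drop the classification token and apply a shared linear head to every output position.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class FrameLevelClassifier(nn.Module):
    """Hypothetical per-frame head on a Transformer encoder (not the repository's model)."""

    def __init__(self, feat_dim=1024, num_classes=10, num_layers=2, num_heads=8, max_len=64):
        super().__init__()
        # Learned positional embedding, one vector per temporal position (assumed design).
        self.pos_emb = nn.Parameter(torch.zeros(1, max_len, feat_dim))
        layer = nn.TransformerEncoderLayer(d_model=feat_dim, nhead=num_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        # The same linear layer is applied to every frame position; no classification token.
        self.head = nn.Linear(feat_dim, num_classes)

    def forward(self, feats):
        # feats: (batch, time, feat_dim), e.g. precomputed I3D features per frame or clip.
        t = feats.size(1)
        x = feats + self.pos_emb[:, :t]
        x = self.encoder(x)
        return self.head(x)  # (batch, time, num_classes)


def frame_level_loss(logits, labels):
    """Cross-entropy averaged over all frames; labels has shape (batch, time)."""
    num_classes = logits.size(-1)
    return F.cross_entropy(logits.reshape(-1, num_classes), labels.reshape(-1))
```

With this layout, each frame gets its own prediction and its own label, so the loss is averaged over `batch * time` positions instead of one prediction per video.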

I also have a question about training: why don't you include a loss on the predictions for the masked frames?
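As one possible reading of such a term (an assumption, not something the repository implements), the sketch below adds an auxiliary regression loss that compares the model's output at masked positions with the original, unmasked features, in the spirit of BERT's masked-token objective. The function names, the choice of mean squared error, and the weight `aux_weight` are all hypothetical.

```python
import torch
import torch.nn.functional as F


def masked_frame_loss(pred_feats, target_feats, mask):
    """MSE between predicted and original features, over masked positions only.

    pred_feats, target_feats: (batch, time, dim)
    mask: (batch, time) bool tensor, True where the input frame was masked.
    """
    if not mask.any():
        # No masked frames in this batch: return a zero that stays on the graph.
        return pred_feats.sum() * 0.0
    # Boolean indexing selects the masked positions, giving (num_masked, dim).
    return F.mse_loss(pred_feats[mask], target_feats[mask])


def total_loss(logits, labels, pred_feats, target_feats, mask, aux_weight=0.5):
    """Video-level classification loss plus a weighted masked-frame term (assumed weighting)."""
    cls_loss = F.cross_entropy(logits, labels)
    return cls_loss + aux_weight * masked_frame_loss(pred_feats, target_feats, mask)
```

Here `logits` would come from the classification token and `pred_feats` from the remaining output positions; whether such a term helps is exactly what the question is asking.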

Any help would be appreciated 😄 !