dingfengshi/TriDet

on realistic videos

Opened this issue · 1 comments

Hi,
Thanks for your solid theoretical and experimental work. I'd like to apply your method to my own recorded videos to get action categories and their temporal localization. What would you suggest for that? Looking forward to your reply.

Hi, you can extract a feature sequence for each video with a pretrained backbone and feed it into TriDet. If your dataset is small, try the same config as THUMOS14 first.
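For illustration, here is a minimal sketch of how a per-video feature sequence is typically built: slide a clip window over the frames and encode each clip with a pretrained video backbone. The `backbone` function below is a placeholder (it returns zeros); in practice you would substitute a real pretrained encoder such as the one used to produce the THUMOS14 features. The clip length, stride, and feature dimension are assumptions for the sketch, not values mandated by TriDet.

```python
import numpy as np

def extract_feature_sequence(video_frames, clip_len=16, stride=4, feat_dim=2048):
    """Slide a clip window over the frames and encode each clip,
    yielding a (T, feat_dim) feature sequence for the detector."""
    def backbone(clip):
        # Placeholder: a real pretrained backbone would map a clip of
        # shape (clip_len, H, W, 3) to a feat_dim-dimensional vector.
        return np.zeros(feat_dim, dtype=np.float32)

    feats = []
    for start in range(0, len(video_frames) - clip_len + 1, stride):
        clip = video_frames[start:start + clip_len]
        feats.append(backbone(clip))
    return np.stack(feats)  # shape: (T, feat_dim)

# Toy video: 64 frames of 32x32 RGB.
video = np.zeros((64, 32, 32, 3), dtype=np.uint8)
seq = extract_feature_sequence(video)
print(seq.shape)  # (13, 2048)
```

The resulting `(T, feat_dim)` array is the kind of per-video feature sequence that is then fed to the detector, with `T` determined by the video length, clip length, and stride.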