on realistic videos
ailsa0506 commented
Hi,
Thanks for the solid theoretical and experimental work. I'd like to apply your method to my own recorded videos and output the action categories and their temporal locations. What would you suggest for that? Looking forward to your reply.
dingfengshi commented
Hi, you could extract a feature sequence for each video with a pretrained backbone and feed those sequences into TriDet. If your dataset is small, try reusing the THUMOS14 config first.
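
For reference, the feature-extraction step could look roughly like the sketch below. It is only an illustration, not code from the TriDet repository: `dummy_backbone` is a hypothetical stand-in for whatever pretrained video model you choose, and the clip length (16 frames) and stride (4 frames) are assumed example values that should match whatever your config expects. The idea is to slide a fixed-length window over the frames, get one feature vector per clip, and stack them into a (num_clips, feature_dim) array per video.

```python
import numpy as np


def sliding_clip_starts(num_frames, clip_len=16, stride=4):
    """Start indices of all clips that fit entirely inside the video."""
    if num_frames < clip_len:
        return []
    return list(range(0, num_frames - clip_len + 1, stride))


def dummy_backbone(clip):
    """Hypothetical stand-in for a pretrained backbone.

    Takes a clip of shape (clip_len, H, W, 3) and returns a 1-D feature
    vector. Here it is just the per-channel mean, so the feature has 3 dims;
    a real backbone would return a much larger vector.
    """
    return clip.astype(np.float32).mean(axis=(0, 1, 2))


def extract_feature_sequence(frames, backbone, clip_len=16, stride=4):
    """Turn frames of shape (N, H, W, 3) into a (num_clips, C) feature array."""
    starts = sliding_clip_starts(len(frames), clip_len, stride)
    if not starts:
        return np.empty((0, 0), dtype=np.float32)
    feats = [backbone(frames[s:s + clip_len]) for s in starts]
    return np.stack(feats)


def clip_center_seconds(clip_index, fps, clip_len=16, stride=4):
    """Center time of a clip, useful for mapping predictions back to seconds."""
    return (clip_index * stride + clip_len / 2) / fps


if __name__ == "__main__":
    # Fake 40-frame video of 8x8 RGB frames, just to exercise the functions.
    video = np.zeros((40, 8, 8, 3), dtype=np.uint8)
    features = extract_feature_sequence(video, dummy_backbone)
    print(features.shape)
    # Saving one .npy file per video is an assumed layout; check what the
    # dataset loader you use actually expects.
    np.save("my_video_features.npy", features)
```

You would also need an annotation file listing each video's duration, frame rate, and labeled segments in the format the chosen config reads, so that predicted clip indices can be converted back to timestamps.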