[ASK]Input questions about action model

Question

YLiyu opened this issue 4 years ago · 4 comments

Description

Following the given example (01_training_introduction | Introduction to action recognition: training, evaluating, predicting), I noticed that the input shape of the action model for training is [8,3,8,112,112]. I thought about this for a long time. Does it mean that the model takes 8 consecutive frames from the video each time? If so, is that enough for an action model, or is there another explanation? Thanks.

Other Comments

Answer 1 · 2020-10-09T13:06:59.000Z

Yes, that's correct: by default, the number of consecutive input frames is 8. Note that this can be increased, e.g. to 32, by setting MODEL_INPUT_SIZE = 32 in this notebook:
https://github.com/microsoft/computervision-recipes/blob/master/scenarios/action_recognition/01_training_introduction.ipynb
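
To make the shape concrete, here is a minimal sketch of how a clip of consecutive frames could map onto [8,3,8,112,112]. It assumes the dimensions are ordered (batch, channels, frames, height, width), so the first 8 would be the batch size and the second 8 the frame count. The helpers `sample_clip_indices` and `batch_shape`, the random start position, and the 300-frame video are hypothetical and only for illustration; they are not the library's actual sampling code.

```python
import random

BATCH_SIZE = 8         # assumption: first dimension of [8,3,8,112,112]
NUM_CHANNELS = 3       # RGB
MODEL_INPUT_SIZE = 8   # frames per clip, the setting named in the notebook
FRAME_SIZE = 112       # frame height and width in pixels


def sample_clip_indices(num_video_frames, clip_len, rng):
    """Pick clip_len consecutive frame indices from a random start (assumed strategy)."""
    if num_video_frames < clip_len:
        raise ValueError("video is shorter than the requested clip length")
    start = rng.randrange(num_video_frames - clip_len + 1)
    return list(range(start, start + clip_len))


def batch_shape(batch_size, clip_len):
    """Shape of one training batch in (batch, channels, frames, height, width) order."""
    return (batch_size, NUM_CHANNELS, clip_len, FRAME_SIZE, FRAME_SIZE)


rng = random.Random(0)
clip = sample_clip_indices(num_video_frames=300, clip_len=MODEL_INPUT_SIZE, rng=rng)
print(clip)                                        # 8 consecutive frame indices
print(batch_shape(BATCH_SIZE, MODEL_INPUT_SIZE))   # (8, 3, 8, 112, 112)
print(batch_shape(BATCH_SIZE, 32))                 # (8, 3, 32, 112, 112)
```

Under these assumptions, changing MODEL_INPUT_SIZE only changes the third dimension; the batch size and frame resolution stay the same.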

Answer 2 · 2020-10-10T02:37:39.000Z

OK. I'm curious why the number of consecutive input frames is set to 8 or 32. Why not 9, 31, or some other value? Is it because, as the paper says, it is limited by computing power?

Answer 3 · 2020-10-12T12:50:20.000Z

The authors of the papers tried 8 and 32, but yes, you can also set it to other frame counts. The higher the number, though, the slower it will get.
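
As a rough illustration of why more frames cost more, the sketch below counts the input values in a single clip for a few frame counts, assuming 3 channels and 112x112 frames as in the shape from the question. The helper `clip_elements` is hypothetical. Input size grows linearly with the number of frames; actual training and inference time depends on the model and hardware, so treat this only as a lower-bound intuition, not a benchmark.

```python
NUM_CHANNELS = 3   # RGB
FRAME_SIZE = 112   # frame height and width in pixels


def clip_elements(num_frames):
    """Number of input values in one clip of num_frames frames."""
    return NUM_CHANNELS * num_frames * FRAME_SIZE * FRAME_SIZE


baseline = clip_elements(8)
for num_frames in (8, 9, 31, 32):
    values = clip_elements(num_frames)
    ratio = values / baseline
    print(f"{num_frames:>2} frames: {values:>9,} values per clip "
          f"({ratio:.2f}x the 8-frame input)")
```

A 32-frame clip carries four times as many input values as an 8-frame clip, which is consistent with the slowdown mentioned above.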

Answer 4 · 2020-10-12T12:52:37.000Z

OK, I got it, thanks.