[ASK] Input questions about action model
YLiyu opened this issue · 4 comments
Description
Following the example given (01_training_introduction | Introduction to action recognition: training, evaluating, predicting), I noticed that the input shape of the action model for training is [8,3,8,112,112]. I thought about this for a long time. Does it mean that 8 consecutive frames are taken from the video each time? If so, is that enough for an action model, or is there another explanation? Thanks.
Other Comments
Yes, that's correct: by default the number of consecutive input frames is 8. Note that this can be increased, e.g. to 32, by setting MODEL_INPUT_SIZE = 32 in this notebook:
https://github.com/microsoft/computervision-recipes/blob/master/scenarios/action_recognition/01_training_introduction.ipynb
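To make the shape concrete, here is a rough sketch of how [8,3,8,112,112] could be read and how a clip of consecutive frames might be picked. It assumes the common video layout (batch, channels, frames, height, width), which the thread does not state explicitly. The names `ClipBatchShape` and `sample_consecutive_frames` are hypothetical, and the random start position is an assumption; the notebook's own sampling may differ.

```python
import random
from typing import List, NamedTuple


class ClipBatchShape(NamedTuple):
    """Assumed layout: (batch, channels, frames, height, width)."""
    batch: int
    channels: int
    frames: int
    height: int
    width: int


# The shape reported in the question, read under the assumed layout.
shape = ClipBatchShape(*[8, 3, 8, 112, 112])

# Number of consecutive frames per clip (8 by default, can be raised to e.g. 32).
MODEL_INPUT_SIZE = 8


def sample_consecutive_frames(num_video_frames: int, clip_length: int) -> List[int]:
    """Return the indices of `clip_length` consecutive frames of a video.

    The start position is chosen at random here; that is an assumption
    made for illustration, not necessarily what the notebook does.
    """
    if clip_length > num_video_frames:
        raise ValueError("video is shorter than the requested clip length")
    start = random.randrange(num_video_frames - clip_length + 1)
    return list(range(start, start + clip_length))


# Hypothetical 300-frame video.
indices = sample_consecutive_frames(300, MODEL_INPUT_SIZE)
print(shape)
print(indices)
```

Under this reading, the first 8 is the batch size (8 clips per training step) and the third value is the number of frames per clip.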
OK. I'm curious why the number of consecutive input frames is set to 8 or 32. Why not 9, 31, or some other value? Is it because of the limit on computing power that the paper mentions?
The authors of the papers tried 8 and 32, but yes, you can also set it to other frame counts. The higher the number, though, the slower it will get.
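As a rough illustration of why more frames cost more, the sketch below counts the input values in a single clip for a few frame counts, assuming 3 channels and 112x112 frames as in the shape above. The helper `clip_elements` is hypothetical. This only measures input volume; actual training speed also depends on the model and the hardware.

```python
def clip_elements(frames: int, channels: int = 3, height: int = 112, width: int = 112) -> int:
    """Number of values in one clip of shape (channels, frames, height, width)."""
    return channels * frames * height * width


base = clip_elements(8)
for frames in (8, 9, 16, 31, 32):
    n = clip_elements(frames)
    # The input volume grows linearly with the number of frames.
    print(f"{frames:>2} frames: {n:>9,} values per clip ({n / base:.2f}x the 8-frame input)")
```

So 9 or 31 frames are possible inputs in this sense; the per-clip input simply grows in proportion to the frame count.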
OK, I got it, thanks.