GuyTevet/MotionCLIP

issue about rendering the sample

sygyq305 opened this issue · 13 comments

In visualize.py, the parameter "interval" is set to 1000/fps. May I know why it is set this way by default, and why the constant is 1000?

I think it is because the units of interval are milliseconds [msec], so 1000/fps is the delay between consecutive frames.
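To illustrate, here is a minimal sketch of how such an interval is typically passed to matplotlib's FuncAnimation (a generic example, not necessarily the exact code in visualize.py):

```python
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

fps = 20                  # frames per second of the rendered clip
interval_ms = 1000 / fps  # FuncAnimation expects the inter-frame delay in milliseconds

fig, ax = plt.subplots()

def update(frame_idx):
    # placeholder: draw the pose for this frame here
    ax.set_title(f"frame {frame_idx}")

# interval = 50 ms for fps = 20, so the animation plays at the intended rate
anim = FuncAnimation(fig, update, frames=60, interval=interval_ms)
anim.save("sample.mp4", fps=fps)
```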

Thanks.
I have a confusion about the rendering duration. In the default, the fps is 20. In the paper-model's param, 'num_frames' is 60. So all rendered sample durations are 3 seconds. What should I do if I want to render 6-second-sample.
I once tried to change 'num_frames' from 60 to 120. Although the total duration has been to 6 seconds, it only adds 3 seconds to the rendering time. The first three seconds are the same as before, and the last three seconds are stationary.

This model was trained for motions with a fixed length of 60 frames. Please try retraining it for 120 frames.
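To make the arithmetic concrete (a minimal sketch; the variable names are illustrative, not the repo's actual parameters):

```python
fps = 20          # rendering frame rate
num_frames = 60   # fixed motion length the paper model was trained on

duration_sec = num_frames / fps   # 60 / 20 = 3 seconds

# Raising num_frames to 120 stretches the clip to 120 / 20 = 6 seconds,
# but the model still only generates 60 meaningful frames, so the last
# 3 seconds appear frozen unless the model is retrained for 120 frames.
```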

Thanks.
I have another confusion about the rendering. All rendered motions used paper-model can appear the rendered frames and their text description used during training. Just like Figure 4 in the paper.

Are you asking about the appearance? If so, they all have the same appearance as in Fig. 4:
https://drive.google.com/file/d/1F8VLY4AC2XPaV3DqKZefQJNWn4KY2z_c/view?usp=sharing

No.
My question is that Fig. 4 shows training-phase frames. Does the inference phase also produce such frames?

At inference, you do text-to-motion, i.e. encode the text and decode the motion, so the rendered frames are unnecessary.
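A rough, hypothetical sketch of this text-to-motion data flow (the CLIP calls are real, but the decoder below is a stand-in placeholder, not MotionCLIP's actual module or API):

```python
import torch
import torch.nn as nn
import clip

device = "cuda" if torch.cuda.is_available() else "cpu"

# 1) Encode the text prompt into CLIP's shared latent space (real CLIP API)
clip_model, _ = clip.load("ViT-B/32", device=device)
tokens = clip.tokenize(["a person jumps 360 degrees to the left"]).to(device)
with torch.no_grad():
    text_latent = clip_model.encode_text(tokens).float()  # shape: (1, 512)

# 2) Decode the latent into a motion sequence. This linear layer is only a
#    placeholder to show the data flow; the repo uses a trained transformer
#    decoder instead.
num_frames, num_joints, feat_dim = 60, 24, 6
decoder = nn.Linear(512, num_frames * num_joints * feat_dim).to(device)
motion = decoder(text_latent).view(1, num_frames, num_joints, feat_dim)

# No per-frame reference images or captions are needed here: the text latent
# alone drives the whole 60-frame sequence, which is then rendered.
```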

What can I do if I want to see more details about the inference frames?

Do you mean that you want to render the results with a more elaborate body model such as SMPL, instead of the stick figures?

No, I mean that I want to know how the frames are allocated to each action.
For example, if the input text is '360 degree left jump and standing and turning back', there are three actions: jump, stand, and turn back. How are the frames allocated among jump, stand, and turn back?

Got you. The frames are not explicitly allocated by the user, but by the model. If you want to interpret the model's decisions, you can try adapting transformer interpretability papers to the motion domain. In any case, that isn't a trivial task.

Thanks.
And which papers are the transformer interpretability papers you mentioned?

Sorry, I'm not familiar enough with this field.