tobiascz/VideoPose3D

Couple of Questions on Input and Output

alecda573 opened this issue · 0 comments

First, very nice work!

According to your paper, the 3D reconstruction network takes a sequence of 2D joint keypoint locations with window_size = 243 and, per the architecture diagram on page 3, produces only a single 3D skeleton. Is that right? If so, how do you end up producing the whole action sequence in 3D?
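To make the question concrete, here is a minimal sketch of what I imagine might be happening. It is my guess, not something taken from the paper or the repository: I assume the window is slid over the sequence one frame at a time, and that the edges are padded by repeating the first and last frames. The names `sliding_windows`, `lift_sequence`, and `dummy_model` are hypothetical placeholders.

```python
def sliding_windows(frames, window_size=243):
    """Yield one window per frame, centered on that frame.

    Assumption: edges are padded by repeating the first/last frame,
    so every frame gets a full-length window.
    """
    if window_size % 2 == 0:
        raise ValueError("window_size must be odd so each window has a center frame")
    if not frames:
        return
    pad = window_size // 2
    padded = [frames[0]] * pad + list(frames) + [frames[-1]] * pad
    for center in range(len(frames)):
        yield padded[center:center + window_size]


def lift_sequence(frames, model, window_size=243):
    """Apply a (window -> single 3D pose) model once per frame of the sequence."""
    return [model(window) for window in sliding_windows(frames, window_size)]


def dummy_model(window):
    """Hypothetical stand-in for the network: just returns the center frame."""
    return window[len(window) // 2]
```

If this is roughly right, a sequence of N frames would need N forward passes (or an equivalent convolution over the padded sequence) to yield N 3D skeletons.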

I was also confused by issue #4, where you state that the input to the network is actually the video file. Is that correct?

Could you clarify these two points for me? Thanks!